Incorporating Speculative Execution into Scheduling of Control-flow Intensive Behavioral Descriptions

Ganesh Lakshminarayana, Anand Raghunathan, and Niraj K. Jha

Dept. of Electrical Engineering, Princeton University, Princeton, NJ
C&C Research Laboratories, NEC USA, Princeton, NJ

Abstract

Speculative execution refers to the execution of parts of a computation before the execution of the conditional operations that decide whether it needs to be executed. It has been shown to be a promising technique for eliminating performance bottlenecks imposed by control flow in hardware and software implementations alike. In this paper, we present techniques to incorporate speculative execution in a fine-grained manner into scheduling of control-flow intensive behavioral descriptions. We demonstrate that failing to take into account information such as resource constraints and branch probabilities can lead to significantly sub-optimal performance. We also demonstrate that it may be necessary to speculate simultaneously along multiple paths, subject to resource constraints, in order to minimize the delay overheads incurred when prediction errors occur. Experimental results on several benchmarks show that our speculative scheduling algorithm can result in significant (up to seven-fold) improvements in performance (measured in terms of the average number of clock cycles) as compared to scheduling without speculative execution. Also, the best- and worst-case execution times for the speculatively performed schedules are the same as or better than the corresponding values for the schedules obtained without speculative execution.

1 Introduction

Speculative execution refers to the execution of a part of a computation before it is known if the control path to which it belongs will be executed (for example, execution of the code after a branch statement before the branch condition itself is evaluated). It has been used to overcome, to some extent, the scheduling bottlenecks imposed by control flow. There has been previous work on speculative execution in the area of high-level synthesis [1, 2, 3] as well as high-performance compilation [4, 5].
Previous work [1, 2, 3] in high-level synthesis has attempted to locate single or multiple paths for speculation prior to scheduling. This paper presents techniques to integrate speculative execution into scheduling during high-level synthesis of control-flow intensive designs. In that context, we demonstrate that not using information such as resource constraints and branch probabilities while deciding when to speculate can lead to significantly sub-optimal performance. We also demonstrate that it is necessary to perform speculative execution along multiple paths at a fine-grain level during the course of scheduling, in order to obtain maximal benefits. In addition, we present techniques to automatically manage the additional speculative results that are generated by speculatively executed operations. We show how to incorporate speculative execution into a generic scheduling methodology, and in particular present the results of its integration into an efficient scheduler, Wavesched [6]. Experimental results for various benchmarks and examples are presented that indicate up to seven-fold improvement in performance (average number of clock cycles required to perform the computation).

This work was supported in part by NSF and in part by Alternative System Concepts, Inc. under an SBIR contract from Air Force Rome Laboratories.

Permission to make digital/hard copy of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication and its date appear, and notice is given that copying is by permission of ACM, Inc. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. DAC 98, San Francisco, California. (c) 1998 ACM.

2 Background and Motivation

Scheduling tools typically work using one or more intermediate representations of the behavioral description, such as a data flow graph (DFG), control flow graph (CFG), or control-data flow graph (CDFG).
In this paper, we use the CDFG as the intermediate representation of a behavioral description, and state transition graphs (STGs) to represent the scheduled behavioral description, as explained in later sections. In addition to the behavioral description, our scheduler also accepts the following information:

- A constraint on the number of resources of each type available (resource allocation constraints).
- The target clock period for the implementation, or constraints that limit the extent of data and control chaining allowed.
- Profiling information that indicates the branch probabilities for the various conditional constructs present in the behavioral description.

We now present some motivational examples to illustrate the use of speculative execution during scheduling.

Example 1: Consider a part of a behavioral description and the corresponding CDFG fragment shown in Figure 1, which contains a while loop. The CDFG contains vertices corresponding to operations of the behavioral description, where solid lines indicate data dependencies, and dotted lines indicate control dependencies. Control edges in the CDFG are annotated with a variable that represents the result of the conditional operation that generates them. For example, the control edges fed by operation >1 are marked $c$ in Figure 1. The initial values of variables i and t4 used in the loop body are indicated in parentheses beside the corresponding CDFG data edges. Let us now consider the task of scheduling the CDFG shown in Figure 1. Suppose we have the following constraints to be used during scheduling.

- The target clock period allows the execution of +, ++, >, and memory access operations in one clock cycle, while the * operation requires two clock cycles. In addition, we assume that the * operations will be implemented using a 2-stage pipelined multiplier.
- No operation chaining is allowed, since it leads to a violation of the target clock period constraint (in general, however, our algorithm can handle chaining).
- The aim is to optimize the performance of the design as much as possible. Hence, no resource constraints are specified for the purpose of illustration for this example. This is not a limitation of our scheduling algorithm, which does handle resource constraints as described in later sections.

[Figure 1: A CDFG to illustrate speculative execution. Behavioral description: i := 0; t4 := 0; while (k > t4) { i := i + 1; t1 := M1[i]; t2 := t1 * C1; t3 := t2 * C2; t4 := t3 + C3; M2[i] := t4; }]

[Figure 2: (a) Non-speculative schedule for the CDFG of Figure 1, and (b) schedule incorporating speculative execution.]

A schedule for the CDFG that does not incorporate speculative execution is shown in Figure 2(a). This schedule can be obtained by applying either the loop-directed scheduling [7] technique or the Wavesched [6] technique to the CDFG. Vertices in the STG represent schedule states, which directly correspond to states in the controller of the RTL implementation. Each state is annotated with the names of the CDFG operations that are performed in that state, including a suffix that represents a symbolic iteration index of the CDFG loop that the operation belongs to. For example, consider operation >1 of the CDFG.
When >1 is encountered the first time during scheduling, it is assigned a subscript 0, resulting in operation >1_0 in the STG of Figure 2(a). In general, multiple copies of an operation may be generated during scheduling, corresponding to different conditional paths, or different iterations of a loop. For example, operation >1_1 in the STG of Figure 2(a) corresponds to the execution of the first unrolled instance of CDFG operation >1. An edge in the STG represents a controller state transition, and is annotated with the condition that activates the transition.

Each iteration of the loop in the scheduled CDFG requires eight clock cycles. For this example, the data dependencies among the operations within the loop require them to be performed serially. In addition, the control dependencies between the comparison operation >1 and operations ++1 and M1, together with the inter-iteration data dependency (footnote 1) from +1 to >1, prevent the parallel computation of multiple loop iterations, even when loop unrolling is employed.

A schedule for the CDFG of Figure 1 that incorporates speculative execution is shown in the STG of Figure 2(b). This schedule was derived by techniques we present in later sections. Speculatively executed operations are annotated with the conditional operations whose results they depend upon, using the following notation: op/cond represents an operation op that is executed assuming that the speculation condition cond will evaluate to true. The speculation condition cond could, in general, be an expression that is a conjunction of the results of various conditional operations in the STG. For example, consider operation ++1_1/c_1 in state S1 of Figure 2(b). This is a speculatively executed operation that corresponds to the second instance of CDFG operation ++1 in the schedule, and assumes that the result of conditional operation >1_1, which is executed only in state S7, is going to be true. States S7 and S8 represent the steady state of the schedule. Note that, when in the steady state, a new iteration is initiated every cycle, as opposed to one in eight cycles.
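To put the throughput difference of Example 1 in numbers, the following Python sketch compares cycle counts for a given number of loop iterations. It takes the eight cycles per iteration of the non-speculative schedule from the text and, for the speculative schedule, assumes one iteration completes per cycle once the steady state is reached. The function names and the `fill_cycles` parameter (cycles spent before the steady state) are hypothetical and not taken from the paper.

```python
def nonspeculative_cycles(iterations: int, cycles_per_iteration: int = 8) -> int:
    """Cycles when loop iterations execute strictly one after another."""
    return iterations * cycles_per_iteration


def speculative_cycles(iterations: int, fill_cycles: int = 7) -> int:
    """Cycles when, after an assumed fill phase, one iteration completes per cycle.

    fill_cycles is an illustrative assumption, not a value from the paper.
    """
    return fill_cycles + iterations


def speedup(iterations: int) -> float:
    """Ratio of non-speculative to speculative cycle counts under this model."""
    return nonspeculative_cycles(iterations) / speculative_cycles(iterations)
```

Under these assumptions the ratio grows with the iteration count and approaches the per-iteration latency of eight cycles, which is why long-running loops benefit most.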
Footnote 1: An intra-iteration data or control dependency is between operations that correspond to the same iteration of a loop, while an inter-iteration dependency is between operations in different (e.g., consecutive) iterations. We refer to intra-iteration data and control dependencies simply as data and control dependencies.

The following example illustrates the impact of branch probabilities and resource constraints on the performance of speculatively derived schedules and makes a case for the integration of speculation into the scheduling process.

Example 2: Consider the example CDFG shown in Figure 3. The select operation Sel1 selects the data operand at its l (r) port if the value at its s port is 1 (0). Figure 4 shows three different schedules that use speculative execution, generated using different resource constraints and branch probabilities.

[Figure 3: CDFG demonstrating the effect of resource constraints and branch probabilities on speculative execution.]

[Figure 4: Three speculative schedules, (a), (b), and (c), derived using different resource constraints or branch probabilities.]

The STG of Figure 4(a) was generated assuming the following information. Available resources consist of one incrementer (++), one adder (+), one comparator (>), one shifter (>>), and one multiplier (*), all of which require one cycle. Also, the probability of comparison >1 evaluating to false is higher than that of it evaluating to true. Since the result of >1 evaluates to false more often, the schedule of Figure 4(a) gives preference to executing operations from the corresponding control path (e.g., +2). As a result, +2 rather than +1 is scheduled first on the sole adder, even though the data operands for both operations are available. The average number of clock cycles, $CC_a$, required for the STG in Figure 4(a) can be calculated as follows.

$CC_a = 4 \cdot P(c) + 2 \cdot (1 - P(c)) = 2 \cdot P(c) + 2$    (1)

In the above equation, $P(c)$ represents the probability that the result of comparison >1 evaluates to true.

The STG of Figure 4(b) was derived with the same information as above, except that it was assumed that comparison >1 evaluates to true more often than it evaluates to false. Hence, operation +1 is given preference over operation +2 and is scheduled ahead of it. The average number of clock cycles, $CC_b$, required for the STG in Figure 4(b) is given by the following expression.

$CC_b = 3 \cdot P(c) + 3 \cdot (1 - P(c)) = 3$

Suppose the resource constraints were relaxed to allow two adders. The speculative schedule that results is shown in Figure 4(c). The average number of clock cycles, $CC_c$, required for the STG in Figure 4(c) is given by the following expression.

$CC_c = 3 \cdot P(c) + 2 \cdot (1 - P(c)) = P(c) + 2$    (2)

The values of $CC_a$, $CC_b$, and $CC_c$ for various values of $P(c)$ ranging from 0 to 1 are plotted in Figure 5. As expected, the schedule of Figure 4(a) outperforms the schedule of Figure 4(b) when $P(c) < 0.5$, and the schedule of Figure 4(b) performs better when $P(c) > 0.5$. Moreover, the schedule of Figure 4(c), which was derived using one extra adder, performs at least as well as the other two schedules for all values of $P(c)$. Thus, we can conclude that branch probabilities and resource constraints do influence the trade-offs involved in deciding which conditional paths to speculate upon, making the case for the integration of speculative execution into the scheduling step, where such information is available.

[Figure 5: Comparison of the speculative schedules. Horizontal axis: probability (P); vertical axis: expected number of cycles; curves: $CC_a$, $CC_b$, $CC_c$.]

The following example illustrates that it is necessary to perform speculative execution along multiple paths, in a fine-grained manner, in order to obtain maximal performance improvement.

Example 3: The schedules shown in Figure 4 were all generated by speculatively executing operations from both the conditional paths of the CDFG in a fine-grained manner, as allowed by the resource constraints. For the purpose of comparison, we scheduled the CDFG shown in Figure 3, assuming the same scheduling information that was assumed to derive the schedule of Figure 4(b). However, in this case, we restricted the scheduler to allow speculative execution along only one path. The resulting schedule is shown in Figure 6.

[Figure 6: Speculation along a single path.]

The average number of clock cycles, $CC_d$, required for the STG in Figure 6 is given by the following expression.

$CC_d = 3 \cdot P(c) + 4 \cdot (1 - P(c)) = 4 - P(c)$    (3)

Comparing the expression for $CC_d$ to the expression for $CC_b$ from the previous example indicates that $CC_d \geq CC_b$ for all feasible values of $P(c)$. Thus, in this example, simultaneously speculating

along multiple paths according to resource availability results in a schedule that is provably better than one derived by speculating along only the most probable path. Our scheduling algorithm automatically decides the best paths to speculate upon for the given resource constraints and branch probabilities.

3 The Algorithm

In this section, we present the changes that need to be made to a generic scheduling algorithm to support speculative execution.

3.1 A generic scheduling algorithm

Figure 7 shows the pseudocode for a generic scheduling algorithm. The inputs to the scheduler are a CDFG, G, to be scheduled, the target clock period of the design, allocation constraints, which specify the numbers and types of functional units available, and module selection information, which gives the type of functional unit an operation is mapped to. The output of the scheduler is an STG which describes the schedule.

Generic_scheduler(CDFG G, ALLOCATION_CONSTRAINTS K, MODULE_SELECTION_INFO M_inf, CLOCK_PERIOD clk) {
    SET<OPERATION> Unscheduled_operations;
    SET<OPERATION> Schedulable_operations;
  1 while (|Unscheduled_operations| > 0) {
  2     op = Select_schedulable_operation(Schedulable_operations, K, M_inf, clk);
            // Select an operation for scheduling. The selected operation
            // must honor allocation and clock cycle constraints.
  3     Schedule(op);
  4     Unscheduled_operations.remove_operation(op);
  5     Schedulable_operations.remove_operation(op);
  6     SET<OPERATION> schedulable_successors = Compute_schedulable_successors(op);
            // Find the set of operations in op's fanout which become
            // schedulable when op is scheduled.
  7     Schedulable_operations.append(schedulable_successors);
            // Augment Schedulable_operations by addition of the operations
            // in schedulable_successors.
    }
}

Figure 7: Pseudocode for a generic scheduling algorithm

At any point, a generic scheduler maintains (a) the set of unscheduled operations whose data and control dependencies have been satisfied, and which can therefore be scheduled (Schedulable_operations), and (b) the set of operations which are unscheduled (Unscheduled_operations).
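The loop of Figure 7 can be sketched in Python as follows. This is a minimal illustration under simplifying assumptions that are not in the paper: the input is a loop-free DAG, every operation takes one cycle, no chaining is allowed, and the selection policy is plain alphabetical order standing in for whatever priority function a real scheduler would use. The names `Operation`, `generic_schedule`, and `allocation` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Operation:
    name: str
    unit: str                                   # functional-unit type, e.g. "add"
    preds: list = field(default_factory=list)   # names of fanin operations


def generic_schedule(ops, allocation):
    """Return {operation name: state index} for a DAG of single-cycle operations.

    ops        -- list of Operation
    allocation -- {unit type: number of units available in each state}
    """
    by_name = {op.name: op for op in ops}
    state_of = {}                               # operations scheduled so far
    unscheduled = set(by_name)
    schedulable = {op.name for op in ops if not op.preds}
    state, used = 0, {}

    while unscheduled:
        # Statement 2: candidates must fit the allocation constraint and have
        # all fanins completed in an earlier state (no chaining).
        candidates = sorted(
            n for n in schedulable
            if used.get(by_name[n].unit, 0) < allocation[by_name[n].unit]
            and all(state_of[p] < state for p in by_name[n].preds)
        )
        if not candidates:
            state, used = state + 1, {}         # nothing fits: open the next state
            continue
        name = candidates[0]
        op = by_name[name]
        state_of[name] = state                  # statement 3
        used[op.unit] = used.get(op.unit, 0) + 1
        unscheduled.remove(name)                # statement 4
        schedulable.remove(name)                # statement 5
        # Statements 6 and 7: fanouts whose fanins are now all scheduled.
        schedulable |= {
            o.name for o in ops
            if o.name in unscheduled and name in o.preds
            and all(p in state_of for p in o.preds)
        }
    return state_of


example_ops = [
    Operation("a", "add"),
    Operation("b", "add"),
    Operation("c", "mul", ["a", "b"]),
]
one_adder = generic_schedule(example_ops, {"add": 1, "mul": 1})
two_adders = generic_schedule(example_ops, {"add": 2, "mul": 1})
```

With a single adder the two additions are serialized and the multiplication lands two states later; allowing a second adder shortens the schedule by one state.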
The scheduling process proceeds as follows: an operation from Schedulable_operations is selected for scheduling in a given state (statement 2). The selection should honor allocation and clock cycle constraints. The manner in which the selection is done varies from one scheduling algorithm to another. The selected operation, op, is scheduled in the state. Since op no longer belongs to either Schedulable_operations or Unscheduled_operations, it is removed from these sets (statements 4 and 5). Also, the scheduling of op might render some of the operations in its fanout schedulable. The routine Compute_schedulable_successors (statement 6) identifies such operations, and these operations are subsequently included in the set Schedulable_operations (statement 7).

3.2 Incorporating speculative execution into a generic scheduler: An overview

We now provide an overview of the changes that need to be made to incorporate speculative execution into the framework of the generic scheduler shown in Figure 7. To support speculative execution, the generic scheduler needs to be modified as follows (the details of these steps are provided in Section 3.3).

1. When an operation is scheduled, one needs to recognize all its schedulable successors, including the ones which can be speculatively scheduled. In addition, speculatively executed operations and their successors need to be specially marked. Clearly, procedure Compute_schedulable_successors needs to be augmented to consider such cases. Note that, at any stage, every speculatively schedulable operation is added to the list of schedulable operations. However, few of them are actually scheduled. Operations which are not worth being speculated on are ignored, and eventually removed from the list of schedulable operations, using procedures described later in this section.

[Figure 8: A CDFG fragment illustrating speculative execution.]

Example 4: Consider the CDFG fragment shown in Figure 8. We assume that operation op0 is scheduled, operation op2 has just been scheduled, and operations op1, op3, Sel1, and op4 are unscheduled.
The output of the routine Compute_schedulable_successors(op2) must include operation op4, which can now be speculatively executed, i.e., its operands can be assumed to be the results of operations op2 and op0.

2. When operations are scheduled, control and data dependencies of speculatively executed operations are resolved. This would potentially validate or invalidate speculatively performed operations. Operations which are validated should be considered normal, i.e., they need not be specially marked any longer. Operations in Unscheduled_operations and Schedulable_operations which are invalidated need no longer be considered for scheduling. They can, therefore, be removed from these sets. In general, the resolution of the control or data dependencies of a speculatively performed operation creates two separate threads of execution, which correspond to the success and failure of the speculation.

Example 5: Consider again the CDFG fragment shown in Figure 8. Suppose operations op0, op2, and op4 have been scheduled, and operation op3 is unscheduled. Operation op4 uses as its operands the results of operations op2 and op0. Assume that operation op1 has just been scheduled. If op1 evaluates to true, then the execution of op4 can be considered fruitful, because the operands chosen for its computation are correct. Therefore, op4 and its scheduled and schedulable successors need not be considered conditional on the result of op1 anymore, and the data structures can be modified to reflect this fact. If, however, op1 evaluates to false, then op4 should use as its operands the results of operations op3 and op0, thus invalidating the result of our speculation. Therefore, schedulable operations whose computations are influenced by the result computed by op4 are invalid, and can be removed from the set Schedulable_operations.

3. The set Schedulable_operations, from which an operation is selected for scheduling, contains operations whose execution is speculative, i.e., whose results are not always useful. The selection procedure, represented by the routine Select_schedulable_operation() (statement 2), needs to be modified to account for this fact.
For example, operations whose execution is extremely improbable would make poor selection candidates, as the resources consumed by them might be

better utilized by operations whose execution is more probable. Also, operations which fall on critical paths would be better candidates for selection than those on off-critical paths.

3.3 Incorporating speculative execution into a generic scheduler: A closer look

In this section, we fill in the details of the changes outlined in Section 3.2. This is preceded by a formal treatment of concepts related to speculative execution.

A scheduler which supports speculative execution works with conditioned operations as its atomic schedulable units, just as a normal scheduler uses operations. Therefore, the fanin-fanout relationships between operations, captured by the CDFG, need to be defined for conditioned operations. Since all speculatively performed operations are conditioned on some event, the adjective "speculatively performed", when applied to an operation, implies that it is conditioned on some event or combination of events. As mentioned in Section 3.2, when an operation is scheduled, its schedulable successors need to be computed.

[Figure 9: Illustrating the scheduling of successors of speculatively performed operations.]

Example 6: Consider the CDFG fragment shown in Figure 9. Assume that operations op5 and op6 have been scheduled, operations op1, op3, and op4 are unscheduled, and op2 has just been scheduled. It is now possible to schedule two versions of operation op7, with the first version, op7', using op2 and op5 as its operands, and the second, op7'', using op2 and op6. op7' is conditioned on $c(op1) \wedge c(op4)$, and op7'' is conditioned on $c(op1) \wedge \overline{c(op4)}$. The following analysis presents a structured means of identifying such relationships.

We now present a result which helps derive fanin-fanout relationships among speculatively performed operations.

Lemma 1: Consider an operation, op, whose fanins are op_1, op_2, ..., op_n. If the fanins of op have been speculatively scheduled, so can op. In particular, if the i-th fanin, op_i, is conditioned on $C_i$, then op would be conditioned on $\bigwedge_{i=1}^{n} C_i$.

We now present details of Steps 1, 2, and 3, outlined in Section 3.2.
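The condition tags of Lemma 1, together with the validation and invalidation outlined in item 2 of Section 3.2, can be sketched in Python as below. The representation is an assumption made for illustration: a condition is a set of (conditional operation, assumed outcome) pairs read as a conjunction, and the function names `conjoin` and `resolve` are hypothetical. The usage lines reproduce the two versions of op7 from Example 6.

```python
from typing import FrozenSet, Optional, Tuple

Literal = Tuple[str, bool]        # (conditional operation, assumed outcome)
Condition = FrozenSet[Literal]    # conjunction of literals; empty = unconditional


def conjoin(*conds: Condition) -> Optional[Condition]:
    """Lemma 1: AND the fanin conditions; None if they contradict each other."""
    merged = frozenset().union(*conds)
    names = [name for name, _ in merged]
    if len(names) != len(set(names)):   # same operation assumed both true and false
        return None
    return merged


def resolve(cond: Condition, name: str, outcome: bool) -> Optional[Condition]:
    """Substitute a resolved outcome into a condition.

    Returns the simplified condition, or None when the speculation failed
    and the tagged operation is invalidated.
    """
    if (name, not outcome) in cond:
        return None
    return cond - {(name, outcome)}


# Example 6: op2 reaches op7 through the l port of Sel1 (needs op1 true);
# op5 and op6 reach it through the l and r ports of Sel2 respectively.
via_op2 = frozenset({("op1", True)})
via_op5 = frozenset({("op4", True)})
via_op6 = frozenset({("op4", False)})

op7_prime = conjoin(via_op2, via_op5)
op7_double_prime = conjoin(via_op2, via_op6)
```

Resolving op4 to true strips that literal from the tag of op7' and returns None for op7'', which corresponds to discarding the invalidated version. An operation whose tag becomes empty is fully validated and no longer needs special marking.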
Step 1: This step addresses the issue of deriving all schedulable successors of a scheduled operation, op0. The result of Lemma 1 is used for this procedure.

Observation 1: Every set, $S = \{op_0, op_1, \ldots, op_i\}$, of scheduled operations which satisfies the following condition sources a schedulable operation.

Condition: There exists an operation, fanout, in the CDFG, all of whose fanins are reachable from the outputs of the operations in S through paths which consist exclusively (if at all) of select operations.

The path connecting the output of an operation $op_j$ in S to an input of fanout is denoted by $P_j$, and the operations on $P_j$ are $Sel_{j1}, Sel_{j2}, \ldots, Sel_{ja_j}$. Note that $a_j$ can equal 0. $C_j$ represents the condition that path $P_j$ is selected, i.e., the result of operation $op_j$ is propagated through path $P_j$ to the appropriate input of fanout. Operation fanout is conditioned on $\bigwedge_{k=0}^{i} (C(op_k) \wedge C_k)$, where $C(op_k)$ represents the expression $op_k$ is conditioned on.

Observation 1 can be used to infer the schedulable successors of an operation. The procedure Compute_schedulable_successors, which is called in statement 6 of the pseudocode shown in Figure 7, is appropriately augmented.

So far, we have described the technique used to identify all schedulable successors of an operation. This was accomplished by tagging operations with the conditions under which their results would be valid. Note that our procedure allows us to speculate on all possible outcomes of a branch, and arbitrarily deeply into nested branches. If integrated with a scheduler which supports loop unrolling, the speculation could also cross loop boundaries. We now present the technique used to validate or invalidate speculatively performed operations whose dependencies have just been resolved.

Step 2: Suppose an operation that resolves a condition $c$ has just been scheduled. The resolution of $c$ results in the creation of two different threads of execution, where (i) $c$ = true, and (ii) $c$ = false. The following procedure is carried out for every operation, op, which belongs either to the set Schedulable_operations or to the set of scheduled operations. Let op be conditioned on $C = \bigwedge_{j=1}^{i} c_j$.
In the true (false) branch, $C$ is evaluated assuming a value of 1 (0) for $c$, and the resultant expression is the new expression that op is conditioned on.

Step 3: We now describe the procedure employed by the scheduler to select an operation to schedule from a pool of schedulable operations, Schedulable_operations. Schedulable_operations can contain operations which are conditioned on different sets of events, i.e., we can choose different paths to speculate upon. We need to decide the best candidate to map to a given resource, where, by best, we mean the operation whose mapping on the given resource would minimize the expected number of cycles for the schedule. Formally, the problem can be stated as follows: given (i) a partial schedule, (ii) a functional unit, fu, (iii) a set of operations, S (some of which may be speculative), which can execute on the functional unit, and (iv) typical input traces, select the operation which, if mapped to fu, would minimize the expected number of cycles.

The above problem has been proven to be NP-complete, even for conditional- and loop-free behavioral descriptions [8]. We, therefore, use the following heuristic, whose guiding principle has been successfully employed by several scheduling algorithms [9]. The heuristic is based on the following premise: operations in the CDFG which feed primary outputs through long paths are more critical than operations which feed primary outputs through short paths and, therefore, need to be scheduled earlier. The rationale behind this heuristic is that operations which belong to short paths are more mobile than those on long paths, i.e., the total schedule length is less sensitive to variations in their schedules.

The length of a path is measured as the sum of the delays of its constituent operations. In data-dominated descriptions, with no loops and conditional operations, the longest path between any pair of operations is fixed. In control-flow intensive descriptions, some paths could be input-dependent. Therefore, the longest path between a pair of operations must be defined with respect to a given input.
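The input dependence of the longest path can be sketched in Python as follows. The graph encoding is an assumption made for illustration: each fanout edge may carry a guard naming a condition and the outcome under which the edge is selected (as at the ports of a select operation). The function name `longest_path` and the small example graph are hypothetical and do not reproduce any figure of the paper.

```python
def longest_path(op, delay, fanout, outcomes):
    """Longest path length from op to a primary output for fixed branch outcomes.

    delay    -- {operation: delay in cycles}
    fanout   -- {operation: [(successor, guard)]}, where guard is None or a
                (condition, required outcome) pair that enables the edge
    outcomes -- {condition: bool}, the branch outcomes for the given input
    Returns None when no primary output is reachable under these outcomes.
    """
    edges = fanout.get(op, [])
    if not edges:                         # op is a primary output
        return delay[op]
    lengths = [
        longest_path(succ, delay, fanout, outcomes)
        for succ, guard in edges
        if guard is None or outcomes[guard[0]] == guard[1]
    ]
    reachable = [length for length in lengths if length is not None]
    return delay[op] + max(reachable) if reachable else None


# Hypothetical fragment: x feeds a 2-cycle operation m and a 1-cycle operation s;
# a select-like output takes m when c is true and s when c is false.
delay = {"x": 1, "m": 2, "s": 1, "out": 1}
fanout = {
    "x": [("m", None), ("s", None)],
    "m": [("out", ("c", True))],
    "s": [("out", ("c", False))],
}
length_if_true = longest_path("x", delay, fanout, {"c": True})
length_if_false = longest_path("x", delay, fanout, {"c": False})
```

The same operation x sits on a path of a different length depending on the outcome of c, so a single fixed path length cannot rank it.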
For example, for the CDFG shown in Figure 3, the longest path connecting primary inputs with output out depends upon the value taken by operation >1. Since our scheduling algorithm is geared toward minimizing the average execution time, we use the expected length of the longest path from an operation to a primary output as a metric to rank different operations. We use the notation λ(op) to denote this quantity for operation op.

Speculation adds a new dimension to this problem: the result computed by an operation is not guaranteed to be useful. For an

operation, op, we account for this effect by multiplying the probability that the operation's output is utilized with λ(op) to derive a metric of the operation's criticality. This is expressed by means of the following equation:

$criticality(op) = \lambda(op) \cdot \prod_{j=1}^{i} P(c_j)$    (4)

where criticality(op) measures the desirability of scheduling op, $\prod_{j=1}^{i} P(c_j)$ is the product of the probabilities of the events that op is conditioned on, and λ(op) is as defined above.

4 Experimental Results

The techniques described in this paper were implemented in a program called Wavesched-spec, written in C++. We evaluated this program by using it to produce schedules for several commonly available benchmarks. These schedules were compared against those produced by the scheduling algorithm Wavesched [6], without the use of speculative execution, with respect to the following metrics: (a) expected number of cycles, (b) number of states in the STG produced, (c) the smallest number of cycles taken to execute the behavioral description, and (d) the largest number of cycles taken to execute the behavioral description. In general, finding the largest number of cycles taken to execute a behavioral description is a hard problem. However, for the examples considered in this paper, static analysis of the description was sufficient to find the number.

Table 1 summarizes the results obtained. The columns labeled E.N.C., #states, bc, and wc represent, respectively, the expected number of cycles, the number of states in the STG produced, the smallest number of cycles taken to execute the STG, and the largest number of cycles taken to execute the STG. Minor columns WS and SP represent schedules produced by Wavesched and Wavesched-spec, respectively.

[Table 1: Expected number of cycles, number of states, and best- and worst-case number of cycles. Columns: Circuit; E.N.C., #states, bc, wc, each split into WS and SP. Rows: Barcode, GCD, Test1, TLC, Findmin.]

[Table 2: Allocation constraints for the examples in Table 1. Columns: Circuit, add1, sub1, mult1, comp1, eq, inc. Rows: Barcode, GCD, Test1, TLC, Findmin.]
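Returning to the selection heuristic of Section 3.3, Equation (4) can be sketched in Python as below. The λ values and probabilities in the usage lines are made up for illustration, and the names `criticality` and `pick_operation` are hypothetical; how the actual tool breaks ties or combines this metric with resource checks is not specified here.

```python
from math import prod


def criticality(expected_path_length, condition_probabilities):
    """Equation (4): lambda(op) times the product of P(c_j) over op's conditions.

    An operation that is not speculative has no conditions, and the empty
    product is 1, so its criticality is just lambda(op).
    """
    return expected_path_length * prod(condition_probabilities)


def pick_operation(candidates):
    """Return the name of the candidate with the highest criticality.

    candidates -- {name: (lambda value, [P(c_1), ..., P(c_i)])}
    """
    return max(candidates, key=lambda name: criticality(*candidates[name]))


# Two speculative candidates competing for one functional unit (illustrative
# numbers): the one on the longer path is speculated on the outcome "true".
likely_false = {"longer/true": (3, [0.3]), "shorter/false": (2, [0.7])}
likely_true = {"longer/true": (3, [0.8]), "shorter/false": (2, [0.2])}
```

When the branch is unlikely to be taken, the shorter but more probable operation wins the unit; when the probabilities are reversed, the choice flips, mirroring the behaviour of the schedules in Figures 4(a) and 4(b).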
We used a library of functional units which consisted of (a) an adder, add1, (b) a subtracter, sub1, (c) a multiplier, mult1, (d) a less-than comparator, comp1, (e) an equality comparator, eq, and (f) an incrementer, inc. Unlimited numbers of single-input logic gates (OR, AND, and NOT) were assumed to be available. All functional units except mult1, which executes in two cycles, take one cycle to execute. The allocation constraints for an example can be found by looking up the entry corresponding to the example in Table 2. For example, the allocation constraints for GCD are two sub1, one comp1, and two eq. The expected number of cycles for the final design was measured by simulating a VHDL description of the schedule using the Synopsys VSS simulator. The input traces used for simulation were obtained as zero-mean Gaussian sequences.

Of our examples, Barcode, GCD, TLC, and Findmin are borrowed from the literature. Test1 is the example shown in Figure 1. Barcode represents a barcode reader, GCD computes the greatest common divisor of its inputs, TLC represents a traffic light controller, and Findmin returns the index of the minimum element in an array.

The results obtained indicate that Wavesched-spec produced an average expected schedule length speedup of 2.8 over schedules obtained using Wavesched. Note that Wavesched [6] was reported to have achieved an average speedup of 2 over schedules produced by existing scheduling algorithms, such as path-based scheduling [10] and loop-directed scheduling [7].

To get an idea of the area overhead of this technique, we obtained a 16-bit RTL implementation for the GCD example using an in-house high-level synthesis system, for the schedules produced by Wavesched-spec and Wavesched. These RTL circuits were technology-mapped using the MSU library, and the areas of the gate-level circuits were obtained. The area overhead for the circuit produced from Wavesched-spec was found to be only 3.1%. We also note that for Wavesched-spec, the number of cycles in the shortest and longest paths is smaller than or equal to the corresponding number for Wavesched.
5 Conclusions

In this paper, we presented a technique for incorporating speculative execution into scheduling of control-flow intensive designs. We demonstrated that in order to fully exploit the power of speculative execution, one needs to integrate it with scheduling. We introduced a node-tagging scheme for the identification of operations which can be speculatively scheduled in a given state, and a heuristic to select the best operation to schedule. Our techniques were fully integrated into an existing scheduling algorithm which can support implicit unrolling of loops and functional pipelining of control-flow intensive behaviors, and can parallelize the execution of independent loops whose bodies share resources. Experimental results demonstrate that the presented techniques can improve the performance of the generated schedules significantly. Schedules produced using speculative execution were, on average, 2.8 times faster than schedules produced without its benefit.

References

[1] U. Holtmann and R. Ernst, "Experiments with low-level speculative computation based on multiple branch prediction," IEEE Trans. VLSI Systems, vol. 1, Sept.
[2] K. Wakabayashi and H. Tanaka, "Global scheduling independent of control dependencies based on condition vectors," in Proc. Design Automation Conf., June.
[3] U. Holtmann and R. Ernst, "Combining MBP-speculative computation and loop pipelining in high-level synthesis," in Proc. European Design & Test Conf., Mar.
[4] J. A. Fisher, "Trace scheduling: A technique for global microcode compaction," IEEE Trans. Computers, vol. 30, July.
[5] S. A. Mahlke et al., "Sentinel scheduling: A model for compiler-controlled speculative execution," IEEE Trans. Computers, vol. 11, Nov.
[6] G. Lakshminarayana, K. S. Khouri, and N. K. Jha, "Wavesched: A novel scheduling technique for control-flow intensive behavioral descriptions," in Proc. Int. Conf. Computer-Aided Design, Nov.
[7] S. Bhattacharya, S. Dey, and F. Brglez, "Performance analysis and optimization of schedules for conditional and loop-intensive specifications," in Proc. Design Automation Conf., June.
[8] M. Garey and D. Johnson, Computers and Intractability. W.H. Freeman & Company, New York.
[9] R. Jain, A. Majumdar, A. Sharma, and H. Wang, "Empirical evaluation of some high-level synthesis scheduling heuristics," in Proc. Design Automation Conf., June.
[10] R. Camposano, "Path-based scheduling for synthesis," IEEE Trans. Computer-Aided Design, vol. 10, Jan.


More information

Chapter S:II (continued)

Chapter S:II (continued) Chapter S:II (continued) II. Baic Search Algorithm Sytematic Search Graph Theory Baic State Space Search Depth-Firt Search Backtracking Breadth-Firt Search Uniform-Cot Search AND-OR Graph Baic Depth-Firt

More information

Distributed Resource Allocation Strategies for Achieving Quality of Service in Server Clusters

Distributed Resource Allocation Strategies for Achieving Quality of Service in Server Clusters Proeedings of the 45th IEEE Conferene on Deision & Control Manhester Grand Hyatt Hotel an Diego, CA, UA, Deember 13-15, 2006 Distributed Resoure Alloation trategies for Ahieving Quality of ervie in erver

More information

The Minimum Redundancy Maximum Relevance Approach to Building Sparse Support Vector Machines

The Minimum Redundancy Maximum Relevance Approach to Building Sparse Support Vector Machines The Minimum Redundany Maximum Relevane Approah to Building Sparse Support Vetor Mahines Xiaoxing Yang, Ke Tang, and Xin Yao, Nature Inspired Computation and Appliations Laboratory (NICAL), Shool of Computer

More information

Stochastic Search and Graph Techniques for MCM Path Planning Christine D. Piatko, Christopher P. Diehl, Paul McNamee, Cheryl Resch and I-Jeng Wang

Stochastic Search and Graph Techniques for MCM Path Planning Christine D. Piatko, Christopher P. Diehl, Paul McNamee, Cheryl Resch and I-Jeng Wang Stochatic Search and Graph Technique for MCM Path Planning Chritine D. Piatko, Chritopher P. Diehl, Paul McNamee, Cheryl Rech and I-Jeng Wang The John Hopkin Univerity Applied Phyic Laboratory, Laurel,

More information

SLA Adaptation for Service Overlay Networks

SLA Adaptation for Service Overlay Networks SLA Adaptation for Service Overlay Network Con Tran 1, Zbigniew Dziong 1, and Michal Pióro 2 1 Department of Electrical Engineering, École de Technologie Supérieure, Univerity of Quebec, Montréal, Canada

More information

Graph-Based vs Depth-Based Data Representation for Multiview Images

Graph-Based vs Depth-Based Data Representation for Multiview Images Graph-Based vs Depth-Based Data Representation for Multiview Images Thomas Maugey, Antonio Ortega, Pasal Frossard Signal Proessing Laboratory (LTS), Eole Polytehnique Fédérale de Lausanne (EPFL) Email:

More information

Fall 2010 EE457 Instructor: Gandhi Puvvada Date: 10/1/2010, Friday in SGM123 Name:

Fall 2010 EE457 Instructor: Gandhi Puvvada Date: 10/1/2010, Friday in SGM123 Name: Fall 2010 EE457 Intructor: Gandhi Puvvada Quiz (~ 10%) Date: 10/1/2010, Friday in SGM123 Name: Calculator and Cadence Verilog guide are allowed; Cloed-book, Cloed-note, Time: 12:00-2:15PM Total point:

More information

An Optimized Approach on Applying Genetic Algorithm to Adaptive Cluster Validity Index

An Optimized Approach on Applying Genetic Algorithm to Adaptive Cluster Validity Index IJCSES International Journal of Computer Sienes and Engineering Systems, ol., No.4, Otober 2007 CSES International 2007 ISSN 0973-4406 253 An Optimized Approah on Applying Geneti Algorithm to Adaptive

More information