Algorithmic Transformation Techniques for Efficient Exploration of Alternative Application Instances
In: Proc. 10th Int. Symposium on Hardware/Software Codesign (CODES 02), Estes Park, Colorado, USA, May 6-8, 2002

Todor Stefanov, Bart Kienhuis, Ed Deprettere
Leiden Institute of Advanced Computer Science, Leiden University, The Netherlands
stefanov@liacs.nl

ABSTRACT
Following the Y-chart paradigm for designing a system, an application and an architecture are modeled separately and mapped onto each other in an explicit design step. Next, a performance analysis for alternative application instances, architecture instances and mappings has to be done, thereby exploring the design space of the target system. Deriving alternative application instances is not trivial. Nevertheless, many instances of a single application exist that are worth deriving for exploration. In this paper, we present algorithmic transformation techniques for systematic and fast generation of alternative application instances that express, with varying degrees of explicitness, the task-level concurrency hidden in an application. These techniques help a system designer to significantly speed up the design space exploration process.

Keywords
system-level design, design space exploration, application instances, algorithmic transformations

1. INTRODUCTION
In system-level design of embedded signal-processing systems, a system designer sees the target system as the pair Application(s) specification - Architecture template. An example of such a pair is shown in the left part of Figure 1. The application specification provides the functional behavior of the system.
The architecture template specifies the organization of the resources of the system onto which the functional behavior is to be mapped. In this stage, a designer has to make some design decisions, for example, how to partition the application into tasks, how to map the tasks onto the architecture template, what kind of communication structure to use in the architecture template, etc. In order to evaluate different design decisions, a system designer uses a model of the target system and does performance analysis for alternative application instances, architecture instances and mappings, thereby exploring the design space of the Application - Architecture pair. A general scheme for design space exploration is the Y-chart paradigm [4]. Tools like SPADE [9] and ORAS [6] implement techniques that support the Y-chart paradigm but they focus only on the exploration of alternative architecture instances and mappings [8]. In this paper, however, we focus on techniques that support efficient exploration of alternative application instances in system-level design.

Figure 1: Alternative instances of the application have to be generated, mapped onto the architecture template and explored in order to evaluate the performance of the Application-Architecture pair.

An application instance is every partitioning of an application into a composition of concurrent tasks. We use the Kahn Process Network (KPN) model of computation [3] to describe application instances. In the Kahn model, concurrent processes communicate via unbounded FIFO channels. In Figure 1, we show a simple application and a set of alternative KPN instances of this application (KPN 1 to KPN 5). Each application instance differs from the others in the degree of exploited task-level parallelism. The performance of the Application - Architecture pair can depend significantly on the application instance.
So, a system designer needs support to generate and explore a set of instances of an application in order to evaluate the performance of the system and to choose an application partitioning that satisfies the requirements the target system has to meet.
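To make the Kahn model concrete, the following minimal Python sketch runs a four-process network in which a source distributes tokens over two workers and a sink collects the results. It is an illustration only, not output of the tools described in this paper: the function name `run_kpn`, the distribution of tokens by `i mod 2`, and the use of threads with `queue.Queue` objects as unbounded FIFO channels are all assumptions made for this example.

```python
import queue
import threading


def run_kpn(n, f):
    """Run a toy Kahn Process Network: source -> two workers -> sink.

    Channels are unbounded FIFOs (queue.Queue with no maxsize); every
    process blocks when it reads from an empty channel.
    """
    ch_in = [queue.Queue(), queue.Queue()]    # source -> worker k
    ch_out = [queue.Queue(), queue.Queue()]   # worker k -> sink
    result = []

    def source():
        # Token i goes to the worker selected by i mod 2.
        for i in range(1, n + 1):
            ch_in[i % 2].put(i)
        for ch in ch_in:
            ch.put(None)                      # end-of-stream marker (assumption)

    def worker(k):
        while True:
            token = ch_in[k].get()            # blocking read
            if token is None:
                break
            ch_out[k].put(f(token))

    def sink():
        # Reads in a fixed order, so the output is independent of scheduling.
        for i in range(1, n + 1):
            result.append(ch_out[i % 2].get())

    threads = [
        threading.Thread(target=source),
        threading.Thread(target=worker, args=(0,)),
        threading.Thread(target=worker, args=(1,)),
        threading.Thread(target=sink),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result


if __name__ == "__main__":
    print(run_kpn(6, lambda v: v * v))
```

Because each process only blocks on reads and the sink consumes its inputs in a fixed order, the result does not depend on how the threads are scheduled. This determinism is what allows differently partitioned instances of one application to be compared purely on performance.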
In general, a system designer is only able to derive at most a few alternative application instances. This is so because no systematic way is known to derive an application instance, let alone alternatives, from an application specification; as a result, heuristic and time-consuming approaches are taken in practice. Nevertheless, many instances of a single application exist that are worth deriving for exploration. We present in this paper algorithmic transformations that we have developed and implemented in order to help a system designer to derive alternative application instances systematically and fast. These transformations together with an aggressive parallel compiler called COMPAAN are encapsulated in an Application Transformation Layer that automatically generates a set of application instances. The transformations and the tools presented in this paper are not generally applicable, in the sense that the application specification has to be an affine nested loop program (NLP).

In the next section we show the position of the Application Transformation Layer in the Y-chart paradigm. In Section 3 two specific algorithmic transformations are given. The COMPAAN tool is briefly described in Section 4. In Section 5 we show how our algorithmic transformations are used in practice. In Section 6 we present a number of experiments and associated results. Finally, we discuss related work and draw conclusions in Section 7 and Section 8, respectively.

2. APPLICATION TRANSFORMATION LAYER
In this section, we discuss the application transformation layer in the context of the design space exploration process. We use this layer as an extension to the Y-chart environment [4].
The positioning of the transformation layer is shown in Figure 2.

Figure 2: The Y-chart extended with the Application Transformation Layer.

(Footnote: For lack of space we confine ourselves to only two such transformations. We have identified and implemented other transformations as well, e.g., plane-cutting, look-ahead, loop transformations. The approach and technique are uniform over all transformations.)

We start with an application specification written in an imperative language like Matlab or C and we have to generate and explore a set of instances (Kahn Process Networks) functionally equivalent to the application. First, algorithmic transformations are applied to the application specification. The transformations are controlled by a set of parameters. At the beginning some initial values are assigned to the parameters depending on the available resources in the architecture template. With these values, the original code of the application is automatically transformed and structured in a particular way in order to make the parallelism that is inherently available in the application explicit or to enhance the task-level parallelism in the application. Second, the transformed code is converted automatically to a KPN description by an aggressive parallel compiler called COMPAAN. Third, we use a Y-chart environment to map the KPN onto an architecture template and do performance analysis. The result of this performance analysis can be used to change the values of the parameters (step 4 in Figure 2) if the system performance is not satisfactory. Then, we repeat the procedure described above, resulting in a design space exploration of alternative instances of the application. This is shown in Figure 2 as a feed-back arrow to the transformation layer.
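The four-step loop of Figure 2 can be sketched as a small driver. The sketch below is a hedged illustration under stated assumptions: `explore`, `toy_generate` and `toy_analyse` are hypothetical names, the single integer parameter `u` stands in for the full parameter set, and the cost model (work divided over processes plus a per-process overhead) is made up purely so that the loop can run. In the real flow, steps 1 and 2 are performed by the transformation tools and the compiler, and step 3 by the Y-chart environment.

```python
def explore(initial_u, max_u, generate_instance, analyse, target):
    """Repeat steps 1-4 of the exploration loop for one integer parameter u.

    generate_instance(u) stands for steps 1 and 2 (transform, then compile
    to a process network); analyse(instance) stands for step 3 (mapping and
    performance analysis); increasing u is a trivial version of step 4.
    """
    history = []
    u = initial_u
    while u <= max_u:
        instance = generate_instance(u)   # steps 1 and 2
        cost = analyse(instance)          # step 3
        history.append((u, cost))
        if cost <= target:                # performance is satisfactory
            break
        u += 1                            # step 4: new parameter value
    return history


def toy_generate(u):
    # Hypothetical stand-in: an "instance" is just its number of processes.
    return {"processes": u}


def toy_analyse(instance, work=1200, overhead=20):
    # Made-up cost model, not measured data.
    p = instance["processes"]
    return work // p + overhead * p


if __name__ == "__main__":
    for u, cost in explore(1, 10, toy_generate, toy_analyse, target=400):
        print(u, cost)
```

The driver keeps the whole history rather than only the last point, because the purpose of the exploration is to compare instances, not merely to stop at the first acceptable one.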
By changing the values of the parameters, the application transformation layer automatically generates a set of KPNs corresponding to a single application. The difference among the KPNs is the degree of the task-level parallelism that is exploited. In the remainder of this paper we describe in more detail the techniques and tools we have developed and incorporated in the transformation layer.

3. ALGORITHMIC TRANSFORMATIONS
In this section, we present two algorithmic transformations, namely Unfolding and Skewing. These transformations take as input an affine nested loop program (NLP) [2] and a set of parameters. The output of the unfolding transformation is an affine nested loop program which is functionally equivalent to the input program but with enhanced task-level parallelism. The skewing transformation makes the potential parallelism in the input affine nested loop program explicit. We have developed and implemented these and other transformations in a tool box called MATTRANSFORM. The transformations in this tool box operate directly on the NLP source code without using an intermediate representation such as dependence graphs, signal-flow graphs or data-flow graphs corresponding to the NLP. First, we explain what unfolding and skewing mean in the context of our algorithmic transformations. Next, we define the unfolding and skewing transformations as procedures that operate on an affine nested loop program. For convenience, in our further explanations, we assume that affine nested loop programs (NLPs) are expressed in Matlab code. The NLPs could also be expressed in other imperative programming languages, for example C.

3.1 Unfolding and Skewing
Consider the application program (NLP) and its dependence graph (DG) shown in Figure 3-a). The DG is a graphical representation of the NLP. The nodes in the DG represent the NLP functions that are executed in each loop iteration and the edges represent the data dependences between the functions. The NLP has two loops (each with its own iterator) which can be unrolled to yield the DG.
Unlike common approaches, in which either the loop control is removed through loop unrolling [10] or the DG is folded [11], our new approach to get the desired degree of parallelism - at the task level - is to copy the loop body a number of times in such a way that these copies are mutually exclusive. We call this new approach unfolding and we have implemented it in our unfolding transformation. An example of our unfolding is shown in Figure 3-b), where the outer loop of the program in Figure 3-a) is unfolded by a factor of 2. The two pieces of code bounded by the if statements in Figure 3-b) are mutually exclusive. This mutual exclusiveness can be exploited by an aggressive parallel compiler to partition the program in Figure 3-b) into two processes (tasks) that can operate in parallel. The graphical interpretation of the unfolding transformation is given by the dependence graph in Figure 3-b). For this simple example the unfolding transformation partitions the computational workload over two parallel processes. The first process will execute the nodes bounded by the dashed boxes. The second process will execute the nodes bounded by the solid boxes. An example of the network connecting these two processes is shown in Figure 7 - see KPN 1. In general, our unfolding transformation is used to partition an NLP into $u$ processes, where $u$ is equal to the unfolding factor. The process network corresponding to a fully unfolded NLP is equal to the dependence graph of this NLP.

Figure 3: Simple example illustrating the unfolding and skewing transformations. a) Application program (NLP) and its dependence graph; b) NLP with unfolded loop by factor 2; c) NLP with skewed loop.

Now, consider the same application program (NLP) shown in Figure 3-a). The transformation of skewing is to create a new NLP in which the bounds of the loops and the indexes of the variables are changed in a particular way to make the potential parallelism in the original NLP explicit.
For example, skewing a loop of the program in Figure 3-a) leads to the NLP in Figure 3-c). The effect of our skewing transformation is visualized by the dependence graph (DG) in Figure 3-c). This DG explicitly shows that the nodes inside a dashed box can be executed in parallel because there are no data dependences between these nodes. This property can be exploited by an aggressive parallel compiler in combination with the unfolding described above to partition the program into processes (tasks) that run in parallel. An example of a network of such parallel processes corresponding to the NLP in Figure 3-c) is given in Figure 8 - see KPN 4. Moreover, inside these processes some pieces of code can be executed in parallel or in a pipeline fashion because of the skewing transformation. Note that in both cases (unfolding and skewing), the transformations proceed along the NLP code in Figure 3. The dependence graphs are only shown to visualize the effect of the transformations.

     UNFOLD(NLP, U, I)
       if (I is empty set)
 5       print(NLP);
         return();
       else
10       i = first element of the set I;
11       u = first element of the set U;
13       loop = take the code from the beginning of NLP till the "for"
                statement with loop iterator i, including;
16       body = take the body of loop i from NLP;
18       print(loop);
20       for (k = 1; k <= u; k++)
           println("if(" + i + " mod " + u + ")=" + (u-k) + ",");
           U' = the set U without the first element;
25         I' = the set I without the first element;
           UNFOLD(body, U', I');
           println("end");
         println("end");
         return();

Figure 4: Pseudo code describing the UNFOLD transformation.

3.2 Unfolding procedure
Let NLP be an N-deep affine nested loop program with an iteration vector $I = (i_1, i_2, \ldots, i_N)$. For each $i_k \in I$, $k = 1, 2, \ldots, N$, a parameter $u_k$ is associated. All these parameters form a parameter vector $U = (u_1, u_2, \ldots, u_N)$ which we call the unfolding vector. We define a transformation UNFOLD(NLP,U,I) which is described in Figure 4. The pseudo code in Figure 4 describes the unfolding transformation as a recursive procedure. This procedure operates on the affine nested loop program NLP with its iteration vector $I$ and the value of the unfolding vector $U$.
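The recursive procedure can be sketched in Python as a text generator. This is a minimal sketch under assumptions, not the MATTRANSFORM implementation: the function name `unfold` and its argument layout are hypothetical, the NLP is assumed to be perfectly nested with a single statement in the innermost body, and the guards are printed with Matlab's `mod(i, u)` function rather than the `(i mod u)` notation of the pseudo code.

```python
def unfold(headers, iterators, factors, statement, depth=0):
    """Return the lines of the unfolded NLP as a list of strings.

    headers[k]   : the 'for' line of loop k, e.g. "for i = 1:1:N,"
    iterators[k] : the iterator name of loop k
    factors[k]   : the unfolding factor u_k of loop k
    statement    : the innermost statement (assumed perfect nesting)
    """
    pad = "  " * depth
    if not iterators:                       # I is the empty set: emit the body
        return [pad + statement]
    head, it, u = headers[0], iterators[0], factors[0]
    lines = [pad + head]                    # print(loop)
    for k in range(1, u + 1):
        # One mutually exclusive copy of the body per residue u-k.
        lines.append(f"{pad}  if mod({it}, {u}) == {u - k},")
        lines.extend(unfold(headers[1:], iterators[1:], factors[1:],
                            statement, depth + 2))
        lines.append(f"{pad}  end")
    lines.append(pad + "end")
    return lines


if __name__ == "__main__":
    code = unfold(["for i = 1:1:N,"], ["i"], [2], "[x(i)] = F(x(i));")
    print("\n".join(code))
```

Every loop iteration satisfies exactly one of the `u` guards, which is why the copies can later be assigned to separate processes. An unfolding factor of 1 produces a single guard that is always true, leaving that loop effectively unpartitioned.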
In order to explain the behavior of the procedure UNFOLD we consider the following simple example. Let NLP be the program shown in the left part of Figure 5. NLP has only one loop with an iterator (index) $i$. Hence, the iteration vector $I$ corresponding to NLP has only one element, $I = (i)$, and the unfolding vector $U$ also has one element, $U = (u)$. In our example the parameter $u$ is equal to 10. Following the procedure UNFOLD, first we check whether $I$ is an empty set. In our example we start with $I = (i)$, which is not an empty set. Then, we initialize four variables, see lines 10, 11, 13 and 16 in Figure 4. As a result we have: variable i takes the character 'i'; variable u = 10; variable loop takes the string "for i = 1:1:N," and body takes the code in the body of the loop with iterator $i$. This code is marked in Figure 5 as a rectangle. Line 18 in Figure 4 prints the variable loop to the output. The result is shown in Figure 5 - the first line in the unfolded NLP. Executing lines 20 till 32 in Figure 4 will generate the rest of the code of the unfolded NLP in Figure 5. As a result the unfolded NLP in Figure 5 has ten copies of the loop body, each bounded by an if statement with a mod condition that makes the copies mutually exclusive. The example in Figure 5 shows that the input NLP is transformed into a functionally equivalent NLP which we call an unfolded NLP. The unfolded NLP can be easily converted into ten tasks that operate in parallel. That is why we say that the unfolded NLP has enhanced task-level parallelism compared with the input NLP.

Figure 5: Simple example illustrating the UNFOLD() transformation shown in Figure 4 (application program, U = {10}, I = {i}, and the resulting unfolded NLP).

3.3 Skewing procedure
Let NLP be an N-deep affine nested loop program with an iteration vector $I = (i_1, i_2, \ldots, i_N)$. For each $i_k \in I$, $k = 1, 2, \ldots, N$, a parameter vector $m_k = (m_{k1}, m_{k2}, \ldots, m_{kN})$ is associated, where each $m_{kl}$, $l = 1, 2, \ldots, N$, is an integer. All parameter vectors form a parameter matrix $M = (m_1, m_2, \ldots, m_N)$ which we call the skewing matrix. We require $M$ to be unimodular. We define a transformation SKEW(NLP,M) as described below:

STEP1 - Represent the iteration space of NLP as a polytope $P = \{ I \in \mathbb{Z}^N \mid A \cdot I \geq b \}$, where $A$ is an integral matrix and $b$ is an integral vector;
STEP2 - Use the skewing matrix $M$ to transform $P$ as follows: $A \cdot M^{-1} \cdot M \cdot I \geq b \Rightarrow A' \cdot I' \geq b$, where $A' = A \cdot M^{-1}$ and $I' = M \cdot I$;
STEP3 - Use the Fourier-Motzkin (FM) procedure [1] to represent the iteration space, described by $A' \cdot I' \geq b$, in terms of nested loops. This is the new iteration space of NLP with iteration vector $I'$;
STEP4 - Change all indexes of the variables in NLP according to the equation $I = M^{-1} \cdot I'$.

The four steps described above are illustrated in Figure 6 in the context of a simple example. We start with a 2-deep affine nested loop program and a skewing matrix $M$. In STEP1, the ranges of the two loop indexes are represented as a system of linear inequalities $A \cdot I \geq b$.
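As a rough check of the SKEW steps, the sketch below enumerates a rectangular 2-deep iteration space and its skewed counterpart and confirms that they contain the same points. Several things here are assumptions made for illustration: the iterator names `j` and `i`, the new indexes `p = j + i` and `q = i` (that is, $M$ with rows (1, 1) and (0, 1), so that $j = p - q$ and $i = q$), and the statement form in which each iteration touches only `y(i)` and `x(j)`. The loop bounds are written by hand in the `max`/`min` form that Fourier-Motzkin elimination yields for this case; the elimination itself is not implemented.

```python
def original_space(n, k):
    """All iterations (j, i) of: for j = 1:n, for i = 1:k."""
    return {(j, i) for j in range(1, n + 1) for i in range(1, k + 1)}


def skewed_space(n, k):
    """Iterate the skewed loops and map each point back to (j, i).

    Outer index p = j + i runs over 2..n+k; inner index q = i runs over
    max(1, p-n)..min(p-1, k). Points with the same p form one wavefront.
    """
    wavefronts = []
    for p in range(2, n + k + 1):
        front = []
        for q in range(max(1, p - n), min(p - 1, k) + 1):
            front.append((p - q, q))        # STEP4: j = p - q, i = q
        wavefronts.append(front)
    return wavefronts


def independent(front):
    """True if no two iterations in the wavefront share a j or an i."""
    js = [j for j, _ in front]
    is_ = [i for _, i in front]
    return len(set(js)) == len(js) and len(set(is_)) == len(is_)


if __name__ == "__main__":
    fronts = skewed_space(4, 3)
    flat = [pt for front in fronts for pt in front]
    print(set(flat) == original_space(4, 3), all(map(independent, fronts)))
```

Under the assumed statement form, iterations on the same wavefront use distinct `x` and `y` elements, so they carry no data dependences among themselves. This is the parallelism that skewing makes explicit in the loop structure.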
Next, we use the skewing matrix to do the mathematical manipulations described in STEP2. As a result we have a new iteration space for the input NLP, defined by the new loop indexes and bounded by the system $A' \cdot I' \geq b$. The Fourier-Motzkin (FM) procedure is used to represent the new iteration space as nested loops, as shown in Figure 6 - STEP3. After this step all variables inside the loops are still indexed by the old indexes. We have to replace them with the new indexes. In order to do this we know from STEP2 that $I' = M \cdot I$, which implies that $I = M^{-1} \cdot I'$. So, in all variables we have to replace each old index with the corresponding expression in the new indexes. This is illustrated in Figure 6 - STEP4.

Figure 6: Simple example illustrating the four steps in the SKEW(NLP,M) procedure.

4. COMPILER
In this section, we briefly describe our aggressive parallel compiler COMPAAN which exploits the result of the transformations presented in Section 3. COMPAAN (Compilation of Matlab to Process Networks) [7] is a method and tool set (MATPARSER, DGPARSER, PANDA) for transforming affine nested loop programs (NLP) [2] written in Matlab into a Kahn Process Network (KPN) specification. COMPAAN starts the transformation by converting a Matlab specification into a single assignment code (SAC) specification. SAC describes all parallelism available in the original Matlab specification. The tool which does the Matlab-to-SAC transformation is MATPARSER [5]. MATPARSER is an array dataflow analysis compiler that finds all parallelism available in NLPs written in Matlab using a very aggressive data-dependency analysis technique. This technique is based on parametric integer linear programming. Also, MATPARSER can handle non-linear operators like Max, Min, Ceil, Floor, Mod and Div. Therefore, it can handle the result of the skewing and unfolding transformations presented in Section 3.
Next, a tool called DGPARSER [2] converts the SAC description into a Polyhedral Reduced Dependence Graph (PRDG) [7] description. The PRDG is a compact graphical representation of the SAC using parameterized polyhedral embeddings of the atomic functions. Finally, the PANDA tool [7] uses the PRDG description in order to generate the Kahn Process Network description and the individual processes.

Figure 7: An example of generating two possible Kahn Process Networks from a single application using the unfolding transformation and the COMPAAN tool (KPN 1 with U = [u1, u2] = [2,1]; KPN 2 with U = [u1, u2] = [2,2]).

Figure 8: An example of generating two possible Kahn Process Networks from a single application using the skewing and unfolding transformations and the COMPAAN tool (KPN 3 with Skew(M); KPN 4 with Skew(M) + Unfold(U), U = [u1, u2] = [2,1]).

5. EXAMPLES
In this section, we demonstrate the use of our algorithmic transformations in combination with the COMPAAN tool set. We show how, merely by changing the values of the parameters, a set of Kahn Process Networks (KPN) can be easily generated from a single application. Consider the application shown in the top-left corner of Figure 7. It is a 2-deep affine nested loop program written in Matlab. In Figure 7 we first apply the unfolding transformation to our application and then we use COMPAAN to convert the transformed code into a KPN description. We assign two different values to the parameter vector $U$, namely $U = [2,1]$ and $U = [2,2]$. As a result we obtain two different KPNs. They have different numbers of processes and different communication structures (see Figure 7 - KPN 1 and KPN 2). In Figure 8, we show another example in which we use the same application as in Figure 7.
We obtain KPN 3, which has only one process, by applying the skewing transformation with a skewing matrix $M$. Also, we show that the skewing transformation and the unfolding transformation can be applied in combination. KPN 4 in Figure 8 is derived by applying first the skewing transformation and then the unfolding transformation with $U = [2,1]$.

6. EXPERIMENTS AND RESULTS
In this section, we present some of the experiments we have done in order to evaluate and show the usefulness of the algorithmic transformation techniques presented in this paper. We built a Y-chart environment extended with the Application Transformation Layer as shown in Figure 2. As an input application for the transformation layer we used the QR-decomposition algorithm [12] because it is a common computationally intensive task in many signal processing applications like Digital Beamforming, Adaptive Digital Filtering, etc. The algorithm was written in Matlab. The application transformation layer applies the Unfolding and Skewing transformations to the QR algorithm and generates alternative application instances - Process Networks - as synthesizable VHDL. We mapped these instances onto a Xilinx XCV1000E FPGA device which was the architecture template for our experiments. The mapping was done by a synthesizer and place-and-route tools provided by Xilinx. The performance analysis was done using the timing analysis and simulation tools from the Xilinx Foundation package. Figure 9 shows the estimated total execution time for three application instances of the QR-decomposition algorithm. These instances were derived automatically by applying the transformation techniques presented in Section 3.

Figure 9: Execution time (in microseconds) of the QR algorithm transformed by using the unfolding and skewing transformations, for three instances: No transform, Unfolding, and Skewing + Unfolding. The unfolding factor is 3 and the size of the input data matrix is 10 by 6.

The results show that the effect of applying our transformations is that we can generate alternative application instances with different performance when mapping them onto an architecture template (in our case an FPGA).
It can be seen from Figure 9 that the unfolding and skewing transformations significantly improve the performance. Figure 10 shows the results obtained from the exploration of the performance of ten application instances of the QR algorithm derived by applying only the unfolding transformation with unfolding factors from 1 to 10. Again, the results show that the performance can be significantly improved. In this experiment we also measured how much time it takes to obtain the results presented in Figure 10. The time taken for these ten experiments to be processed
automatically from Matlab to a hardware mapping onto an FPGA and VHDL simulation was within 18 hours.

Figure 10: Exploration of the performance (number of cycles versus unfolding factor) of the QR algorithm unfolded by factors from 1 to 10. The size of the input data matrix is 48 by 6.

Table 1 shows the processing times for some of the experiments in more detail. The row Transform+Compile shows the processing times for our tools MATTRANSFORM and COMPAAN (step 1 and step 2 in Figure 2). The row Mapping+Simulation gives the time needed to express the Process Networks in terms of synthesizable VHDL code, to map this VHDL code onto an FPGA and finally to obtain performance numbers from VHDL simulation (step 3 in Figure 2).

Table 1: Processing Times (hh:mm:ss).
                      Unfold 2    Unfold 5    Unfold 10
Transform+Compile     00:00:08    00:00:18    00:00:29
Mapping+Simulation    00:22:54    01:24:44    04:47:30
Total                 00:23:02    01:25:02    04:47:59

The last row of Table 1 suggests that an extensive design space exploration of alternative application instances can be done in a relatively short amount of time. Moreover, the accuracy of the results obtained during the exploration is within 5%, because we did very detailed cycle-accurate VHDL simulation. The Transform+Compile row of Table 1 shows that the application transformation layer presented in Section 2 generates alternative application instances from a given application very quickly. The time to do this is only a few seconds, whereas the time to map the instances onto an FPGA and simulate them varies from minutes to hours - see the Mapping+Simulation row of Table 1. However, there is potential to reduce the mapping and simulation time by using system-level design space exploration tools like SPADE [9] and ORAS [6]. Preliminary results indicate that the mapping and simulation time can be reduced to a few minutes instead of several hours while still obtaining performance numbers with reasonable accuracy.

7. RELATED WORK
The Unfolding and Skewing transformations presented in this paper are related to the unfolding and retiming transformation techniques used in the Signal-Processing community [11]. Also, they are related to the loop unrolling and loop skewing techniques used in compiler design [10]. However, there are some important differences. First, we use our transformations for generating a set of Kahn Process Networks corresponding to an application (nested loop program), thereby generating alternative application instances. Using the Unfolding transformation to generate Process Networks we do reverse partitioning compared to [13]. We start by putting all computational workload in one process and by unfolding we partition the workload over more processes. Second, we developed procedures to do these transformations on the algorithmic (source code) level, whereas in [11] similar transformations are applied on signal-flow graphs, data-flow graphs or dependence graphs corresponding to an algorithm. Third, our transformations aim at exposing and exploiting the task-level parallelism available in an application, whereas the transformations in [10] aim at exploiting the fine-grain instruction-level parallelism.

8. CONCLUSIONS
In this paper, we presented algorithmic transformation techniques for deriving a set of application instances (Kahn Process Networks) corresponding to an application. These techniques support a system designer in exploring alternative instances of an application mapped onto an architecture template. We have implemented our techniques in the tools MATTRANSFORM and COMPAAN, which means that the process of deriving alternative instances is fully automated for applications described as affine nested loop programs. Therefore, the presented techniques help a system designer to significantly speed up the process of exploring alternative application instances in system-level design. Our experiments and results show that an extensive design space exploration of alternative application instances can be done in a relatively short amount of time with accuracy of the results within 5%.

9. REFERENCES
[1] C. Ancourt and F. Irigoin. Scanning polyhedra with DO loops. In Proc. ACM SIGPLAN 91, pages 39-50, June 1991.
[2] P. Held. Functional Design of Data-Flow Networks, 1996. PhD thesis, Delft University of Technology, The Netherlands.
[3] G. Kahn. The semantics of a simple language for parallel programming. In Proc. of the IFIP Congress 74. North-Holland Publishing Co., 1974.
[4] B. Kienhuis. Design Space Exploration of Stream-based Dataflow Architectures: Methods and Tools, Jan. PhD thesis, Delft University of Technology, The Netherlands.
[5] B. Kienhuis. MatParser: An array dataflow analysis compiler. Technical report, University of California at Berkeley, UCB/ERL M00/9.
[6] B. Kienhuis, E. Deprettere, K. Vissers, and P. van der Wolf. The Construction of a Retargetable Simulator for an Architecture Template. In Proc. 6th Int. Workshop on Hardware/Software Codesign (CODES 98), Seattle, Washington, Mar.
[7] B. Kienhuis, E. Rijpkema, and E. F. Deprettere. Compaan: Deriving Process Networks from Matlab for Embedded Signal Processing Architectures. In Proc. 8th International Workshop on Hardware/Software Codesign (CODES 2000), San Diego, CA, USA, May.
[8] P. Lieverse, T. Stefanov, P. van der Wolf, and E. Deprettere. System Level Design with SPADE: an M-JPEG Case Study. In Proc. Int. Conference on Computer Aided Design (ICCAD 01), pages 31-38, San Jose CA, USA, Nov.
[9] P. Lieverse, P. van der Wolf, K. Vissers, and E. Deprettere. A Methodology for Architecture Exploration of Heterogeneous Signal Processing Systems. Int. Journal of VLSI Signal Processing for Signal, Image and Video Technology, 29(3):197-207, 2001.
[10] S. Muchnick. Advanced Compiler Design and Implementation. Morgan Kaufmann Publishers, Inc., 1997.
[11] K. Parhi. VLSI Digital Signal Processing Systems: Design and Implementation. John Wiley & Sons, Inc., 1999.
[12] J. Proakis, C. Rader, F. Ling, C. Nikias, M. Moonen, and I. Proudler. Algorithms for Statistical Signal Processing. Prentice Hall, Inc.
[13] J. Teich and L. Thiele. Exact Partitioning of Affine Dependence Algorithms. Lecture Notes in Computer Science (LNCS), Springer, 2268:133-151, 2002.
CPS 0 Introducton to Computer Scence Lecture Notes Chapter : Algorthm Desgn How should we present algorthms? Natural languages lke Englsh, Spansh, or French whch are rch n nterpretaton and meanng are not
More informationSLAM Summer School 2006 Practical 2: SLAM using Monocular Vision
SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,
More informationAssignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009.
Farrukh Jabeen Algorthms 51 Assgnment #2 Due Date: June 15, 29. Assgnment # 2 Chapter 3 Dscrete Fourer Transforms Implement the FFT for the DFT. Descrbed n sectons 3.1 and 3.2. Delverables: 1. Concse descrpton
More informationMemory Modeling in ESL-RTL Equivalence Checking
11.4 Memory Modelng n ESL-RTL Equvalence Checkng Alfred Koelbl 2025 NW Cornelus Pass Rd. Hllsboro, OR 97124 koelbl@synopsys.com Jerry R. Burch 2025 NW Cornelus Pass Rd. Hllsboro, OR 97124 burch@synopsys.com
More informationAssembler. Building a Modern Computer From First Principles.
Assembler Buldng a Modern Computer From Frst Prncples www.nand2tetrs.org Elements of Computng Systems, Nsan & Schocken, MIT Press, www.nand2tetrs.org, Chapter 6: Assembler slde Where we are at: Human Thought
More informationModule Management Tool in Software Development Organizations
Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,
More informationAn Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices
Internatonal Mathematcal Forum, Vol 7, 2012, no 52, 2549-2554 An Applcaton of the Dulmage-Mendelsohn Decomposton to Sparse Null Space Bases of Full Row Rank Matrces Mostafa Khorramzadeh Department of Mathematcal
More informationA SYSTOLIC APPROACH TO LOOP PARTITIONING AND MAPPING INTO FIXED SIZE DISTRIBUTED MEMORY ARCHITECTURES
A SYSOLIC APPROACH O LOOP PARIIONING AND MAPPING INO FIXED SIZE DISRIBUED MEMORY ARCHIECURES Ioanns Drosts, Nektaros Kozrs, George Papakonstantnou and Panayots sanakas Natonal echncal Unversty of Athens
More informationAn Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation
17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed
More informationImproving High Level Synthesis Optimization Opportunity Through Polyhedral Transformations
Improvng Hgh Level Synthess Optmzaton Opportunty Through Polyhedral Transformatons We Zuo 2,5, Yun Lang 1, Peng L 1, Kyle Rupnow 3, Demng Chen 2,3 and Jason Cong 1,4 1 Center for Energy-Effcent Computng
More informationImprovement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration
Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,
More informationParallel matrix-vector multiplication
Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more
More informationWishing you all a Total Quality New Year!
Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma
More informationLoop Transformations for Parallelism & Locality. Review. Scalar Expansion. Scalar Expansion: Motivation
Loop Transformatons for Parallelsm & Localty Last week Data dependences and loops Loop transformatons Parallelzaton Loop nterchange Today Scalar expanson for removng false dependences Loop nterchange Loop
More informationSupport Vector Machines
/9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.
More informationTN348: Openlab Module - Colocalization
TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages
More informationVerification by testing
Real-Tme Systems Specfcaton Implementaton System models Executon-tme analyss Verfcaton Verfcaton by testng Dad? How do they know how much weght a brdge can handle? They drve bgger and bgger trucks over
More informationType-2 Fuzzy Non-uniform Rational B-spline Model with Type-2 Fuzzy Data
Malaysan Journal of Mathematcal Scences 11(S) Aprl : 35 46 (2017) Specal Issue: The 2nd Internatonal Conference and Workshop on Mathematcal Analyss (ICWOMA 2016) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES
More informationLoop Permutation. Loop Transformations for Parallelism & Locality. Legality of Loop Interchange. Loop Interchange (cont)
Loop Transformatons for Parallelsm & Localty Prevously Data dependences and loops Loop transformatons Parallelzaton Loop nterchange Today Loop nterchange Loop transformatons and transformaton frameworks
More informationLecture 5: Multilayer Perceptrons
Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented
More informationA Facet Generation Procedure. for solving 0/1 integer programs
A Facet Generaton Procedure for solvng 0/ nteger programs by Gyana R. Parja IBM Corporaton, Poughkeepse, NY 260 Radu Gaddov Emery Worldwde Arlnes, Vandala, Oho 45377 and Wlbert E. Wlhelm Teas A&M Unversty,
More informationConcurrent models of computation for embedded software
Concurrent models of computaton for embedded software and hardware! Researcher overvew what t looks lke semantcs what t means and how t relates desgnng an actor language actor propertes and how to represent
More informationRelated-Mode Attacks on CTR Encryption Mode
Internatonal Journal of Network Securty, Vol.4, No.3, PP.282 287, May 2007 282 Related-Mode Attacks on CTR Encrypton Mode Dayn Wang, Dongda Ln, and Wenlng Wu (Correspondng author: Dayn Wang) Key Laboratory
More informationData Representation in Digital Design, a Single Conversion Equation and a Formal Languages Approach
Data Representaton n Dgtal Desgn, a Sngle Converson Equaton and a Formal Languages Approach Hassan Farhat Unversty of Nebraska at Omaha Abstract- In the study of data representaton n dgtal desgn and computer
More informationParallel Inverse Halftoning by Look-Up Table (LUT) Partitioning
Parallel Inverse Halftonng by Look-Up Table (LUT) Parttonng Umar F. Sddq and Sadq M. Sat umar@ccse.kfupm.edu.sa, sadq@kfupm.edu.sa KFUPM Box: Department of Computer Engneerng, Kng Fahd Unversty of Petroleum
More informationMathematics 256 a course in differential equations for engineering students
Mathematcs 56 a course n dfferental equatons for engneerng students Chapter 5. More effcent methods of numercal soluton Euler s method s qute neffcent. Because the error s essentally proportonal to the
More informationX- Chart Using ANOM Approach
ISSN 1684-8403 Journal of Statstcs Volume 17, 010, pp. 3-3 Abstract X- Chart Usng ANOM Approach Gullapall Chakravarth 1 and Chaluvad Venkateswara Rao Control lmts for ndvdual measurements (X) chart are
More informationS1 Note. Basis functions.
S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type
More informationHigh-Boost Mesh Filtering for 3-D Shape Enhancement
Hgh-Boost Mesh Flterng for 3-D Shape Enhancement Hrokazu Yagou Λ Alexander Belyaev y Damng We z Λ y z ; ; Shape Modelng Laboratory, Unversty of Azu, Azu-Wakamatsu 965-8580 Japan y Computer Graphcs Group,
More informationAgenda & Reading. Simple If. Decision-Making Statements. COMPSCI 280 S1C Applications Programming. Programming Fundamentals
Agenda & Readng COMPSCI 8 SC Applcatons Programmng Programmng Fundamentals Control Flow Agenda: Decsonmakng statements: Smple If, Ifelse, nested felse, Select Case s Whle, DoWhle/Untl, For, For Each, Nested
More informationThe stream cipher MICKEY-128 (version 1) Algorithm specification issue 1.0
The stream cpher MICKEY-128 (verson 1 Algorthm specfcaton ssue 1. Steve Babbage Vodafone Group R&D, Newbury, UK steve.babbage@vodafone.com Matthew Dodd Independent consultant matthew@mdodd.net www.mdodd.net
More informationPHYSICS-ENHANCED L-SYSTEMS
PHYSICS-ENHANCED L-SYSTEMS Hansrud Noser 1, Stephan Rudolph 2, Peter Stuck 1 1 Department of Informatcs Unversty of Zurch, Wnterthurerstr. 190 CH-8057 Zurch Swtzerland noser(stuck)@f.unzh.ch, http://www.f.unzh.ch/~noser(~stuck)
More informationBrave New World Pseudocode Reference
Brave New World Pseudocode Reference Pseudocode s a way to descrbe how to accomplsh tasks usng basc steps lke those a computer mght perform. In ths week s lab, you'll see how a form of pseudocode can be
More informationSum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints
Australan Journal of Basc and Appled Scences, 2(4): 1204-1208, 2008 ISSN 1991-8178 Sum of Lnear and Fractonal Multobjectve Programmng Problem under Fuzzy Rules Constrants 1 2 Sanjay Jan and Kalash Lachhwan
More information2x x l. Module 3: Element Properties Lecture 4: Lagrange and Serendipity Elements
Module 3: Element Propertes Lecture : Lagrange and Serendpty Elements 5 In last lecture note, the nterpolaton functons are derved on the bass of assumed polynomal from Pascal s trangle for the fled varable.
More informationReal-Time Systems. Real-Time Systems. Verification by testing. Verification by testing
EDA222/DIT161 Real-Tme Systems, Chalmers/GU, 2014/2015 Lecture #8 Real-Tme Systems Real-Tme Systems Lecture #8 Specfcaton Professor Jan Jonsson Implementaton System models Executon-tme analyss Department
More informationNUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS
ARPN Journal of Engneerng and Appled Scences 006-017 Asan Research Publshng Network (ARPN). All rghts reserved. NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS Igor Grgoryev, Svetlana
More informationHelsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)
Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute
More informationCourse Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms
Course Introducton Course Topcs Exams, abs, Proects A quc loo at a few algorthms 1 Advanced Data Structures and Algorthms Descrpton: We are gong to dscuss algorthm complexty analyss, algorthm desgn technques
More informationComputer models of motion: Iterative calculations
Computer models o moton: Iteratve calculatons OBJECTIVES In ths actvty you wll learn how to: Create 3D box objects Update the poston o an object teratvely (repeatedly) to anmate ts moton Update the momentum
More informationLearning the Kernel Parameters in Kernel Minimum Distance Classifier
Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department
More informationIntro. Iterators. 1. Access
Intro Ths mornng I d lke to talk a lttle bt about s and s. We wll start out wth smlartes and dfferences, then we wll see how to draw them n envronment dagrams, and we wll fnsh wth some examples. Happy
More informationHarvard University CS 101 Fall 2005, Shimon Schocken. Assembler. Elements of Computing Systems 1 Assembler (Ch. 6)
Harvard Unversty CS 101 Fall 2005, Shmon Schocken Assembler Elements of Computng Systems 1 Assembler (Ch. 6) Why care about assemblers? Because Assemblers employ some nfty trcks Assemblers are the frst
More informationA Fast Visual Tracking Algorithm Based on Circle Pixels Matching
A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng
More information3D vector computer graphics
3D vector computer graphcs Paolo Varagnolo: freelance engneer Padova Aprl 2016 Prvate Practce ----------------------------------- 1. Introducton Vector 3D model representaton n computer graphcs requres
More information6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour
6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the
More informationWavefront Reconstructor
A Dstrbuted Smplex B-Splne Based Wavefront Reconstructor Coen de Vsser and Mchel Verhaegen 14-12-201212 2012 Delft Unversty of Technology Contents Introducton Wavefront reconstructon usng Smplex B-Splnes
More informationAlgorithm To Convert A Decimal To A Fraction
Algorthm To Convert A ecmal To A Fracton by John Kennedy Mathematcs epartment Santa Monca College 1900 Pco Blvd. Santa Monca, CA 90405 jrkennedy6@gmal.com Except for ths comment explanng that t s blank
More informationA HIERARCHICAL SIMULATION FRAMEWORK FOR APPLICATION DEVELOPMENT ON SYSTEM-ON-CHIP ARCHITECTURES. Vaibhav Mathur and Viktor K.
A HIERARCHICAL SIMULATION FRAMEWORK FOR APPLICATION DEVELOPMENT ON SYSTEM-ON-CHIP ARCHITECTURES Vabhav Mathur and Vktor K. Prasanna Department of EE-Systems Unversty of Southern Calforna Los Angeles, CA
More informationAn Entropy-Based Approach to Integrated Information Needs Assessment
Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology
More informationContent Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers
IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth
More informationSkew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach
Angle Estmaton and Correcton of Hand Wrtten, Textual and Large areas of Non-Textual Document Images: A Novel Approach D.R.Ramesh Babu Pyush M Kumat Mahesh D Dhannawat PES Insttute of Technology Research
More informationComparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments
Comparson of Heurstcs for Schedulng Independent Tasks on Heterogeneous Dstrbuted Envronments Hesam Izakan¹, Ath Abraham², Senor Member, IEEE, Václav Snášel³ ¹ Islamc Azad Unversty, Ramsar Branch, Ramsar,
More informationFor instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1)
Secton 1.2 Subsets and the Boolean operatons on sets If every element of the set A s an element of the set B, we say that A s a subset of B, or that A s contaned n B, or that B contans A, and we wrte A
More informationSimulation Based Analysis of FAST TCP using OMNET++
Smulaton Based Analyss of FAST TCP usng OMNET++ Umar ul Hassan 04030038@lums.edu.pk Md Term Report CS678 Topcs n Internet Research Sprng, 2006 Introducton Internet traffc s doublng roughly every 3 months
More informationA fault tree analysis strategy using binary decision diagrams
Loughborough Unversty Insttutonal Repostory A fault tree analyss strategy usng bnary decson dagrams Ths tem was submtted to Loughborough Unversty's Insttutonal Repostory by the/an author. Addtonal Informaton:
More informationLoad Balancing for Hex-Cell Interconnection Network
Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,
More informationConfiguration Management in Multi-Context Reconfigurable Systems for Simultaneous Performance and Power Optimizations*
Confguraton Management n Mult-Context Reconfgurable Systems for Smultaneous Performance and Power Optmzatons* Rafael Maestre, Mlagros Fernandez Departamento de Arqutectura de Computadores y Automátca Unversdad
More informationMultigranular Simulation of Heterogeneous Embedded Systems
Multgranular Smulaton of Heterogeneous Embedded Systems Adtya Agrawal Insttute for Software Integrated Systems Vanderblt Unversty Nashvlle, TN - 37235 1 615 343 7567 adtya.agrawal@vanderblt.edu Akos Ledecz
More informationLecture 15: Memory Hierarchy Optimizations. I. Caches: A Quick Review II. Iteration Space & Loop Transformations III.
Lecture 15: Memory Herarchy Optmzatons I. Caches: A Quck Revew II. Iteraton Space & Loop Transformatons III. Types of Reuse ALSU 7.4.2-7.4.3, 11.2-11.5.1 15-745: Memory Herarchy Optmzatons Phllp B. Gbbons
More informationPetri Net Based Software Dependability Engineering
Proc. RELECTRONIC 95, Budapest, pp. 181-186; October 1995 Petr Net Based Software Dependablty Engneerng Monka Hener Brandenburg Unversty of Technology Cottbus Computer Scence Insttute Postbox 101344 D-03013
More informationModel Integrated Computing: A Framework for Creating Domain Specific Design Environments
Model Integrated Computng: A Framework for Creatng Doman Specfc Desgn Envronments James R. DAVIS Vanderblt Unversty, Insttute for Software Integrated Systems Nashvlle, TN 37203, USA ABSTRACT Model Integrated
More informationAn Image Fusion Approach Based on Segmentation Region
Rong Wang, L-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu An Image Fuson Approach Based On Segmentaton Regon An Image Fuson Approach Based on Segmentaton Regon Rong Wang, L-Qun Gao, Shu Yang 3, Yu-Hua
More informationGSLM Operations Research II Fall 13/14
GSLM 58 Operatons Research II Fall /4 6. Separable Programmng Consder a general NLP mn f(x) s.t. g j (x) b j j =. m. Defnton 6.. The NLP s a separable program f ts objectve functon and all constrants are
More informationNews. Recap: While Loop Example. Reading. Recap: Do Loop Example. Recap: For Loop Example
Unversty of Brtsh Columba CPSC, Intro to Computaton Jan-Apr Tamara Munzner News Assgnment correctons to ASCIIArtste.java posted defntely read WebCT bboards Arrays Lecture, Tue Feb based on sldes by Kurt
More informationVectorization of Image Outlines Using Rational Spline and Genetic Algorithm
01 Internatonal Conference on Image, Vson and Computng (ICIVC 01) IPCSIT vol. 50 (01) (01) IACSIT Press, Sngapore DOI: 10.776/IPCSIT.01.V50.4 Vectorzaton of Image Outlnes Usng Ratonal Splne and Genetc
More informationEdge Detection in Noisy Images Using the Support Vector Machines
Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona
More informationAssembler. Shimon Schocken. Spring Elements of Computing Systems 1 Assembler (Ch. 6) Compiler. abstract interface.
IDC Herzlya Shmon Schocken Assembler Shmon Schocken Sprng 2005 Elements of Computng Systems 1 Assembler (Ch. 6) Where we are at: Human Thought Abstract desgn Chapters 9, 12 abstract nterface H.L. Language
More informationArray transposition in CUDA shared memory
Array transposton n CUDA shared memory Mke Gles February 19, 2014 Abstract Ths short note s nspred by some code wrtten by Jeremy Appleyard for the transposton of data through shared memory. I had some
More informationMachine Learning: Algorithms and Applications
14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of
More informationCommunication-Minimal Partitioning and Data Alignment for Af"ne Nested Loops
Communcaton-Mnmal Parttonng and Data Algnment for Af"ne Nested Loops HYUK-JAE LEE 1 AND JOSÉ A. B. FORTES 2 1 Department of Computer Scence, Lousana Tech Unversty, Ruston, LA 71272, USA 2 School of Electrcal
More informationMeta-heuristics for Multidimensional Knapsack Problems
2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,
More informationPerformance Study of Parallel Programming on Cloud Computing Environments Using MapReduce
Performance Study of Parallel Programmng on Cloud Computng Envronments Usng MapReduce Wen-Chung Shh, Shan-Shyong Tseng Department of Informaton Scence and Applcatons Asa Unversty Tachung, 41354, Tawan
More informationSorting Review. Sorting. Comparison Sorting. CSE 680 Prof. Roger Crawfis. Assumptions
Sortng Revew Introducton to Algorthms Qucksort CSE 680 Prof. Roger Crawfs Inserton Sort T(n) = Θ(n 2 ) In-place Merge Sort T(n) = Θ(n lg(n)) Not n-place Selecton Sort (from homework) T(n) = Θ(n 2 ) In-place
More informationA New Approach For the Ranking of Fuzzy Sets With Different Heights
New pproach For the ankng of Fuzzy Sets Wth Dfferent Heghts Pushpnder Sngh School of Mathematcs Computer pplcatons Thapar Unversty, Patala-7 00 Inda pushpndersnl@gmalcom STCT ankng of fuzzy sets plays
More informationMULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION
MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and
More informationSENSITIVITY ANALYSIS IN LINEAR PROGRAMMING USING A CALCULATOR
SENSITIVITY ANALYSIS IN LINEAR PROGRAMMING USING A CALCULATOR Judth Aronow Rchard Jarvnen Independent Consultant Dept of Math/Stat 559 Frost Wnona State Unversty Beaumont, TX 7776 Wnona, MN 55987 aronowju@hal.lamar.edu
More informationSmoothing Spline ANOVA for variable screening
Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory
More informationCS1100 Introduction to Programming
Factoral (n) Recursve Program fact(n) = n*fact(n-) CS00 Introducton to Programmng Recurson and Sortng Madhu Mutyam Department of Computer Scence and Engneerng Indan Insttute of Technology Madras nt fact
More informationFEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur
FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents
More informationExplicit Formulas and Efficient Algorithm for Moment Computation of Coupled RC Trees with Lumped and Distributed Elements
Explct Formulas and Effcent Algorthm for Moment Computaton of Coupled RC Trees wth Lumped and Dstrbuted Elements Qngan Yu and Ernest S.Kuh Electroncs Research Lab. Unv. of Calforna at Berkeley Berkeley
More informationThe Shortest Path of Touring Lines given in the Plane
Send Orders for Reprnts to reprnts@benthamscence.ae 262 The Open Cybernetcs & Systemcs Journal, 2015, 9, 262-267 The Shortest Path of Tourng Lnes gven n the Plane Open Access Ljuan Wang 1,2, Dandan He
More informationConcurrent Apriori Data Mining Algorithms
Concurrent Apror Data Mnng Algorthms Vassl Halatchev Department of Electrcal Engneerng and Computer Scence York Unversty, Toronto October 8, 2015 Outlne Why t s mportant Introducton to Assocaton Rule Mnng
More informationTerm Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task
Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto
More information