Algorithmic Transformation Techniques for Efficient Exploration of Alternative Application Instances

In: Proc. 10th Int. Symposium on Hardware/Software Codesign (CODES 02), Estes Park, Colorado, USA, May 6-8, 2002

Todor Stefanov
Leiden Institute of Advanced Computer Science, Leiden University, The Netherlands
stefanov@liacs.nl

Bart Kienhuis
Leiden Institute of Advanced Computer Science, Leiden University, The Netherlands

Ed Deprettere
Leiden Institute of Advanced Computer Science, Leiden University, The Netherlands

ABSTRACT

Following the Y-chart paradigm for designing a system, an application and an architecture are modeled separately and mapped onto each other in an explicit design step. Next, a performance analysis for alternative application instances, architecture instances and mappings has to be done, thereby exploring the design space of the target system. Deriving alternative application instances is not trivial. Nevertheless, many instances of a single application exist that are worth deriving for exploration. In this paper, we present algorithmic transformation techniques for systematic and fast generation of alternative application instances that express, with varying degrees of explicitness, the task-level concurrency hidden in an application. These techniques help a system designer to significantly speed up the design space exploration process.

Keywords

system-level design, design space exploration, application instances, algorithmic transformations

1. INTRODUCTION

In system-level design of embedded signal-processing systems, a system designer sees the target system as the pair Application(s) specification - Architecture template. An example of such a pair is shown in the left part of Figure 1. The application specification provides the functional behavior of the system. The architecture template specifies the organization of the resources of the system onto which the functional behavior is to be mapped. In this stage, a designer has to make some design decisions, for example, how to partition the application into tasks, how to map the tasks onto the architecture template, what kind of communication structure to use in the architecture template, etc. In order to evaluate different design decisions, a system designer uses a model of the target system and does performance analysis for alternative application instances, architecture instances and mappings, thereby exploring the design space of the Application - Architecture pair. A general scheme for design space exploration is the Y-chart paradigm [4]. Tools like SPADE [9] and ORAS [6] implement techniques that support the Y-chart paradigm, but they focus only on the exploration of alternative architecture instances and mappings [8].

[Figure 1: Alternative instances of the application have to be generated, mapped onto the architecture template and explored in order to evaluate the performance of the Application-Architecture pair. Panels: Application Specification; Architecture Template (PE0, PE1, PE2, ..., PEn, Memory, Communication Structure); Instances of the Application (KPN_1 to KPN_5).]

In this paper, however, we focus on techniques that support efficient exploration of alternative application instances in system-level design. An application instance is any partitioning of an application into a composition of concurrent tasks. We use the Kahn Process Network (KPN) model of computation [3] to describe application instances. In the Kahn model, concurrent processes communicate via unbounded FIFO channels. In Figure 1, we show a simple application and a set of alternative KPN instances of this application (KPN 1 to KPN 5). Each application instance differs from the others in the degree of exploited task-level parallelism. The performance of the Application - Architecture pair can depend significantly on the application instance. So, a system designer needs support to generate and explore a set of instances of an application in order to evaluate the performance of the system and to choose an application partitioning that satisfies the requirements the target system has to meet.
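To make the Kahn model concrete, the following minimal sketch builds a three-process network (source, worker, sink) connected by unbounded FIFO channels. It is an illustration only, not part of the tools described in this paper: the process names, the `None` end-of-stream marker and the squaring function are assumptions chosen for the example, and Python threads with `queue.Queue` stand in for Kahn processes and channels.

```python
import queue
import threading


def source(out_ch, n):
    """Producer process: writes the tokens 1..n, then an end-of-stream marker."""
    for i in range(1, n + 1):
        out_ch.put(i)
    out_ch.put(None)  # hypothetical end-of-stream convention


def worker(in_ch, out_ch, f):
    """Transformer process: blocking read, apply f, write the result."""
    while True:
        token = in_ch.get()  # blocks until a token is available
        if token is None:
            out_ch.put(None)
            return
        out_ch.put(f(token))


def sink(in_ch, results):
    """Consumer process: collects tokens until the end-of-stream marker."""
    while True:
        token = in_ch.get()
        if token is None:
            return
        results.append(token)


def run_network(n):
    """Connect source -> worker -> sink and run the three processes concurrently."""
    a, b = queue.Queue(), queue.Queue()  # maxsize 0 means unbounded FIFO
    results = []
    procs = [
        threading.Thread(target=source, args=(a, n)),
        threading.Thread(target=worker, args=(a, b, lambda v: v * v)),
        threading.Thread(target=sink, args=(b, results)),
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return results
```

Because every process only performs blocking reads on its input channel, the collected result does not depend on how the threads are scheduled, which is the property that makes the KPN model attractive for describing application instances.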

In general, a system designer is only able to derive at most a few alternative application instances. This is because no systematic way is known to derive an application instance, let alone alternatives, from an application specification; as a result, heuristic and time-consuming approaches are taken in practice. Nevertheless, many instances of a single application exist that are worth deriving for exploration. We present in this paper algorithmic transformations that we have developed and implemented in order to help a system designer derive alternative application instances systematically and quickly. These transformations, together with an aggressive parallel compiler called COMPAAN, are encapsulated in an Application Transformation Layer that automatically generates a set of application instances. The transformations and the tools presented in this paper are not generally applicable, in the sense that the application specification has to be an affine nested loop program (NLP).

In the next section we show the position of the Application Transformation Layer in the Y-chart paradigm. In Section 3 two specific algorithmic transformations are given. The COMPAAN tool is briefly described in Section 4. In Section 5 we show how our algorithmic transformations are used in practice. In Section 6 we present a number of experiments and associated results. Finally, we discuss related work and draw conclusions in Section 7 and Section 8, respectively.

2. APPLICATION TRANSFORMATION LAYER

In this section, we discuss the application transformation layer in the context of the design space exploration process. We use this layer as an extension to the Y-chart environment [4].
The positioning of the transformation layer is shown in Figure 2. We start with an application specification written in an imperative language like Matlab or C and we have to generate and explore a set of instances (Kahn Process Networks) functionally equivalent to the application.

[Figure 2: The Y-chart extended with the Application Transformation Layer. Application Transformation Layer: Application in Matlab or C, Algorithmic Transformations (step 1), Intermediate Matlab or C code, Compiler (step 2), Process Networks. Y-chart Environment: Architecture Template, Mapping, Performance Analysis (step 3), Performance Numbers. Feedback: Initial Values of Parameters, New Values of Parameters (step 4).]

First, algorithmic transformations are applied to the application specification. The transformations are controlled by a set of parameters. At the beginning some initial values are assigned to the parameters depending on the available resources in the architecture template. With these values, the original code of the application is automatically transformed and structured in a particular way in order to make the parallelism that is inherently available in the application explicit or to enhance the task-level parallelism in the application. Second, the transformed code is converted automatically to a KPN description by an aggressive parallel compiler called COMPAAN. Third, we use a Y-chart environment to map the KPN onto an architecture template and do performance analysis. The result of this performance analysis can be used to change the values of the parameters (step 4 in Figure 2) if the system performance is not satisfactory. Then, we repeat the procedure described above, resulting in a design space exploration of alternative instances of the application. This is shown in Figure 2 as a feed-back arrow to the transformation layer.

Footnote: For lack of space we confine ourselves to only two such transformations. We have identified and implemented other transformations as well, e.g., plane-cutting, look-ahead, loop transformations. The approach and technique is uniform over all transformations.
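The four-step iteration described above can be summarized as a small control loop. The sketch below is only a schematic under stated assumptions: `generate_instance`, `evaluate`, `next_params` and `good_enough` are hypothetical callables standing in for the transformation layer plus compiler (steps 1 and 2), the mapping and performance analysis (step 3) and the parameter update (step 4); none of these names come from the tools described in this paper.

```python
def explore(initial_params, generate_instance, evaluate, next_params,
            good_enough, max_steps=10):
    """Iterate over alternative application instances.

    Returns the list of (parameters, performance number) pairs visited.
    All callables are supplied by the caller; this loop only fixes the
    order of the steps.
    """
    history = []
    params = initial_params
    for _ in range(max_steps):
        instance = generate_instance(params)   # steps 1 and 2: transform and compile
        score = evaluate(instance)             # step 3: map and analyze performance
        history.append((params, score))
        if good_enough(score):
            break
        params = next_params(params, score)    # step 4: choose new parameter values
    return history


def toy_cost(instance):
    """Made-up cost model for the usage example: work is divided over the
    processes, while each extra process adds a fixed overhead."""
    p = instance["processes"]
    return 1200 // p + 20 * p
```

A usage example with the toy cost model: `explore(1, lambda u: {"processes": u}, toy_cost, lambda u, c: u + 1, lambda c: c <= 400)` increases a single unfolding factor until the made-up cost drops to 400 or below. The cost numbers carry no meaning beyond exercising the loop.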
By changing the values of the parameters, the application transformation layer automatically generates a set of KPNs corresponding to a single application. The difference among the KPNs is the degree of task-level parallelism that is exploited. In the remainder of this paper we describe in more detail the techniques and tools we have developed and incorporated in the transformation layer.

3. ALGORITHMIC TRANSFORMATIONS

In this section, we present two algorithmic transformations, namely Unfolding and Skewing. These transformations take as input an affine nested loop program (NLP) [2] and a set of parameters. The output of the unfolding transformation is an affine nested loop program which is functionally equivalent to the input program but with enhanced task-level parallelism. The skewing transformation makes the potential parallelism in the input affine nested loop program explicit. We have developed and implemented these and other transformations in a tool box called MATTRANSFORM. The transformations in this tool box operate directly on the NLP source code without using an intermediate representation such as dependence graphs, signal-flow graphs or data-flow graphs corresponding to the NLP.

First, we explain what unfolding and skewing mean in the context of our algorithmic transformations. Next, we define the unfolding and skewing transformations as procedures that operate on an affine nested loop program. For convenience, in our further explanations, we assume that affine nested loop programs (NLPs) are expressed in Matlab code. The NLPs could also be expressed in other imperative programming languages like, for example, C.

3.1 Unfolding and Skewing

Consider the application program (NLP) and its dependence graph (DG) shown in Figure 3-a). The DG is a graphical representation of the NLP. The nodes in the DG represent the NLP functions that are executed in each loop iteration and the edges represent the data dependences between the functions. The NLP has two loops (with iterators i, j) which can be unrolled to yield the DG.
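As an illustration of how unrolling the two loops yields the DG, the sketch below enumerates nodes and edges for a 2-deep NLP. It assumes, based on the example figure, that the loop body is `[y(j), x(i)] = F(y(j), x(i))` with `i = 1:1:4` and `j = 1:1:3`; the function name `dependence_graph` is hypothetical and this is not how the tools in this paper represent programs (they operate on the source code directly).

```python
def dependence_graph(n, k):
    """Unroll `for i = 1:n, for j = 1:k, [y(j), x(i)] = F(y(j), x(i))`.

    Each iteration (i, j) is a node. y(j) was last written in iteration
    (i - 1, j) and x(i) in iteration (i, j - 1), so those iterations are
    the predecessors of (i, j).
    """
    nodes = [(i, j) for i in range(1, n + 1) for j in range(1, k + 1)]
    edges = []
    for (i, j) in nodes:
        if i > 1:
            edges.append(((i - 1, j), (i, j), f"y({j})"))
        if j > 1:
            edges.append(((i, j - 1), (i, j), f"x({i})"))
    return nodes, edges
```

For the assumed 4 by 3 iteration space this gives 12 nodes, 9 edges carrying a `y` value and 8 edges carrying an `x` value.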
Unlike common approaches, in which either the loop control is removed through loop unrolling [10] or the DG is folded [11], our new approach to get the desired degree of parallelism - at the task level - is to copy a loop body a number of times in such a way that these copies are mutually exclusive. We call this new approach unfolding and we have implemented it in our unfolding transformation. An example of our unfolding is shown in Figure 3-b), where the i-loop of the program in Figure 3-a) is unfolded by a factor of 2. The two pieces of code bounded by the if statements in Figure 3-b) are mutually exclusive. This mutual exclusiveness can be exploited by an aggressive parallel compiler to partition the program in Figure 3-b) into two processes (tasks) that can operate in parallel. The graphical interpretation of the unfolding transformation is given by the dependence graph in Figure 3-b). For this simple example the unfolding transformation partitions the computational workload over two parallel processes. The first process will execute the nodes bounded by the dashed boxes. The second process will execute the nodes bounded by the solid boxes. An example of the network connecting these two processes is shown in Figure 7 - see KPN 1. In general, our unfolding transformation is used to partition an NLP into as many processes as the unfolding factor. The process network corresponding to a fully unfolded NLP is equal to the dependence graph of this NLP.

[Figure 3: Simple example illustrating the unfolding and skewing transformations. a) Application program (NLP) and its dependence graph (loops for i = 1:1:4, for j = 1:1:3); b) NLP with unfolded loop by factor 2 (guards if (i mod 2) = 1 and if (i mod 2) = 0); c) NLP with skewed loop (for i = 2:1:4+3, for j = max(1, i-4):1:min(i-1, 3), [y(j), x(i-j)] = F(y(j), x(i-j))).]

Now, consider the same application program (NLP) shown in Figure 3-a). The skewing transformation creates a new NLP in which the bounds of the loops and the indexes of the variables are changed in a particular way to make the potential parallelism in the original NLP explicit.
For example, skewing the i-loop of the program in Figure 3-a) leads to the NLP in Figure 3-c). The effect of our skewing transformation is visualized by the dependence graph (DG) in Figure 3-c). This DG explicitly shows that the nodes inside a dashed box can be executed in parallel because there are no data dependences between these nodes. This property can be exploited by an aggressive parallel compiler in combination with the unfolding described above to partition the program into processes (tasks) that run in parallel. An example of a network of such parallel processes corresponding to the NLP in Figure 3-c) is given in Figure 8 - see KPN 4. Moreover, inside these processes some pieces of code can be executed in parallel or in a pipeline fashion because of the skewing transformation. Note that in both cases (unfolding and skewing), the transformations proceed along the NLP code in Figure 3. The dependence graphs are only shown to visualize the effect of the transformations.

     1  UNFOLD(NLP, U, I)
     2  {
     3    if (I is empty set)
     4    {
     5      print(NLP);
     6      return();
     7    }
     8    else
     9    {
    10      a = first element of the set I;
    11      b = first element of the set U;
    12
    13      loop = take the code from the beginning of NLP
    14             till the "for" statement with loop iterator a,
    15             including;
    16      body = take the body of loop a from NLP;
    17
    18      print(loop);
    19
    20      for (k = 1; k <= b; k++)
    21      {
    22        println("if(" + a + "mod" + b + ")=" + b-k + ",");
    23
    24        U' = the set U without the first element;
    25        I' = the set I without the first element;
    26
    27        UNFOLD(body, U', I');
    28        println("end");
    29      }
    30      println("end");
    31      return();
    32    }
    33  }

Figure 4: Pseudo code describing the UNFOLD transformation.

3.2 Unfolding procedure

Let NLP be an N-deep affine nested loop program with an iteration vector $I = (i_1, i_2, \ldots, i_N)$. For each $i_k \in I$, $k = 1, 2, \ldots, N$, an integer parameter $u_k$ is associated. All these parameters form a parameter vector $U = (u_1, u_2, \ldots, u_N)$, which we call the unfolding vector. We define a transformation UNFOLD(NLP,U,I), which is described in Figure 4. The pseudo code in Figure 4 describes the unfolding transformation as a recursive procedure. This procedure operates on the affine nested loop program NLP with its iteration vector $I$ and the value of the unfolding vector $U$.
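The recursive unfolding procedure can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the MATTRANSFORM implementation: an NLP is represented here as a hypothetical nested tuple `(for_header, body)`, where `body` is either another such tuple or a list of statement strings, and the caller is trusted to list the iterators in `I` in nesting order. The guards are emitted in the `if (i mod u) = r,` notation used in the figures.

```python
def emit(nlp, depth=0):
    """Print a loop nest (or a list of statements) unchanged, as text lines."""
    pad = "  " * depth
    if isinstance(nlp, list):
        return [pad + stmt for stmt in nlp]
    header, body = nlp
    return [pad + header] + emit(body, depth + 1) + [pad + "end"]


def unfold(nlp, U, I, depth=0):
    """Return the unfolded program as a list of text lines.

    For the first iterator a with factor b, the loop body is copied b times;
    copy k is guarded by `if (a mod b) = b - k,`, so the copies are mutually
    exclusive. The procedure then recurses into each copy with the remaining
    iterators and factors.
    """
    if not I:                      # no iterators left: emit the code as is
        return emit(nlp, depth)
    a, b = I[0], U[0]
    header, body = nlp
    pad = "  " * depth
    lines = [pad + header]
    for k in range(1, b + 1):
        lines.append(f"{pad}  if ({a} mod {b}) = {b - k},")
        lines.extend(unfold(body, U[1:], I[1:], depth + 2))
        lines.append(f"{pad}  end")
    lines.append(pad + "end")
    return lines
```

For example, `unfold(("for i = 1:1:N,", ["[x(i)] = F(x(i));"]), [10], ["i"])` produces one loop header followed by ten guarded copies of the (hypothetical) statement. Note that a factor of 1 yields a single guard `if (a mod 1) = 0,`, which is always true; a production tool might omit it.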
In order to explain the behavior of the procedure UNFOLD we consider the following simple example. Let NLP be the program shown in the left part of Figure 5. NLP has only one loop with an iterator (index) $i$. Hence, the iteration vector $I$ corresponding to NLP has only one element, $I = (i)$, and the unfolding vector $U$ also has one element, $U = (u)$. In our example the parameter $u$ is equal to 10. Following the procedure UNFOLD, first we check whether $I$ is an empty set. In our example we start with $I = (i)$, which is not an empty set. Then, we initialize four variables, see lines 10, 11, 13 and 16 in Figure 4. As a result we have: variable a takes the character i; variable b = 10; variable loop takes the string "for i = 1:1:N," and body takes the code in the body of the loop with iterator i. This code is marked in Figure 5 as a rectangle. Line 18 in Figure 4 prints the variable loop to the output. The result is shown in Figure 5 - the first line in the unfolded NLP. Executing lines 20 till 32 in Figure 4 generates the rest of the code of the unfolded NLP in Figure 5. As a result the unfolded NLP in Figure 5 has ten copies of the loop body bounded by if statements with a mod statement making them mutually exclusive. The example in Figure 5 shows that the input NLP is transformed to a functionally equivalent NLP which we call an unfolded NLP. The unfolded NLP can be easily converted into ten tasks that operate in parallel. That is why we say that the unfolded NLP has enhanced task-level parallelism compared with the input NLP.

[Figure 5: Simple example illustrating the UNFOLD() transformation shown in Figure 4. Left: application program (NLP), for i = 1:1:N, with U = {10}, I = {i}. Right: unfolded NLP produced by UNFOLD(NLP, U, I), with guards if (i mod 10) = 9, if (i mod 10) = 8, ..., if (i mod 10) = 0.]

3.3 Skewing procedure

Let NLP be an N-deep affine nested loop program with an iteration vector $I = (i_1, i_2, \ldots, i_N)$. For each $i_k \in I$, $k = 1, 2, \ldots, N$, a parameter vector $m_k = (m_{k1}, m_{k2}, \ldots, m_{kN})$ is associated, where each $m_{kl}$, $l = 1, 2, \ldots, N$, is an integer. All parameter vectors form a parameter matrix $M = (m_1, m_2, \ldots, m_N)$, which we call the skewing matrix. We require $M$ to be unimodular. We define a transformation SKEW(NLP,M) as described below:

- STEP1 - Represent the iteration space of NLP as a polytope $P = \{ I \in \mathbb{Z}^N \mid A \cdot I \ge b \}$, where $A$ is an integral matrix and $b$ is an integral vector;
- STEP2 - Use the skewing matrix $M$ to transform $P$ as follows: $A \cdot M^{-1} \cdot M \cdot I \ge b \Rightarrow A' \cdot I' \ge b$, where $A' = A \cdot M^{-1}$ and $I' = M \cdot I$;
- STEP3 - Use the Fourier-Motzkin (FM) procedure [1] to represent the iteration space, described by $A' \cdot I' \ge b$, in terms of nested loops. This is the new iteration space of NLP with iteration vector $I'$;
- STEP4 - Change all indexes of the variables in NLP according to the equation $I = M^{-1} \cdot I'$.

The four steps described above are illustrated in Figure 6 in the context of a simple example. We start with a 2-deep affine nested loop program and a skewing matrix $M$. In STEP1, the ranges of the loop indexes $i$ and $j$ are represented as a system of linear inequalities $A \cdot I \ge b$.
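The effect of the four steps on the 2-deep example can be checked numerically. The sketch below is illustrative only. It assumes original loop bounds `i = 1:1:N`, `j = 1:1:K`, the loop body `[y(j), x(i)] = F(y(j), x(i))`, and the skewing matrix M = [[1, 1], [0, 1]], all read off the example figures; the function names are hypothetical. It verifies that the skewed loop nest visits exactly the image of the original iteration space, that the index substitution of STEP4 maps it back, and that iterations sharing the same new outer index are independent.

```python
def original_space(n, k):
    """Iterations (i, j) of the original nest: i = 1..n, j = 1..k."""
    return [(i, j) for i in range(1, n + 1) for j in range(1, k + 1)]


def skew(point, m):
    """I' = M * I for a 2x2 integer matrix m given as nested lists."""
    i, j = point
    return (m[0][0] * i + m[0][1] * j, m[1][0] * i + m[1][1] * j)


def unskew(point):
    """I = M^-1 * I' for M = [[1, 1], [0, 1]]: i = i' - j', j = j'."""
    ip, jp = point
    return (ip - jp, jp)


def skewed_space(n, k):
    """Loop nest after STEP3: i' = 2..n+k, j' = max(1, i'-n)..min(i'-1, k)."""
    return [(ip, jp)
            for ip in range(2, n + k + 1)
            for jp in range(max(1, ip - n), min(ip - 1, k) + 1)]


def dependences(n, k):
    """Edges (source, target) of the assumed body [y(j), x(i)] = F(y(j), x(i))."""
    edges = []
    for (i, j) in original_space(n, k):
        if i > 1:
            edges.append(((i - 1, j), (i, j)))  # y(j) from the previous i
        if j > 1:
            edges.append(((i, j - 1), (i, j)))  # x(i) from the previous j
    return edges


def wavefront_sizes(n, k):
    """Number of mutually independent iterations for each value of i'."""
    sizes = {}
    for (ip, _) in skewed_space(n, k):
        sizes[ip] = sizes.get(ip, 0) + 1
    return [sizes[ip] for ip in sorted(sizes)]
```

Every dependence goes from a smaller to a strictly larger value of i', which is why all iterations with the same i' can run in parallel once the loop is skewed.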
Next, we use the skewing matrix $M$ to do the mathematical manipulations described in STEP2. As a result we have a new iteration space for the input NLP, defined by the loop indexes $i'$ and $j'$ and bounded by the system $A' \cdot I' \ge b$. The Fourier-Motzkin (FM) procedure is used to represent the new iteration space as nested loops, as shown in Figure 6 - STEP3. After this step all variables inside the loops are still indexed by the old indexes $i$ and $j$. We have to replace them with the new indexes $i'$ and $j'$. In order to do this we know from STEP2 that $I' = M \cdot I$, which implies that $I = M^{-1} \cdot I'$. So, we have to replace index $i$ with $i' - j'$ and index $j$ with $j'$ in all variables. This is illustrated in Figure 6 - STEP4.

[Figure 6: Simple example illustrating the four steps in the SKEW(NLP,M) procedure. Application program (NLP): for i = 1:1:N, for j = 1:1:K. Skewed NLP: for i = 2:1:N+K, for j = max(1, i-N):1:min(i-1, K), [y(j), x(i-j)] = F(y(j), x(i-j)).]

4. COMPILER

In this section, we briefly describe our aggressive parallel compiler COMPAAN, which exploits the result of the transformations presented in Section 3. COMPAAN (Compilation of Matlab to Process Networks) [7] is a method and tool set (MATPARSER, DGPARSER, PANDA) for transforming affine nested loop programs (NLP) [2] written in Matlab into a Kahn Process Network (KPN) specification. COMPAAN starts the transformation by converting a Matlab specification into a single assignment code (SAC) specification. SAC describes all parallelism available in the original Matlab specification. The tool which does the Matlab-to-SAC transformation is MATPARSER [5]. MATPARSER is an array dataflow analysis compiler that finds all parallelism available in NLPs written in Matlab using a very aggressive data-dependency analysis technique. This technique is based on parametric integer linear programming. Also, MATPARSER can handle non-linear operators like Max, Min, Ceil, Floor, Mod and Div. Therefore, it can handle the result of the skewing and unfolding transformations presented in Section 3.
Next, a tool called DGPARSER [2] converts the SAC description into a Polyhedral Reduced Dependence Graph (PRDG) [7] description. The PRDG is a compact graphical representation of the SAC using parameterized polyhedral embeddings of the atomic functions. Finally, the PANDA tool [7] uses the PRDG description in order to generate the Kahn Process Network description and the individual processes.

[Figure 7: An example of generating two possible Kahn Process Networks from a single application using the unfolding transformation and the COMPAAN tool. Unfold(U) with U = [u1, u2] = [2,1] yields KPN_1 (two processes); Unfold(U) with U = [u1, u2] = [2,2] yields KPN_2 (four processes, P1 to P4).]

[Figure 8: An example of generating two possible Kahn Process Networks from a single application using the skewing and unfolding transformations and the COMPAAN tool. Skew(M) with M = [m11 m12; m21 m22] = [1 1; 0 1] yields KPN_3; Skew(M) followed by Unfold(U) with U = [u1, u2] = [2,1] yields KPN_4.]

5. EXAMPLES

In this section, we demonstrate the use of our algorithmic transformations in combination with the COMPAAN tool set. We show how, merely by changing the values of the parameters, a set of Kahn Process Networks (KPN) can be easily generated from a single application. Consider the application shown in the top-left corner of Figure 7. It is a 2-deep affine nested loop program written in Matlab. In Figure 7 we first apply the unfolding transformation to our application and then we use COMPAAN to convert the transformed code into a KPN description. We assign two different values to the parameter vector $U$, namely $U = [2,1]$ and $U = [2,2]$. As a result we obtain two different KPNs. They have different numbers of processes and different communication structures (see Figure 7 - KPN 1 and KPN 2). In Figure 8, we show another example in which we use the same application as in Figure 7.
We obtain KPN 3, which has only one process, by applying the skewing transformation with the skewing matrix $M = [1\ 1;\ 0\ 1]$. Also, we show that the skewing transformation and the unfolding transformation can be applied in combination. KPN 4 in Figure 8 is derived by applying first the skewing transformation and then the unfolding transformation with $U = [2,1]$.

6. EXPERIMENTS AND RESULTS

In this section, we present some of the experiments we have done in order to evaluate and show the usefulness of the algorithmic transformation techniques presented in this paper. We built a Y-chart environment extended with the Application Transformation Layer as shown in Figure 2. As an input application for the transformation layer we used the QR-decomposition algorithm [12] because it is a common computationally intensive task in many signal processing applications like Digital Beamforming, Adaptive Digital Filtering etc. The algorithm was written in Matlab. The application transformation layer applies the Unfolding and Skewing transformations to the QR algorithm and generates alternative application instances - Process Networks - as synthesizable VHDL. We mapped these instances onto a Xilinx XCV1000E FPGA device, which was the architecture template for our experiments. The mapping was done by a synthesizer and place-and-route tools provided by Xilinx. The performance analysis was done using the timing analysis and simulation tools from the Xilinx Foundation package.

Figure 9 shows the estimated total execution time for three application instances of the QR-decomposition algorithm. These instances were derived automatically by applying the transformation techniques presented in Section 3. The results show that the effect of applying our transformations is that we can generate alternative application instances with different performance when mapping them onto an architecture template (in our case an FPGA).

[Figure 9: Execution time of the QR algorithm transformed by using the unfolding and skewing transformations. The unfolding factor is 3 and the size of the input data matrix is 0 by 6. Bars: Skewing + Unfolding, Unfolding, No transform. Axis: Time (micro seconds).]
It can be seen from Figure 9 that the unfolding and skewing transformations significantly improve the performance. Figure 10 shows the results obtained from the exploration of the performance of ten application instances of the QR algorithm derived by applying only the unfolding transformation with unfolding factors from 1 to 10. Again, the results show that the performance can be significantly improved. In this experiment we also measured how much time it takes to obtain the results presented in Figure 10. The time taken for these ten experiments to be processed automatically from Matlab to a hardware mapping onto an FPGA and VHDL simulation was within 18 hours. Table 1 shows the processing times for some of the experiments in more detail. The second row, Transform+Compile, shows the processing times for our tools MATTRANSFORM and COMPAAN - step 1 and step 2 in Figure 2. The row Mapping+Simulation gives the time needed to express the Process Networks in terms of synthesizable VHDL code, to map this VHDL code onto an FPGA and finally to obtain performance numbers from VHDL simulation - step 3 in Figure 2.

[Figure 10: Exploration of the performance of the QR algorithm unfolded by factors from 1 to 10. The size of the input data matrix is 48 by 6. Axes: number of cycles versus unfolding factor.]

Table 1: Processing Times (hh:mm:ss).

                        Unfold 2    Unfold 5    Unfold 10
    Transform+Compile   00:00:08    00:00:18    00:00:29
    Mapping+Simulation  00:22:54    01:24:44    04:47:30
    Total               00:23:02    01:25:02    04:47:59

The last row of Table 1 suggests that an extensive design space exploration of alternative application instances can be done in a relatively short amount of time. Moreover, the accuracy of the results obtained during the exploration is within 5%, because we did very detailed cycle-accurate VHDL simulation. The results given in the second row of Table 1 show that the application transformation layer presented in Section 2 generates alternative application instances from a given application very quickly. The time to do this is only a few seconds, whereas the time to map the instances onto an FPGA and simulate them varies from minutes to hours - see row 3 of Table 1. However, there is a potential to improve the mapping and simulation time (row 3 of Table 1) by using system-level design space exploration tools like SPADE [9] and ORAS [6]. Preliminary results indicate that the mapping and simulation time can be reduced to a few minutes instead of several hours while still obtaining performance numbers with reasonable accuracy.

7. RELATED WORK

The Unfolding and Skewing transformations presented in this paper are related to the unfolding and retiming transformation techniques used in the Signal-Processing community [11]. Also, they are related to the loop unrolling and loop skewing techniques used in compiler design [10]. However, there are some important differences. First, we use our transformations for generating a set of Kahn Process Networks corresponding to an application (nested loop program), thereby generating alternative application instances. Using the Unfolding transformation to generate Process Networks we do reverse partitioning compared to [13]. We start by putting all computational workload in one process and by unfolding we partition the workload over more processes. Second, we developed procedures to do these transformations on the algorithmic (source code) level, whereas in [11] similar transformations are applied to signal-flow graphs, data-flow graphs or dependence graphs corresponding to an algorithm. Third, our transformations aim at exposing and exploiting the task-level parallelism available in an application, whereas the transformations in [10] aim at exploiting the fine-grain instruction-level parallelism.

8. CONCLUSIONS

In this paper, we presented algorithmic transformation techniques for deriving a set of application instances (Kahn Process Networks) corresponding to an application. These techniques support a system designer in exploring alternative instances of an application mapped onto an architecture template. We have implemented our techniques in the tools MATTRANSFORM and COMPAAN, which means that the process of deriving alternative instances is fully automated for applications described as affine nested loop programs. Therefore, the presented techniques help a system designer to significantly speed up the process of exploring alternative application instances in system-level design. Our experiments and results show that an extensive design space exploration of alternative application instances can be done in a relatively short amount of time with accuracy of the results within 5%.

9. REFERENCES

[1] C. Ancourt and F. Irigoin. Scanning polyhedra with DO loops. In Proc. ACM SIGPLAN '91, pages 39-50, June 1991.
[2] P. Held. Functional Design of Data-Flow Networks, 1996. PhD thesis, Delft University of Technology, The Netherlands.
[3] G. Kahn. The semantics of a simple language for parallel programming. In Proc. of the IFIP Congress 74. North-Holland Publishing Co., 1974.
[4] B. Kienhuis. Design Space Exploration of Stream-based Dataflow Architectures: Methods and Tools, Jan. PhD thesis, Delft University of Technology, The Netherlands.
[5] B. Kienhuis. MatParser: An array dataflow analysis compiler. Technical report, University of California at Berkeley, UCB/ERL M00/9.
[6] B. Kienhuis, E. Deprettere, K. Vissers, and P. van der Wolf. The Construction of a Retargetable Simulator for an Architecture Template. In Proc. 6th Int. Workshop on Hardware/Software Codesign (CODES 98), Seattle, Washington, Mar. 1998.
[7] B. Kienhuis, E. Rijpkema, and E. F. Deprettere. Compaan: Deriving Process Networks from Matlab for Embedded Signal Processing Architectures. In Proc. 8th International Workshop on Hardware/Software Codesign (CODES 2000), San Diego, CA, USA, May 2000.
[8] P. Lieverse, T. Stefanov, P. van der Wolf, and E. Deprettere. System Level Design with SPADE: an M-JPEG Case Study. In Proc. Int. Conference on Computer Aided Design (ICCAD 01), pages 31-38, San Jose CA, USA, Nov. 2001.
[9] P. Lieverse, P. van der Wolf, K. Vissers, and E. Deprettere. A Methodology for Architecture Exploration of Heterogeneous Signal Processing Systems. Int. Journal of VLSI Signal Processing for Signal, Image and Video Technology, 29(3):197-207, 2001.
[10] S. Muchnick. Advanced Compiler Design and Implementation. Morgan Kaufmann Publishers, Inc., 1997.
[11] K. Parhi. VLSI Digital Signal Processing Systems: Design and Implementation. John Wiley & Sons, Inc., 1999.
[12] J. Proakis, C. Rader, F. Ling, C. Nikias, M. Moonen, and I. Proudler. Algorithms for Statistical Signal Processing. Prentice Hall, Inc.
[13] J. Teich and L. Thiele. Exact Partitioning of Affine Dependence Algorithms. Lecture Notes in Computer Science (LNCS), Springer, 2268:33 5, 2002.

Ptolemy II in Embedded Signal Processing Architectures: Deriving Process Networks From Matlab

Ptolemy II in Embedded Signal Processing Architectures: Deriving Process Networks From Matlab Ptolemy II n Embedded Sgnal Processng Archtectures: Dervng Process Networs From Matlab Bart Kenhus and Ed Deprettere Leden Insttute of Advanced omputer Scence (LIAS) Leden Unversty, The Netherlands Ptolemy

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach Angle Estmaton and Correcton of Hand Wrtten, Textual and Large areas of Non-Textual Document Images: A Novel Approach D.R.Ramesh Babu Pyush M Kumat Mahesh D Dhannawat PES Insttute of Technology Research

More information

Comparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments

Comparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments Comparson of Heurstcs for Schedulng Independent Tasks on Heterogeneous Dstrbuted Envronments Hesam Izakan¹, Ath Abraham², Senor Member, IEEE, Václav Snášel³ ¹ Islamc Azad Unversty, Ramsar Branch, Ramsar,

More information

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1)

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1) Secton 1.2 Subsets and the Boolean operatons on sets If every element of the set A s an element of the set B, we say that A s a subset of B, or that A s contaned n B, or that B contans A, and we wrte A

More information

Simulation Based Analysis of FAST TCP using OMNET++

Simulation Based Analysis of FAST TCP using OMNET++ Smulaton Based Analyss of FAST TCP usng OMNET++ Umar ul Hassan 04030038@lums.edu.pk Md Term Report CS678 Topcs n Internet Research Sprng, 2006 Introducton Internet traffc s doublng roughly every 3 months

More information

A fault tree analysis strategy using binary decision diagrams

A fault tree analysis strategy using binary decision diagrams Loughborough Unversty Insttutonal Repostory A fault tree analyss strategy usng bnary decson dagrams Ths tem was submtted to Loughborough Unversty's Insttutonal Repostory by the/an author. Addtonal Informaton:

More information

Load Balancing for Hex-Cell Interconnection Network

Load Balancing for Hex-Cell Interconnection Network Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,

More information

Configuration Management in Multi-Context Reconfigurable Systems for Simultaneous Performance and Power Optimizations*

Configuration Management in Multi-Context Reconfigurable Systems for Simultaneous Performance and Power Optimizations* Confguraton Management n Mult-Context Reconfgurable Systems for Smultaneous Performance and Power Optmzatons* Rafael Maestre, Mlagros Fernandez Departamento de Arqutectura de Computadores y Automátca Unversdad

More information

Multigranular Simulation of Heterogeneous Embedded Systems

Multigranular Simulation of Heterogeneous Embedded Systems Multgranular Smulaton of Heterogeneous Embedded Systems Adtya Agrawal Insttute for Software Integrated Systems Vanderblt Unversty Nashvlle, TN - 37235 1 615 343 7567 adtya.agrawal@vanderblt.edu Akos Ledecz

More information

Lecture 15: Memory Hierarchy Optimizations. I. Caches: A Quick Review II. Iteration Space & Loop Transformations III.

Lecture 15: Memory Hierarchy Optimizations. I. Caches: A Quick Review II. Iteration Space & Loop Transformations III. Lecture 15: Memory Herarchy Optmzatons I. Caches: A Quck Revew II. Iteraton Space & Loop Transformatons III. Types of Reuse ALSU 7.4.2-7.4.3, 11.2-11.5.1 15-745: Memory Herarchy Optmzatons Phllp B. Gbbons

More information

Petri Net Based Software Dependability Engineering

Petri Net Based Software Dependability Engineering Proc. RELECTRONIC 95, Budapest, pp. 181-186; October 1995 Petr Net Based Software Dependablty Engneerng Monka Hener Brandenburg Unversty of Technology Cottbus Computer Scence Insttute Postbox 101344 D-03013

More information

Model Integrated Computing: A Framework for Creating Domain Specific Design Environments

Model Integrated Computing: A Framework for Creating Domain Specific Design Environments Model Integrated Computng: A Framework for Creatng Doman Specfc Desgn Envronments James R. DAVIS Vanderblt Unversty, Insttute for Software Integrated Systems Nashvlle, TN 37203, USA ABSTRACT Model Integrated

More information

An Image Fusion Approach Based on Segmentation Region

An Image Fusion Approach Based on Segmentation Region Rong Wang, L-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu An Image Fuson Approach Based On Segmentaton Regon An Image Fuson Approach Based on Segmentaton Regon Rong Wang, L-Qun Gao, Shu Yang 3, Yu-Hua

More information

GSLM Operations Research II Fall 13/14

GSLM Operations Research II Fall 13/14 GSLM 58 Operatons Research II Fall /4 6. Separable Programmng Consder a general NLP mn f(x) s.t. g j (x) b j j =. m. Defnton 6.. The NLP s a separable program f ts objectve functon and all constrants are

More information

News. Recap: While Loop Example. Reading. Recap: Do Loop Example. Recap: For Loop Example

News. Recap: While Loop Example. Reading. Recap: Do Loop Example. Recap: For Loop Example Unversty of Brtsh Columba CPSC, Intro to Computaton Jan-Apr Tamara Munzner News Assgnment correctons to ASCIIArtste.java posted defntely read WebCT bboards Arrays Lecture, Tue Feb based on sldes by Kurt

More information

Vectorization of Image Outlines Using Rational Spline and Genetic Algorithm

Vectorization of Image Outlines Using Rational Spline and Genetic Algorithm 01 Internatonal Conference on Image, Vson and Computng (ICIVC 01) IPCSIT vol. 50 (01) (01) IACSIT Press, Sngapore DOI: 10.776/IPCSIT.01.V50.4 Vectorzaton of Image Outlnes Usng Ratonal Splne and Genetc

More information

Edge Detection in Noisy Images Using the Support Vector Machines

Edge Detection in Noisy Images Using the Support Vector Machines Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona

More information

Assembler. Shimon Schocken. Spring Elements of Computing Systems 1 Assembler (Ch. 6) Compiler. abstract interface.

Assembler. Shimon Schocken. Spring Elements of Computing Systems 1 Assembler (Ch. 6) Compiler. abstract interface. IDC Herzlya Shmon Schocken Assembler Shmon Schocken Sprng 2005 Elements of Computng Systems 1 Assembler (Ch. 6) Where we are at: Human Thought Abstract desgn Chapters 9, 12 abstract nterface H.L. Language

More information

Array transposition in CUDA shared memory

Array transposition in CUDA shared memory Array transposton n CUDA shared memory Mke Gles February 19, 2014 Abstract Ths short note s nspred by some code wrtten by Jeremy Appleyard for the transposton of data through shared memory. I had some

More information

Machine Learning: Algorithms and Applications

Machine Learning: Algorithms and Applications 14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of

More information

Communication-Minimal Partitioning and Data Alignment for Af"ne Nested Loops

Communication-Minimal Partitioning and Data Alignment for Afne Nested Loops Communcaton-Mnmal Parttonng and Data Algnment for Af"ne Nested Loops HYUK-JAE LEE 1 AND JOSÉ A. B. FORTES 2 1 Department of Computer Scence, Lousana Tech Unversty, Ruston, LA 71272, USA 2 School of Electrcal

More information

Meta-heuristics for Multidimensional Knapsack Problems

Meta-heuristics for Multidimensional Knapsack Problems 2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,

More information

Performance Study of Parallel Programming on Cloud Computing Environments Using MapReduce

Performance Study of Parallel Programming on Cloud Computing Environments Using MapReduce Performance Study of Parallel Programmng on Cloud Computng Envronments Usng MapReduce Wen-Chung Shh, Shan-Shyong Tseng Department of Informaton Scence and Applcatons Asa Unversty Tachung, 41354, Tawan

More information

Sorting Review. Sorting. Comparison Sorting. CSE 680 Prof. Roger Crawfis. Assumptions

Sorting Review. Sorting. Comparison Sorting. CSE 680 Prof. Roger Crawfis. Assumptions Sortng Revew Introducton to Algorthms Qucksort CSE 680 Prof. Roger Crawfs Inserton Sort T(n) = Θ(n 2 ) In-place Merge Sort T(n) = Θ(n lg(n)) Not n-place Selecton Sort (from homework) T(n) = Θ(n 2 ) In-place

More information

A New Approach For the Ranking of Fuzzy Sets With Different Heights

A New Approach For the Ranking of Fuzzy Sets With Different Heights New pproach For the ankng of Fuzzy Sets Wth Dfferent Heghts Pushpnder Sngh School of Mathematcs Computer pplcatons Thapar Unversty, Patala-7 00 Inda pushpndersnl@gmalcom STCT ankng of fuzzy sets plays

More information

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and

More information

SENSITIVITY ANALYSIS IN LINEAR PROGRAMMING USING A CALCULATOR

SENSITIVITY ANALYSIS IN LINEAR PROGRAMMING USING A CALCULATOR SENSITIVITY ANALYSIS IN LINEAR PROGRAMMING USING A CALCULATOR Judth Aronow Rchard Jarvnen Independent Consultant Dept of Math/Stat 559 Frost Wnona State Unversty Beaumont, TX 7776 Wnona, MN 55987 aronowju@hal.lamar.edu

More information

Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information
