WCET-Directed Dynamic Scratchpad Memory Allocation of Data


Jean-François Deverge and Isabelle Puaut
Université Européenne de Bretagne / IRISA, Rennes, France

Abstract

Many embedded systems feature processors coupled with a small and fast scratchpad memory. Unlike caches, allocation of data to scratchpad memory must be handled by software. The major gain is enhanced predictability of memory access latencies. A compile-time dynamic allocation approach enables eviction and placement of data in the scratchpad memory at runtime. Previous dynamic scratchpad memory allocation approaches aimed to reduce average-case program execution time or the energy consumption due to memory accesses. For real-time systems, worst-case execution time is the main metric to optimize. In this paper, we propose a WCET-directed algorithm to dynamically allocate static data and stack data of a program to scratchpad memory. The granularity of placement of memory transfers (e.g. on function or basic block boundaries) is discussed from the perspective of its computational complexity and the quality of allocation.

1. Introduction

The worst-case execution time (WCET) of a program is the maximum time this program may take to execute on a specific hardware platform [14, 25, 28]. Knowing a program's WCET is of prime importance for hard real-time systems, to guarantee that computations will complete before their deadline. Direct-addressed scratchpad memories are being used as an alternative to processor caches as they consume less area and less power. Approaches for static [2] and dynamic [15, 26, 27] allocation have been designed to automatically place code and data in scratchpad memories. So far, many studies have addressed allocation of code and data to scratchpad memory for average execution time [2] or energy reduction [27]. A study [28] has demonstrated the superiority of scratchpad memory placement over some cache modeling techniques for execution time predictability of hard real-time systems.
Recently, algorithms for static data allocation to scratchpad memories [25] and for dynamic code allocation [21] have been specially designed for WCET optimization. As far as we know, no dynamic scratchpad memory data allocation methods for WCET optimization have been proposed. In this paper, we present an approach to allocate program data to scratchpad memory for WCET reduction. Our approach determines at compile-time the possible program locations where data will be transferred to and from the scratchpad memory at runtime, using a two-step method. First, memory accesses to data along the worst-case execution path of the program are analyzed. Second, a 0/1 integer linear program (ILP) problem is formulated to select the data for dynamic scratchpad memory allocation. However, the worst-case execution path of the program may change after a data allocation. Consequently, the ILP problem is greedily refined to compute a WCET-directed allocation.

The two steps of the method are described in Sections 2 and 3. In Section 2, we propose a compiler technique to determine the potential targets of any data memory access of a program. This information is used to estimate the profit of a data allocation. Section 3 describes the approach for dynamic scratchpad memory allocation. Section 4 provides some results and studies the performance improvements of our proposal over previous scratchpad memory allocation techniques. Section 5 overviews related work, while Section 6 describes future work and concludes.

2. Determination of load-store instructions targets

In many programs, a large number of data accesses are dynamic: the target address of a load-store instruction may change for each execution. Table 1 gives a two-dimensional classification of data storage and load-store accesses from [18]. The storage type defines the location of a given data. Static data, stack data and heap data are respectively stored in the global, stack and heap sections of the program memory space layout.
Literals are compiler-generated constants stored in the code section; these data are used to reduce the size of the program code.

Storage type     Description
Static           Global and static structures.
Stack            Function stack frame, spilled temporaries and stack allocated structures.
Heap             Dynamically allocated structures on the heap.
Literals         Constants stored in program code section.

Access type      Explanation
Scalar           Only one element.
Regular          Array accessed by regular, stride accesses.
Irregular        Non-regular but still input data independent.
Input dependent  Reference directly depends on input data.

Table 1. Data structure classification based on storage type (upper table) and access type (lower table) [18].

The access type defines the way a data is accessed. Scalar access types are accesses to a unique data address for statics, and to an address relative to the stack frame base address for stack data. The access type of a load-store instruction is regular if this instruction accesses multiple elements of a unique array with a constant stride (classified as linear address sequence accesses in []). Irregular accesses include accesses to (possibly multiple) data through pointers and are still independent of the input data. Lastly, input dependent accesses include any accesses with addresses computed at runtime from unknown input data (mentioned as indirect address sequence accesses in []). In the next section, we motivate the need for a method to analyze any data memory access of programs.

2.1. Quantitative study of data memory accesses by types

Table 2 gives the impact of data by access types and storage types on the worst-case execution path of programs. Benchmark programs are individually described later in Section 4; these programs do not access heap data. Programs are compiled for the StrongARM-1 [22] with loop-related optimizations (loop unrolling, etc.) disabled. WCET analyses of programs are performed with the Heptane WCET timing analyser [6].

Columns: Benchmark; Static, scal. or reg.; Static, irreg. or input dep.; Stack, scal. or reg.; Stack, irreg. or input dep.; Literals
Adpcm      17.%  6.%  9.1%  -  13.%
Engine     16.9%  %  3.%  12.9%
G721       %  8.6%  39.1%  %
Histogram  99.9%  -  0.1%  -  -
Lpc        96.1%  -  0.5%  -  3.4%
Pocsag     62.4%  .4%  13.9%  %
Spectral   31.7%  24.5%  37.8%  -  6.1%
Statemate  6.4%  -  6.1%  %

Table 2. Impact ratio of load-store instructions by storage types and access types.

Table 2 presents the ratio of accesses to static data, stack data and literals along the worst-case execution path. To illustrate the partition between accesses with or without pointers, we have respectively merged results for scalar/regular and irregular/input dependent accesses into two sub-categories for static and stack data.
The ratio of load-store instructions to literals is up to 33.5% and is significant for most programs of the benchmark set. On the one hand, most programs have a large share (between 16.9% and 99.9%) of data accesses to statics. On the other hand, stack data represent a large part of data accesses for three of the eight benchmarks (between 33.2% and 69.8%). All programs, except Histogram, make use of irregular/input dependent accesses (e.g. memory accesses through pointers). Moreover, two benchmark programs make intensive (24.5% and 6.%) use of such accesses. In conclusion, accesses to both static data and stack data are important. We have to propose a method to calculate the targets for any access type of memory accesses found in programs.

2.2. Calculation of targets of data memory accesses

In programming languages such as C or C++, programs typically employ pointers to array elements, dynamic data structures (e.g. linked lists) and procedure parameters. As shown in the previous section, irregular and input dependent access types represent a large part of load-store instructions. Traditionally, pointer analysis has been used in compilers to build aliasing information [4, 11]. In this paper, we propose to reuse the existing pointer analysis methods of the compiler to determine the possible data accessed by any load-store instruction of a program. In order to exhaustively associate any memory access with its (possibly multiple) data target(s), we have to apply pointer analysis to (i) all pointer definitions of the program interprocedurally, and to (ii) the text of the whole program [11] with its related libraries.

Figure 1. Pointer analysis in compilation process. (Diagram labels: program sources (C, C++, ...), language front-end, collection of intermediate representations, pointer analysis, aliasing information, code transformations 1..n, code generation back-end, output assembly, load-store instructions targets annotations.)

As shown in Figure 1, a compiler infrastructure typically contains a collection of intermediate representations [].
A set of code transformations is applied iteratively on the intermediate representations. Pointer analyses must be performed in the early phases of program transformation; this information is carried through the rest of the optimization phases as annotations on the intermediate representations. Then, the code generation back-end phase translates a low-level intermediate representation (similar to [12]) to the output assembly file. GCC (the GNU C Compiler) supports whole-program compilation and currently provides an intraprocedural pointer analysis [4]: targets of pointers passed as procedure parameters are not computed. For the purpose of this paper's study, we have slightly modified the compiler infrastructure to apply the pointer analysis interprocedurally; the compiler keeps results of

pointer analyses during the whole compilation lifetime. We have also modified the ARM back-end to produce the set of possible pointer targets for each generated load-store instruction in the output assembly file. The pointer analysis applied in this paper supports static and stack storage types. None of our real-time benchmarks make use of dynamic heap allocation. In our study, we do not differentiate between individual elements of arrays or between the fields of data structures (whereas such information is computed in the current pointer analysis implementation of the compiler [4]). Moreover, the stack data of the whole stack frame of each function are managed as an individual data structure instance.

2.3. Related work on determination of load-store instructions targets

Some approaches have previously succeeded in generating information on some access types. Disassembly of binary files enables extraction of scalar accesses to static data; some dataflow analysis techniques have been applied on assembly code to extend scalar access detection to stack data [3, 9, 13]. Data dependence analysis techniques and loop induction analyses have been applied on the low-level representation of VPO [19] to determine regular accesses in [29]. [24] uses a processor simulator to generate a program memory profile. The profile contains the trace of all memory addresses accessed. The trace must cover all instructions of the program and directly associates an observed target data address with each load-store instruction. This approach enables analysis of any scalar memory access. Non-scalar access calculation is possible by checking the inclusion of the caught address in the address range of any known data. This approach guarantees detection of the target of any memory access to a unique data. An external module, based on abstract interpretation techniques [5], has been employed on the intermediate representation of the SUIF compiler [3] for pointer analysis.
The results of this analysis are re-associated after code generation with the output assembly within the execution of the SimpleScalar simulator. Their approach is the most similar to ours. The aim of their study is the impact of memory access aliasing information on the scheduling of the processor memory request queue [5].

3. Dynamic scratchpad memory allocation

The previous section has presented a complete program analysis framework to determine the targets of load-store instructions of a program. In this section, we use this information to define at compile-time a data allocation in a single scratchpad memory device. First, we describe the program flowgraph representation considered (Section 3.1). Second, a 0/1 integer linear program (ILP) formulation is given for the allocation of static data (Section 3.2), based on initial knowledge of frequencies along the worst-case execution path. We apply this formulation on the considered flowgraph to generate an ILP problem. One solution to this ILP problem provides the location of memory transfer operations on the considered flowgraph. The formulation is later extended to handle stack data (Section 3.2.3). Finally, we describe an iterative algorithm to tackle instability of the worst-case execution path of a program. This algorithm incrementally generates the ILP problem for a better WCET optimization.

3.1. Flowgraphs and computation of worst-case execution path information

We have multiple choices for placement of memory transfer operations (e.g. on function entry and exit, on basic block boundaries, etc.). We propose to introduce a generic graph representation of the program flow. The chosen representation level of the generated program flowgraph may lead to different placements of memory transfer operations. In Figures 2 and 3, the (right-side) flowgraph is generated from the (left-side) original graph representation. The edges in the generated flowgraph are required to describe any possible flow of execution of the program.

Figure 2. Call graph transformation to a (coarse-grain) flowgraph. (Diagram nodes: main(), proc_a(), proc_b(), proc_c(), proc_d(), start.)
For example, one can build a flowgraph from the original call graph of an application (see Figure 2). There is one node in the flowgraph for each function in the call graph. We can also build a flowgraph from the interprocedural control flow graph of the application (see Figure 3). Here, there is one node in the flowgraph for each basic block in the interprocedural control flow graph. Other levels of representation are possible; one may balance between coarseness and size of the resulting flowgraph. The size of the flowgraph has a practical impact on the complexity of the subsequent memory allocation problem, as shown later in the experimental results of Section 4.2.

Previous approaches for dynamic scratchpad allocation [15, 26, 27] have focused on the optimization of the average case. Data access statistics are typically computed from execution profiling with a training input. In order to reduce the WCET of a real-time application, we rely on information on data memory accesses along the worst-case execution path of the program, obtained using WCET analysis. Consequently, we apply static timing analysis as an initial step to determine this information, as proposed in [21, 25]. Heptane produces information on the execution frequencies of individual basic blocks on the worst-case execution path. Since we are able to compute the set of targets for each load-store instruction (see Section 2.2), we can determine the impact

of any data for each basic block of the worst-case execution path. Moreover, we are able to determine if this data is MOD (modified) or USE (used) during the execution of this basic block. The outgoing edges of the generated flowgraph associated with these basic blocks are annotated with this information.

Figure 3. Control flow graph transformation to a (fine-grain) flowgraph.
[Figure 4]
[Figure 5]

More formally, the flowgraph is a directed graph with the following definitions:

N = Number of nodes in flowgraph;
E = Number of edges in flowgraph;
$e_j$ = jth edge of flowgraph, $j \in [1, E]$;
$C_{e_j}(v)$ = Estimated contribution to WCET reduction for data $v$ scratchpad-allocated on edge $e_j$;
$U_{e_j}(v)$ = Type of usage of data $v$ on edge $e_j$, where $U_{e_j}(v) \in \{MOD, USE\}$.

Some real-time applications are designed to be activated from multiple entry points. We have added the start node to the flowgraph to represent these flows of execution. Some edges are added to link the start node with any possible program entry. The start node artificially acts as a single entry point for the program. In the same way, all program exits are linked to this start node.

3.2. Formulation for static data

We consider an initial problem formulation to allocate static data only, with the following definitions:

M = Size of scratchpad memory;
G = Number of static data in application;
$v_i$ = ith static data, $i \in [1, G]$;
$S(v_i)$ = Size of variable $v_i$ in bytes;
$X_{copy}(v_i)$ = Time to transfer variable $v_i$ between main memory and scratchpad, in cycles.

The optimization problem is formulated as a 0/1 integer linear programming problem. We define the following set of binary variables, $\forall i \in [1, G]$, $\forall j \in [1, E]$:

\[ \mathit{load}^{v_i}_{e_j} = \begin{cases} 1 & \text{if data } v_i \text{ is transferred to scratchpad memory at the beginning of edge } e_j, \\ 0 & \text{otherwise.} \end{cases} \]

\[ \mathit{store}^{v_i}_{e_j} = \begin{cases} 1 & \text{if data } v_i \text{ is transferred back to main memory at the end of edge } e_j, \\ 0 & \text{otherwise.} \end{cases} \]

\[ \mathit{alloc\_rw}^{v_i}_{e_j} = \begin{cases} 1 & \text{if mutable data } v_i \text{ is allocated on scratchpad memory on edge } e_j, \\ 0 & \text{otherwise.} \end{cases} \]
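\[ \mathit{alloc\_ro}^{v_i}_{e_j} = \begin{cases} 1 & \text{if read-only data } v_i \text{ is allocated on scratchpad memory on edge } e_j, \\ 0 & \text{otherwise.} \end{cases} \]

Variables $\mathit{load}^{v_i}_{e_j}$ and $\mathit{store}^{v_i}_{e_j}$ determine where data $v_i$ are respectively loaded to and stored from scratchpad memory. Variables $\mathit{alloc\_rw}^{v_i}_{e_j}$ / $\mathit{alloc\_ro}^{v_i}_{e_j}$ give the modified / not modified state of the scratchpad-allocated data $v_i$. A modified data $v_i$ must be transferred back to main memory at the end of its allocation. The objective function to maximize is the sum of contributions to the WCET of all memory accesses to allocated static data in the application, minus the cost of transfer operations of data between main memory and scratchpad memory:

\[ \sum_{i=1}^{G} \sum_{j=1}^{E} \Big( \mathit{alloc\_rw}^{v_i}_{e_j}\, C_{e_j}(v_i) + \mathit{alloc\_ro}^{v_i}_{e_j}\, C_{e_j}(v_i) - \mathit{load}^{v_i}_{e_j}\, X_{copy}(v_i) - \mathit{store}^{v_i}_{e_j}\, X_{copy}(v_i) \Big) \]

Preliminary constraints have to be added to prevent inconsistencies on binary variables. Data $v_i$ is allocated on scratchpad memory with $\mathit{alloc\_rw}$ or $\mathit{alloc\_ro}$ exclusively, $\forall i \in [1, G]$, $\forall j \in [1, E]$:

\[ \mathit{alloc\_rw}^{v_i}_{e_j} + \mathit{alloc\_ro}^{v_i}_{e_j} \le 1 \qquad (1) \]

The MOD and USE annotations of the edges of the flowgraph have a direct impact on the problem formulation. We have to unset $\mathit{alloc\_ro}$ variables for edges that may update this data:

\[ \mathit{alloc\_ro}^{v_i}_{e_j} = 0 \quad \text{if } U_{e_j}(v_i) = MOD \qquad (2) \]

Flow constraints

Figure 4 illustrates the need for and objective of flow constraints. Let us consider data $v_i$ allocated on scratchpad memory on adjacent and connected edges $e_{j-1}$ and $e_j$. In this example, the data is loaded on the execution of $e_{j-1}$ and stored back to main memory at the end of the execution of $e_j$. $\forall i \in [1, G]$, $\forall (j-1, j) \in ([1, E], [1, E])$, where $e_{j-1}$ is an incoming edge of $e_j$:

\[ \mathit{alloc\_rw}^{v_i}_{e_j} - \mathit{alloc\_rw}^{v_i}_{e_{j-1}} - \mathit{alloc\_ro}^{v_i}_{e_{j-1}} - \mathit{load}^{v_i}_{e_j} \le 0 \qquad (3) \]

\[ \mathit{alloc\_rw}^{v_i}_{e_{j-1}} - \mathit{store}^{v_i}_{e_{j-1}} - \mathit{alloc\_rw}^{v_i}_{e_j} \le 0 \qquad (4) \]

\[ \mathit{alloc\_ro}^{v_i}_{e_j} - \mathit{alloc\_ro}^{v_i}_{e_{j-1}} - \mathit{load}^{v_i}_{e_j} \le 0 \qquad (5) \]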
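The objective function and Constraints 1 to 5 can be exercised by brute force on a toy instance. The following is a minimal sketch, not the solver setup used in this paper: the three-edge chain flowgraph, the single data, the gain and copy costs are all hypothetical values, and the handling of the first and last edge (no incoming edge, no outgoing edge) is an assumption made only to keep the example self-contained.

```python
from itertools import product

# Toy chain flowgraph e0 -> e1 -> e2 with a single static data v (hypothetical values).
EDGES = [0, 1, 2]
PRED = {0: [], 1: [0], 2: [1]}            # incoming edges of each edge
USAGE = {0: "USE", 1: "MOD", 2: "USE"}    # U_e(v)
GAIN = {0: 0, 1: 50, 2: 0}                # C_e(v): cycles saved if v is in scratchpad on e
X_COPY = 10                               # X_copy(v): cycles per transfer


def feasible(load, store, rw, ro):
    """Check Constraints 1 to 5 for one 0/1 assignment."""
    for e in EDGES:
        if rw[e] + ro[e] > 1:                        # Constraint 1
            return False
        if USAGE[e] == "MOD" and ro[e]:              # Constraint 2
            return False
        if not PRED[e] and rw[e] + ro[e] > load[e]:  # entry edge must load (assumption)
            return False
        for p in PRED[e]:
            if rw[e] > rw[p] + ro[p] + load[e]:      # Constraint 3
                return False
            if rw[p] > store[p] + rw[e]:             # Constraint 4
                return False
            if ro[e] > ro[p] + load[e]:              # Constraint 5
                return False
    last = EDGES[-1]
    return rw[last] <= store[last]   # exit edge writes back modified data (assumption)


def objective(load, store, rw, ro):
    """Gain of scratchpad accesses minus the cost of load and store transfers."""
    return sum((rw[e] + ro[e]) * GAIN[e] - (load[e] + store[e]) * X_COPY for e in EDGES)


def solve():
    """Enumerate all assignments and return (best objective, [load, store, rw, ro])."""
    n = len(EDGES)
    best = None
    for bits in product((0, 1), repeat=4 * n):
        x = [bits[k * n:(k + 1) * n] for k in range(4)]
        if feasible(*x):
            value = objective(*x)
            if best is None or value > best[0]:
                best = (value, x)
    return best
```

On this instance the only profitable edge is e1, where the data is modified, so any optimal assignment pays exactly one load and one store around a read-write allocation. Exhaustive enumeration is only practical for such tiny cases; it stands in for a real ILP solver here.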

Constraint 3 enables scratchpad-allocation of data $v_i$ on edge $e_j$ if this data was already scratchpad-allocated on the incoming edge $e_{j-1}$, or if this data is loaded on edge $e_j$. Constraint 4 ensures that data $v_i$, updated on edge $e_{j-1}$, must either be stored and transferred to main memory or be alloc_rw on the next edge $e_j$. Constraint 5 ensures that data $v_i$, read-only allocated on edge $e_j$, must be loaded on this edge or be alloc_ro on the incoming edge $e_{j-1}$.

Figure 5 illustrates a node with multiple outgoing edges. Constraint 6 guarantees consistent values for the variables of the outgoing edges of a node in the flowgraph. $\forall i \in [1, G]$, $\forall (j, j') \in ([1, E], [1, E])$ where edges $e_j$ and $e_{j'}$ are outgoing edges of the same node:

\[ \mathit{alloc\_rw}^{v_i}_{e_j} + \mathit{alloc\_ro}^{v_i}_{e_j} - \mathit{alloc\_rw}^{v_i}_{e_{j'}} - \mathit{alloc\_ro}^{v_i}_{e_{j'}} = 0 \qquad (6) \]

Finally, Constraint 7 specifies the upper bound on the sum of the sizes of all allocated data on each edge, $\forall j \in [1, E]$:

\[ \sum_{i=1}^{G} \Big( \mathit{alloc\_rw}^{v_i}_{e_j}\, S(v_i) + \mathit{alloc\_ro}^{v_i}_{e_j}\, S(v_i) \Big) \le M \qquad (7) \]

Optional support of dynamically scheduled architectures

WCET analysis requires complete knowledge of instruction execution times. In dynamically scheduled architectures [17], pipeline modeling should take into account all possible timings for each instruction with varying timing, increasing the complexity of the WCET analysis [16]. For example, load-store instructions may have multiple execution latencies if possible data targets are stored in different memories with heterogeneous latencies. In order to reduce the complexity of WCET analysis, we would like to guarantee a unique timing for each load-store instruction. Therefore, we have to enforce allocation of all target data of a load-store instruction to the same level of the memory hierarchy (here, the scratchpad memory or the main memory). Constraint 8 enforces removal of timing anomalies due to data memory accesses. $\forall j \in [1, E]$, $\forall (i_1, i_2) \in ([1, G], [1, G])$ where $v_{i_1}$ and $v_{i_2}$ are possibly accessed on $e_j$ by the same load-store instruction:

\[ \mathit{alloc\_rw}^{v_{i_1}}_{e_j} + \mathit{alloc\_ro}^{v_{i_1}}_{e_j} - \mathit{alloc\_rw}^{v_{i_2}}_{e_j} - \mathit{alloc\_ro}^{v_{i_2}}_{e_j} = 0 \qquad (8) \]

The impact of Constraint 8 may directly depend on the number of possible targets of the pointers in programs.
However, the StrongARM-1 [22] is a statically scheduled architecture and does not enable an evaluation of this constraint in the experiments of this paper.

Memory data address assignment

Our formulation provides an optimistic solution to data allocation (variables alloc_rw or alloc_ro) on the edges of the flowgraph: optimistic in the sense that not all data selected by the ILP problem resolution necessarily fit in scratchpad memory, due to fragmentation. An address assignment algorithm has been proposed in [27] to place data in scratchpad memory at compile-time. If no free place is found for one data, this data is simply left in main memory. In their approach, each data can be transferred multiple times between main memory and scratchpad memory; however, each data must have only one address in the scratchpad memory for the whole program execution. We propose an improvement to the address assignment algorithm of [27] with the detection of individual data regions. Our proposal, detailed in Algorithm 1, may decrease placement conflicts of data in scratchpad memory. A data region is defined by a subgraph of connected edges of the flowgraph where data $v_i$ is scratchpad-allocated. Algorithm 1 enables the assignment of a different scratchpad memory address for each data region.

Algorithm 1 Address assignment algorithm with detection of data regions
1: data_regions ← extract data regions from computed allocation
2: sort data_regions list on their impacting order on WCET
3: for all individual region (data, edge_set) from data_regions list do
4:   if data fits in free memory space on edges of edge_set then
5:     select free placement for data on edges of edge_set with first-fit policy
6:   else
7:     remove the unallocatable data region
8:   end if
9: end for
10: return data_regions

First, Algorithm 1 reads the optimistic allocation computed from the ILP problem and generates a list of data regions (line 1). This step requires an analysis of the connected components of the flowgraph for each allocated data. This list is then sorted by the impact of each individual data region on the program WCET (line 2).
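Lines 2 to 10 of Algorithm 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation evaluated in this paper: data regions are taken as already extracted (line 1 is not shown), each region is a hypothetical tuple (data name, size in bytes, set of edge identifiers, WCET impact), and the example regions are made up.

```python
def assign_addresses(regions, capacity):
    """First-fit address assignment per data region (hypothetical data layout).

    regions:  list of (name, size, edge_set, impact) tuples
    capacity: scratchpad size in bytes
    Returns a list of (name, edge_set, address) for the regions that were placed.
    """
    occupied = {}   # edge -> list of (start, end) intervals already in use on that edge
    placed = []
    # Line 2: most impacting regions are placed first.
    for name, size, edges, impact in sorted(regions, key=lambda r: -r[3]):
        busy = sorted(iv for e in edges for iv in occupied.get(e, []))
        # The lowest free address is either 0 or the end of an occupied interval.
        candidates = sorted([0] + [end for _, end in busy])
        for addr in candidates:
            fits = addr + size <= capacity and all(
                addr + size <= start or addr >= end for start, end in busy)
            if fits:   # lines 4-5: first fit on every edge of the region
                for e in edges:
                    occupied.setdefault(e, []).append((addr, addr + size))
                placed.append((name, frozenset(edges), addr))
                break
        # Lines 6-7: a region that fits nowhere is dropped (data stays in main memory).
    return placed


# Made-up example: data "a" has two regions, which may receive different addresses.
EXAMPLE_REGIONS = [
    ("a", 4, {1, 2}, 100),
    ("d", 4, {4}, 90),
    ("b", 4, {2, 3}, 80),
    ("c", 8, {3}, 50),
    ("a", 4, {4}, 10),
]
```

With an 8-byte scratchpad, `assign_addresses(EXAMPLE_REGIONS, 8)` places the first region of "a" at address 0 and its second region at address 4, while the region of "c" cannot be placed and is removed.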
We must try to assign a concrete address to each data region. For all edges covered by a data region, we find a valid slot to assign the data (lines 3-9). If a data region cannot be loaded in scratchpad memory, we ignore this data region and simply remove all transfer operations for this data in this region.

3.2.3. Extension for stack data

We now consider an extension of our previous formulation to support the limited lifetime of stack data. These data require neither initialization nor content backup to main memory at the end of their lifetime. Similarly to static data, the flowgraph is annotated with MOD or USE for any usage of stack data. The DEF attribute is now defined on function entry and on function exit for stack data associated with the function life span. DEAD is an attribute set on flowgraph edges to avoid memory transfers for non-live stack data.

F = Number of stack data in the application;
$f_i$ = ith stack data, $i \in [1, F]$;
$U_{e_j}(f_i)$ = Type of usage of variable $f_i$ on edge $e_j$, where $U_{e_j}(f_i) \in \{DEF, MOD, USE, DEAD\}$;
$S(f_i)$, $C_{e_j}(f_i)$, $X_{copy}(f_i)$ and the variables alloc_rw, alloc_ro, load and store are similarly defined for stack data.

The general flow Constraints 3, 4, 5 and 6 for static data are directly applicable to stack data. The objective function to maximize is the contribution to WCET reduction of all accesses to the static data and to the stack data in the application, minus all the dynamic transfers of data between main memory and scratchpad memory:

\[ \sum_{i=1}^{G} \sum_{j=1}^{E} \Big( \mathit{alloc\_rw}^{v_i}_{e_j}\, C_{e_j}(v_i) + \mathit{alloc\_ro}^{v_i}_{e_j}\, C_{e_j}(v_i) - \mathit{load}^{v_i}_{e_j}\, X_{copy}(v_i) - \mathit{store}^{v_i}_{e_j}\, X_{copy}(v_i) \Big) + \sum_{i=1}^{F} \sum_{j=1}^{E} \Big( \mathit{alloc\_rw}^{f_i}_{e_j}\, C_{e_j}(f_i) + \mathit{alloc\_ro}^{f_i}_{e_j}\, C_{e_j}(f_i) - \mathit{load}^{f_i}_{e_j}\, X_{copy}(f_i) - \mathit{store}^{f_i}_{e_j}\, X_{copy}(f_i) \Big) \]

Stack data are created and destroyed on function entry and on function exit (where $U_{e_j}(f_i) = DEF$). Consequently, these data do not require memory transfer operations (Constraints 9 and 10). On such edges, stack data are initialized with default values and Constraint 11 forbids read-only allocation. Moreover, we enforce (Constraint 12) zero-cost memory transfer operations to scratchpad before and after the stack data lifetime, $\forall i \in [1, F]$, $\forall j \in [1, E]$:

\[ \mathit{load}^{f_i}_{e_j} = 0 \quad \text{if } U_{e_j}(f_i) = DEF \qquad (9) \]

\[ \mathit{store}^{f_i}_{e_j} = 0 \quad \text{if } U_{e_j}(f_i) = DEF \qquad (10) \]

\[ \mathit{alloc\_ro}^{f_i}_{e_j} = 0 \quad \text{if } U_{e_j}(f_i) = DEF \qquad (11) \]

\[ X_{copy}(f_i) = 0 \quad \text{if } U_{e_j}(f_i) = DEAD \qquad (12) \]

As described in [2], stack data may have disjoint lifetimes. The program call graph is analyzed to provide information on the lifetime of stack data:

L = The set of all leaf nodes in the call graph;
NP(l) = Total number of unique paths to the lth leaf node in the call graph, $l \in [1, |L|]$;
$P_t(l)$ = The set of stack data definitions on the tth unique path to the lth leaf node in the call graph, $t \in [1, NP(l)]$.

The $P_t(l)$ sets enumerate all possible combinations of stack data that are simultaneously alive. The size constraint should be formulated as, $\forall j \in [1, E]$, $\forall l \in L$, $\forall t \in [1, NP(l)]$:

\[ \sum_{i=1}^{G} \Big( \mathit{alloc\_rw}^{v_i}_{e_j}\, S(v_i) + \mathit{alloc\_ro}^{v_i}_{e_j}\, S(v_i) \Big) + \sum_{f_i \in P_t(l)} \Big( \mathit{alloc\_rw}^{f_i}_{e_j}\, S(f_i) + \mathit{alloc\_ro}^{f_i}_{e_j}\, S(f_i) \Big) \le M \qquad (13) \]

An additional extension would be to support heap allocated data as proposed in [8]. The DEF machinery is an ideal attribute to define limited-lifetime data and may be the basis for such an extension of our formulation to heap data.
Currently, real-time benchmark programs rarely employ dynamic heap allocation and would not enable us to conduct a complete study on dynamically allocated data.

Support for instability of the worst-case execution path

For many programs, the worst-case execution path of the program may change after some data allocations. Consequently, it may be necessary to evaluate all possible combinations of data allocations to find the optimal reduction of the WCET of the program. However, an exhaustive evaluation of all possible combinations would be too time-consuming [25]. A greedy heuristic that greatly enhances the quality of allocation has been proposed for WCET-centric static scratchpad memory allocation in [25]. The outline of the method is to iteratively allocate one data to scratchpad memory and to (re-)estimate data frequency information at every iteration. We propose to adapt their approach to ILP problem-based dynamic scratchpad memory allocation schemes in Algorithm 2.

Algorithm 2 Iterative dynamic allocation algorithm
1: allocations ← empty
2: repeat
3:   change ← false
4:   perform WCET estimation
5:   extract information on worst-case execution path
6:   generate dynamic allocation ILP problem
7:   generate additional Constraints (14) for allocations
8:   new_allocations ← call solver on ILP problem
9:   if new_allocations ≠ empty then
10:    change ← true
11:    allocations ← allocations ∪ most impacting allocation from new_allocations
12:  end if
13: until change = false
14: return allocations

The main idea behind Algorithm 2 is to incrementally refine the ILP problem formulation to support greedy allocation of the most impacting data. Initially, the worst-case execution path of the application is determined (line 4) and data access information is computed (line 5). An initial ILP problem (line 6) is generated and the problem solver computes a list of data to allocate (line 8). On subsequent iterations, WCET estimation is performed again and Constraint 14 is added to the problem formulation (line 7) to enforce allocation of the selected data.
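The control structure of Algorithm 2 can be sketched with the timing analyser and the ILP solver replaced by callables. This is an illustration only: the two interfaces `estimate_wcet` and `solve_ilp`, the two-path toy program and its latencies are hypothetical, and the toy solver does not build an actual ILP problem.

```python
def iterative_allocation(estimate_wcet, solve_ilp):
    """Greedy loop of Algorithm 2 (hypothetical interfaces).

    estimate_wcet(allocations) -> access information on the current worst-case path
    solve_ilp(info, forced)    -> dict {data: WCET gain}, with `forced` kept allocated
    """
    allocations = []
    while True:
        info = estimate_wcet(allocations)              # lines 4-5
        candidates = solve_ilp(info, allocations)      # lines 6-8
        new = {d: g for d, g in candidates.items() if d not in allocations and g > 0}
        if not new:                                    # line 13: no change, stop
            return allocations
        allocations.append(max(new, key=new.get))      # line 11: most impacting data


# Toy program with two paths; each path accesses one data a given number of times.
PATHS = {"p1": {"a": 10}, "p2": {"b": 9}}
MAIN_LATENCY, SPM_LATENCY = 10, 1


def path_cost(accesses, allocations):
    """Cycles spent in data accesses on one path for a given allocation."""
    return sum(n * (SPM_LATENCY if d in allocations else MAIN_LATENCY)
               for d, n in accesses.items())


def toy_estimate(allocations):
    """Return the access counts of the path that is currently the worst case."""
    return max(PATHS.values(), key=lambda acc: path_cost(acc, allocations))


def toy_solve(info, forced):
    """Stand-in for the solver: gain of moving each not yet allocated data."""
    gain = MAIN_LATENCY - SPM_LATENCY
    return {d: n * gain for d, n in info.items() if d not in forced}
```

In this toy run, allocating "a" makes the second path the worst case, so the next iteration selects "b"; the loop then stops because nothing on the new worst-case path remains to be allocated.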
$\forall i \in [1, G]$, $\forall j \in [1, E]$ where $v_i$ has been selected for allocation on $e_j$:

\[ \mathit{alloc\_rw}^{v_i}_{e_j} + \mathit{alloc\_ro}^{v_i}_{e_j} = 1 \qquad (14) \]

The algorithm selects and allocates the most impacting (not already allocated) data to the scratchpad memory (line 11). This process is applied iteratively until no more allocation can reduce the WCET of the application. Similarly, we have modified the ILP problem formulation of [2] to obtain an iterative static memory allocation algorithm. This algorithm is used in the experiments of this paper. Lines 6-7 of Algorithm 2 are modified (1) to generate the static allocation ILP problem of [2] and (2) to generate additional constraints that enforce allocation of selected data on subsequent iterations. Due to lack of space, we do not describe in this paper the additional constraints applied to the ILP problem of [2] (line 7).

4. Results

The evaluation of the approach for dynamic scratchpad memory allocation is performed for the StrongARM-1 processor [22]. Benchmark programs are compiled using

a modified GCC 4.1 compiler (see Section 2.2 for the modifications applied to the compiler to compute load-store instructions targets). The compiler generates two files: the output program and an additional file for load-store instructions targets annotations. Then, the Heptane timing analyzer reads the program binary and the annotation file with load-store instructions targets to model the complete memory behavior of the program. The maximum number of iterations of each program loop is given as an annotation to enable determination of the possible execution paths in the program. This study reports results on optimized code with loop-related optimizations disabled. The Heptane timing analyser supports the pipelined execution and the instruction cache of the StrongARM-1. The latency of a word access to main memory is 11 cycles [22]. The latency for accesses to data allocated in the scratchpad memory is 1 cycle. A penalty model for scratchpad memory transfer operations is integrated into the timing analysis of Heptane. We applied a penalty latency of 12 cycles per word of data to transfer. The commercial ILP solver CPLEX (footnote 2) is configured to stop on the first valid solution found. The proposed technique is evaluated on an assorted set of benchmarks from the WCET benchmarks (footnote 3), Powerstone [23] and UTDSP (footnote 4).

Columns: Benchmark; Source; Lines of code; Static data size; Max. stack size; Load-store inst. ratio; Description
Adpcm      WCET B.     87 bytes  116 bytes  39%  Speech coding
Engine     Powerstone  bytes  116 bytes  6%  Engine control
G721       Powerstone  bytes  284 bytes  16%  Voice compression
Histogram  UTDSP       bytes  bytes  39%  Image enhancing application
Lpc        UTDSP       bytes  72 bytes  22%  Speech coding
Pocsag     Powerstone  bytes  112 bytes  22%  Communication protocol
Spectral   UTDSP       bytes  116 bytes  44%  Speech power spectral estimation
Statemate  WCET B      bytes  132 bytes  6%  Car window lift control

Table 3. Information on benchmark programs.

4.1. Scratchpad allocation results

We have undertaken a comparison of the impact on program WCET of our iterative dynamic scheme over non-iterative static scratchpad memory allocation [2].
In this study, (fine-grained) flowgraphs are generated from the interprocedural control flow graph of the benchmark programs. This gives the maximum latitude for placement of memory transfers in programs. Figure 6 gives the improvement ratio of iterative dynamic scratchpad memory allocation over non-iterative static allocation (y-axis) for a range of scratchpad memory sizes (x-axis), computed as

\[ \frac{\text{\#cycles reduction from iterative dynamic allocation}}{\text{\#cycles reduction from non-iterative static allocation}} \]

Figure 6 additionally gives the improvement of iterative static allocation over non-iterative static allocation, providing insight into the stability of the programs' worst-case execution paths.

2 ILOG CPLEX, high-performance software for mathematical programming and optimization: cplex/
3 WCET benchmarks: wcet/benchmarks.html
4 UTDSP DSP Benchmark Suite: edu/~corinna/dsp/infrastructure/utdsp.html

For five out of eight benchmarks, iterative static allocation improves on non-iterative static scratchpad memory allocation by up to 3%, particularly for programs with a large amount of control flow (Engine, Pocsag, Statemate). Histogram is a typical example of the benefit of the dynamic capability of our scratchpad memory allocation method. This program contains two frequently used arrays of 24 bytes, separately used in two program phases. Static allocation succeeds in placing one of these two arrays in a 24 bytes scratchpad memory. Dynamic allocation moves these two arrays alternately into the scratchpad memory unit, for an improvement of 47% over the original performance enhancement due to a static scratchpad allocation. On a scratchpad memory larger than 48 bytes, there is enough room to statically place the two arrays, and both schemes yield an identical WCET value. Major benefits for dynamic scratchpad allocation are achieved for small ratios of scratchpad memory size over the whole program data working set. For example, dynamic scratchpad allocation is valuable for scratchpad size ratios lower than % of the working set for the programs Adpcm, Engine and G721.
On these ranges, the method outperforms static allocation by 12% to 85%. The approach is notably profitable for systems with a scratchpad memory shared among several real-time tasks. Thanks to the support of stack data (typically smaller than 32 bytes), our method takes advantage of very small scratchpad memory sizes, except for programs Histogram, Lpc and Spectral. These benchmarks have very few stack data instances (see Table 4) or few accesses to stack data (see Table 2).

We have conducted some preliminary evaluations of the address assignment algorithms described in Section 3 (memory data address assignment). In the experiments of [27], the address assignment algorithm is shown to be fairly close to the optimal address assignment. In our experiments, the algorithm with detection of data regions gives marginal performance improvements over the address assignment algorithm of [27]. The iterative allocation algorithm selects data in order of performance impact. Consequently, data with a high performance impact have a higher chance of getting a valid address assignment. Programs Adpcm, G721 and Lpc get a relative performance increase of 3%-7% when the detection of data regions is enabled. Gains are observed for small (less than 3 bytes) scratchpad memories. The scratchpad memory usage is high for such configurations and many data are transferred to scratchpad memory multiple times.

Figure 6. Improvement (in percent) of iterative dynamic and iterative static allocations over non-iterative static allocation. (Panels: adpcm, engine, g721, histogram, lpc, pocsag, spectral, statemate; series: iterative dynamic allocation, iterative static allocation.)

Solver execution time

Allocation solving time depends tightly on the number of variables of the ILP problem. The number of variables for the static allocation problem is O(D), where D is the number of static data and stack data in the program. In our experiments, there is implicitly one stack data instance for each defined function. The counts of static data and functions in the programs of the benchmark set are given in Table 4. The number of variables for the dynamic allocation problem is O(D × E), where E is the number of edges in the flowgraph. The number of edges depends on the representation level of the flowgraph. Table 4 gives the number of functions and the number of basic blocks (BBs) of the programs. In this table, the number of functions of a program gives an idea of the size of the (coarse-grain) flowgraph generated from its call graph. In the same way, the number of basic blocks of a program gives an idea of the size of the (fine-grain) flowgraph generated from its interprocedural control flow graph.

In our experiments, we have observed that CPLEX running time is worst for the scratchpad memory size configurations where dynamic scratchpad memory allocation gives the best improvement over static scratchpad memory allocation. Table 4 compares the maximum observed running time for the CPLEX solver to produce a solution for (A) the static allocation problem, (B) the dynamic allocation problem with a (coarse-grain) flowgraph, and (C) the dynamic allocation problem with a (fine-grain) flowgraph. For the same program, B has fewer possible placements for memory transfer operations than C and consequently gives a lower-quality allocation. The final column of Table 4 gives the relative allocation quality reduction $\frac{B - A}{C - A}$. A value of 100% means that B allocation is as efficient as C, and 0% means that B provides results as low as the static allocation A.
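As a small worked example of the last column of Table 4, the sketch below evaluates the ratio under the assumption that it reads (B - A) / (C - A), with A, B and C taken as the WCET reductions obtained by each allocation. The function name `relative_quality` and the sample values are hypothetical and are not results from the paper.

```python
def relative_quality(a, b, c):
    """Return (b - a) / (c - a) as a percentage.

    a, b, c are the gains (for instance cycles removed from the WCET)
    of static allocation (A), coarse-grain dynamic allocation (B) and
    fine-grain dynamic allocation (C). This reading is an assumption.
    """
    if c == a:
        # Fine-grain dynamic allocation brings nothing over static
        # allocation, so there is no gain to apportion.
        raise ValueError("ratio undefined when C equals A")
    return 100.0 * (b - a) / (c - a)


# Made-up gains, in cycles.
print(relative_quality(100, 100, 300))  # B no better than A
print(relative_quality(100, 200, 300))  # B halfway between A and C
print(relative_quality(100, 300, 300))  # B as good as C
```

The guard on `c == a` matters in practice: for scratchpad sizes where dynamic allocation brings no gain over static allocation the ratio carries no information.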
This ratio is computed for the scratchpad memory size where C does its best over the static allocation A. First of all, the running time of the ILP solver is typically not an issue for any static scratchpad memory allocation problem. Second, programs (Adpcm, G721, Pocsag, Statemate) with a large number of data and a large generated (fine-grain) flowgraph may have a huge solver running time. Applying our method to many more benchmark programs may enable us to draw general conclusions about the impact of the number of ILP variables on the solver's running time.

Unsurprisingly, B (dynamic allocation at function granularity) produces lower-quality results than C (dynamic allocation at basic-block granularity) for most programs. One can remark that B is as efficient as C for two of the eight benchmarks (Engine, Statemate), even though the corresponding solving times are shorter. Conversely, Histogram contains only one function, and B allocation is strictly equivalent to the static allocation A.

The major conclusion of this study is two-fold. First, the practical limitation of our method is the running time needed to solve ILP problems, which is problematic for the largest benchmarks studied in this paper. Second, approaches exist to scale up the applicability of our method to larger programs. A coarse flowgraph induces smaller ILP problems, potentially leading to a lower allocation quality. An orthogonal approach may be to apply the method to regions of a program (i.e. program subgraphs), to generate smaller ILP sub-problems. Moreover, it should be profitable to ignore some non-profitable data within a region, reducing the number of data considered in the generated sub-problems.

Benchmark | Data | Functions | BBs | Solving time A (static) | Solving time B | Solving time C | Allocation quality improv. ratio
Adpcm | | | | s | s | 179s | 59.2%
Engine | | | | s | 1s | 35s | .%
G721 | | | | s | 9s | 42s | 14.3%
Histogram | | | | s | 1s | 1s | .%
Lpc | | | | s | 1s | s | 68.8%
Pocsag | | | | s | 2s | 51s | 47.%
Spectral | | | | s | 1s | 2s | 39.8%
Statemate | | | | s | 7s | 8367s | .%

Table 4. Program sizes vs. problem solving time.

5. Related work

A main issue for dynamic scratchpad memory allocation is the preliminary selection of possible placements for memory transfer operations. [15] and [27] consider placement of memory transfer operations at the level of basic blocks. [26] proposes to restrict memory transfer operations to interesting program points, such as function, conditional or loop entries/exits with high execution frequencies, in a flexible way. Moreover, [26] associates execution timestamps with program points in order to capture the program execution context. Data access statistics are recorded using these timestamps on a profiled execution. The manageable granularity of the flowgraph enables flexible selection of possible places for memory transfer operations. Moreover, the support of program execution order in our flowgraph seems possible through the replication of subgraphs of the generated flowgraph. However, it is unclear how WCET analysers could generate useful data access information in association with execution timestamps.

The formulation for dynamic scratchpad memory allocation introduced in Section 3.2 is an adaptation of [27] to manage read-only and modified data. The main benefit is to avoid useless store memory transfer operations from scratchpad to the reference copy in main memory for non-modified data. The support of stack data described in Section 3.3 is similar to the work on static memory allocation in [2], applied to our formulation for dynamic memory allocation. [7] studies allocation of spilled data to a small and fast direct-addressed memory. Our formulation for dynamic scratchpad memory allocation supports stack data and supersedes this original work. [2] contains an interesting study on the granularity of allocation: whole stack frames (as applied in this paper's experiments) or individual stack data. Each stack data item can be allocated in a different memory; hence, the program has to manage multiple stack pointers during its execution.
Their study concludes that individual stack data allocation gives a marginal performance increase over whole stack frame allocation, due to the increased cost of managing multiple stacks.

[25] propose an algorithm for greedy static allocation to scratchpad memory. Their algorithm iteratively (i) evaluates the worst-case execution path of the application and (ii) selects and allocates the most impacting (not already allocated) data to the scratchpad memory, then applies (i) and (ii) again until no more allocation in the free memory space is possible. Our approach differs because the solver is iteratively called on a refined ILP problem and it supports allocation of stack data. Moreover, our approach is portable to both static and dynamic scratchpad memory allocation ILP problems.

In Section 3.2.3, due to scratchpad memory fragmentation, we have proposed to leave unallocatable data in main memory. Memory compaction is an interesting alternative to rearrange data in scratchpad memory. [26] has shown that such a mechanism has a minor impact on program performance. [26] also addresses major implementation issues concerning static data and stack data relocation for dynamic scratchpad memory allocation.

Fixed-size scratchpad memories are unable to hold data that are too large and cannot benefit from temporal locality in accesses to such data, unlike data caches. As studied in [15], program transformations such as tiling of big arrays enable better scratchpad memory usage and increase the global effectiveness of the allocation.

6. Conclusion and future work

The main contributions of this paper are two-fold. First, we have described an approach to calculate the targets of load-store instructions. Our approach is based on a common compiler infrastructure and relies on the presence of an interprocedural pointer analysis. Exhaustive knowledge of load-store instruction targets in a program requires a whole-program analysis mode, available in our compiler infrastructure. Second, we have proposed a dynamic scratchpad memory allocation algorithm to support both static data and stack data.
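The worst-case-path-driven selection attributed to [25] above, on which the iterative scheme of this paper also relies, can be sketched on a toy program model. This is a simplified illustration and not the algorithm of [25] or the ILP-based method of this paper: a program is reduced to a list of alternative paths with per-data access counts, the latencies are invented, and the loop stops as soon as no data on the current worst-case path fits in the remaining space. All names are hypothetical.

```python
MAIN_LATENCY = 10   # cycles per main-memory access (assumed value)
SPM_LATENCY = 1     # cycles per scratchpad access (assumed value)


def path_cost(accesses, in_spm):
    """Cycles spent on data accesses along one path."""
    return sum(
        count * (SPM_LATENCY if name in in_spm else MAIN_LATENCY)
        for name, count in accesses.items()
    )


def worst_case_path(paths, in_spm):
    """Return (cost, accesses) of the most expensive path."""
    return max(((path_cost(p, in_spm), p) for p in paths), key=lambda t: t[0])


def greedy_allocate(paths, sizes, capacity):
    """Repeat: find the worst path, place its most accessed unplaced data."""
    in_spm, free = set(), capacity
    while True:
        _, worst = worst_case_path(paths, in_spm)   # step (i)
        candidates = [
            d for d in worst if d not in in_spm and sizes[d] <= free
        ]
        if not candidates:
            return in_spm
        best = max(candidates, key=lambda d: worst[d])  # step (ii)
        in_spm.add(best)
        free -= sizes[best]


# Two alternative paths; values are access counts per data object.
paths = [{"a": 100, "b": 10}, {"b": 20, "c": 60}]
sizes = {"a": 64, "b": 32, "c": 64}   # bytes

chosen = greedy_allocate(paths, sizes, capacity=96)
print(sorted(chosen), worst_case_path(paths, chosen)[0])
```

In this example the first path is the worst one initially, so `a` is placed first. The worst-case path then switches to the second path, which is why the next choice is `b` (the only data on that path that still fits) rather than anything else on the first path. This switch is the variability of worst-case paths that motivates re-running the analysis after each allocation step.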
Our approach attempts to reduce the WCET of real-time programs by allocating the most impacting data on their worst-case execution paths. Due to the variability of the worst-case execution paths in programs [25], we have applied an iterative scheme for data allocation. This scheme requires multiple iterations of WCET program analysis and it has demonstrated improved results [25]. Our experiments have shown the increasing computational complexity of allocation problem solving with program size. To tackle this issue, we have proposed to limit data transfers to the entry and exit of functions, reducing the allocation problem size and leading to an absolute decrease of allocation quality.

Scratchpad memory allocation of data provides a fully predictable latency of load-store instructions [1]. Furthermore, some compiler optimizations (e.g. instruction scheduling) could take advantage of such information for better code generation. [28] have compared static scratchpad memory allocation with some instruction cache WCET analyses. We plan to compare our approach for dynamic scratchpad memory allocation with data cache analyses.

In this paper, we have considered a system with only one scratchpad memory device. The optimal static memory allocation [2] supports multiple scratchpad memory devices. This may drastically increase the number of variables of the generated allocation problem. We leave such an extension for dynamic memory allocation as future work.

Acknowledgments. The authors thank Olivier Rochecouste for his comments that helped improve the quality of this paper.

References

[1] S. G. Abraham, R. A. Sugumar, D. Windheiser, B. R. Rau, and R. Gupta. Predictability of load/store instruction latencies. In Proceedings of the 26th Annual International Symposium on Microarchitecture, pages , Austin, TX, Dec
[2] O. Avissar, R. Barua, and D. Stewart. An optimal memory allocation scheme for scratch-pad-based embedded systems. ACM Transactions on Embedded Computing Systems, 1(1):6 26, Nov. 2.
[3] G. Balakrishnan and T. W. Reps. Analyzing memory accesses in x86 executables. In Proceedings of the 13th International Conference on Compiler Construction, volume 2985 of Lecture Notes in Computer Science, pages 5 23, Barcelona, Spain, Mar. 4.
[4] D. Berlin. Structure aliasing in GCC. In Proceedings of the 5 GCC Developer's Summit, pages 25 35, Ottawa, Canada, June 5.
[5] H. Cassé, L. Féraud, C. Rochange, and P. Sainrat. Using Abstract Interpretation Techniques for Static Pointer Analysis. Computer Architecture News, 27(1):47 5, Mar
[6] A. Colin and I. Puaut. A modular & retargetable framework for tree-based WCET analysis. In Proceedings of the 13th Euromicro Conference on Real-Time Systems, pages 37 44, Delft, The Netherlands, June 1.
[7] K. D. Cooper and T. J. Harvey. Compiler-controlled memory. In Proceedings of the 8th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 2 11, San Jose, CA, Oct
[8] A. Dominguez, S. Udayakumaran, and R. Barua. Heap data allocation to scratch-pad memory in embedded systems. Journal of Embedded Computing, 1(4):521 54, July 5.
[9] C. Ferdinand, R. Heckmann, M. Langenbach, F. Martin, M. Schmidt, H. Theiling, S. Thesing, and R. Wilhelm. Reliable and precise WCET determination for a real-life processor. In Proceedings of the 1st International Workshop on Embedded Software, volume 2211 of Lecture Notes in Computer Science, pages , Tahoe City, CA, Oct. 1.
[10] L. J. Hendren, C. Donawa, M. Emami, G. R. Gao, Justiani, and B. Sridharan. Designing the McCAT compiler based on a family of structured intermediate representations. In Proceedings of the 5th International Workshop on Languages and Compilers for Parallel Computing, pages 46 4, New Haven, CT, Aug
[11] M. Hind. Pointer analysis: haven't we solved this problem yet? In Proceedings of the ACM SIGPLAN-SIGSOFT 1 Workshop on Program Analysis for Software Tools and Engineering, pages 54 61, Snowbird, UT, June 1.
[12] R. E. Johnson, C. McConnell, and J. M. Lake. The RTL system: A framework for code optimization. In Proceedings of the International Workshop on Code Generation, pages , Dagstuhl, Germany, May
[13] S.-K. Kim, S. L. Min, and R. Ha. Efficient worst case timing analysis of data caching. In Proceedings of the 2nd IEEE Real-Time Technology and Applications Symposium, pages 23 24, Brookline, MA, June
[14] R. Kirner and P. P. Puschner. Classification of WCET analysis techniques. In Proceedings of the 8th IEEE International Symposium on Object-Oriented Real-Time Distributed Computing, pages , Seattle, WA, May 5.
[15] L. Li, L. Gao, and J. Xue. Memory coloring: A compiler approach for scratchpad memory management. In Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques, pages , St. Louis, MO, Sept. 5.
[16] X. Li, A. Roychoudhury, and T. Mitra. Modeling out-of-order processors for WCET analysis. Real-Time Systems, 34(3): , Nov. 6.
[17] T. Lundqvist and P. Stenström. Timing anomalies in dynamically scheduled microprocessors. In Proceedings of the th IEEE Real-Time Systems Symposium, pages 12 21, Phoenix, AZ, Dec
[18] T. Lundqvist and P. Stenström. A method to improve the estimated worst-case performance of data caching. In Proceedings of the 6th International Conference on Real-Time Computing Systems and Applications, pages , Hong Kong, China, Dec
[19] J. W. D. Manuel E. Benitez. A portable global optimizer and linker. In Proceedings of the ACM SIGPLAN 1988 Conference on Programming Language Design and Implementation, pages , Atlanta, GA, June
[20] S. Mehrotra and L. Harrison. Examination of a memory access classification scheme for pointer-intensive and numeric programs. In Proceedings of the 1996 International Conference on Supercomputing, pages , Philadelphia, PA, May
[21] I. Puaut and C. Pais. Scratchpad memories vs locked caches in hard real-time systems, a quantitative comparison. In Proceedings of the 7 Conference on Design Automation and Test Europe, pages , Nice, France, Apr. 7.
[22] SA-1 microprocessor timing: an application note. Digital Equipment Corporation, June
[23] J. Scott, L. H. Lee, J. Arends, and B. Moyer. Designing the low-power mcore architecture. In Proceedings of the Workshop on Power Driven Microarchitecture, pages , Barcelona, Spain, June
[24] J. Staschulat and R. Ernst. Worst case timing analysis of input dependent data cache behavior. In Proceedings of the 18th Euromicro Conference on Real-Time Systems, pages , Dresden, Germany, July 6.
[25] V. Suhendra, T. Mitra, A. Roychoudhury, and T. Chen. WCET centric data allocation to scratchpad memory. In Proceedings of the 26th IEEE Real-Time Systems Symposium, pages , Miami, FL, Dec. 5.
[26] S. Udayakumaran, A. Dominguez, and R. Barua. Dynamic allocation for scratch-pad memory using compile-time decisions. ACM Transactions on Embedded Computing Systems, 5(2): , May 6.
[27] M. Verma and P. Marwedel. Overlay techniques for scratchpad memories in low-power embedded processors. IEEE Transactions on Very Large Scale Integration Systems, 4(8):82 815, Aug. 6.
[28] L. Wehmeyer and P. Marwedel. Influence of memory hierarchies on predictability for time constrained embedded software. In Proceedings of 5 Design, Automation and Test in Europe Conference and Exposition, pages 6 65, Munich, Germany, Mar. 5.
[29] R. T. White, F. Mueller, C. A. Healy, D. B. Whalley, and M. G. Harmon. Timing analysis for data and wrap-around fill caches. Real-Time Systems, 17(2-3):9 233, Nov
[30] R. P. Wilson, R. S. French, C. S. Wilson, S. P. Amarasinghe, J.-A. M. Anderson, S. W. K. Tjiang, S.-W. Liao, C.-W. Tseng, M. W. Hall, M. S. Lam, and J. L. Hennessy. SUIF: An infrastructure for research on parallelizing and optimizing compilers. SIGPLAN Notices, 29(12):31 37, Dec


More information

Simulation Based Analysis of FAST TCP using OMNET++

Simulation Based Analysis of FAST TCP using OMNET++ Smulaton Based Analyss of FAST TCP usng OMNET++ Umar ul Hassan 04030038@lums.edu.pk Md Term Report CS678 Topcs n Internet Research Sprng, 2006 Introducton Internet traffc s doublng roughly every 3 months

More information

Feature Reduction and Selection

Feature Reduction and Selection Feature Reducton and Selecton Dr. Shuang LIANG School of Software Engneerng TongJ Unversty Fall, 2012 Today s Topcs Introducton Problems of Dmensonalty Feature Reducton Statstc methods Prncpal Components

More information

Reliability and Energy-aware Cache Reconfiguration for Embedded Systems

Reliability and Energy-aware Cache Reconfiguration for Embedded Systems Relablty and Energy-aware Cache Reconfguraton for Embedded Systems Yuanwen Huang and Prabhat Mshra Department of Computer and Informaton Scence and Engneerng Unversty of Florda, Ganesvlle FL 326-62, USA

More information

CHAPTER 2 PROPOSED IMPROVED PARTICLE SWARM OPTIMIZATION

CHAPTER 2 PROPOSED IMPROVED PARTICLE SWARM OPTIMIZATION 24 CHAPTER 2 PROPOSED IMPROVED PARTICLE SWARM OPTIMIZATION The present chapter proposes an IPSO approach for multprocessor task schedulng problem wth two classfcatons, namely, statc ndependent tasks and

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Loop Transformations, Dependences, and Parallelization

Loop Transformations, Dependences, and Parallelization Loop Transformatons, Dependences, and Parallelzaton Announcements Mdterm s Frday from 3-4:15 n ths room Today Semester long project Data dependence recap Parallelsm and storage tradeoff Scalar expanson

More information

Sorting Review. Sorting. Comparison Sorting. CSE 680 Prof. Roger Crawfis. Assumptions

Sorting Review. Sorting. Comparison Sorting. CSE 680 Prof. Roger Crawfis. Assumptions Sortng Revew Introducton to Algorthms Qucksort CSE 680 Prof. Roger Crawfs Inserton Sort T(n) = Θ(n 2 ) In-place Merge Sort T(n) = Θ(n lg(n)) Not n-place Selecton Sort (from homework) T(n) = Θ(n 2 ) In-place

More information

IP Camera Configuration Software Instruction Manual

IP Camera Configuration Software Instruction Manual IP Camera 9483 - Confguraton Software Instructon Manual VBD 612-4 (10.14) Dear Customer, Wth your purchase of ths IP Camera, you have chosen a qualty product manufactured by RADEMACHER. Thank you for the

More information

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1)

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1) Secton 1.2 Subsets and the Boolean operatons on sets If every element of the set A s an element of the set B, we say that A s a subset of B, or that A s contaned n B, or that B contans A, and we wrte A

More information

CMPS 10 Introduction to Computer Science Lecture Notes

CMPS 10 Introduction to Computer Science Lecture Notes CPS 0 Introducton to Computer Scence Lecture Notes Chapter : Algorthm Desgn How should we present algorthms? Natural languages lke Englsh, Spansh, or French whch are rch n nterpretaton and meanng are not

More information

Greedy Technique - Definition

Greedy Technique - Definition Greedy Technque Greedy Technque - Defnton The greedy method s a general algorthm desgn paradgm, bult on the follong elements: confguratons: dfferent choces, collectons, or values to fnd objectve functon:

More information

Petri Net Based Software Dependability Engineering

Petri Net Based Software Dependability Engineering Proc. RELECTRONIC 95, Budapest, pp. 181-186; October 1995 Petr Net Based Software Dependablty Engneerng Monka Hener Brandenburg Unversty of Technology Cottbus Computer Scence Insttute Postbox 101344 D-03013

More information

Related-Mode Attacks on CTR Encryption Mode

Related-Mode Attacks on CTR Encryption Mode Internatonal Journal of Network Securty, Vol.4, No.3, PP.282 287, May 2007 282 Related-Mode Attacks on CTR Encrypton Mode Dayn Wang, Dongda Ln, and Wenlng Wu (Correspondng author: Dayn Wang) Key Laboratory

More information

Repeater Insertion for Two-Terminal Nets in Three-Dimensional Integrated Circuits

Repeater Insertion for Two-Terminal Nets in Three-Dimensional Integrated Circuits Repeater Inserton for Two-Termnal Nets n Three-Dmensonal Integrated Crcuts Hu Xu, Vasls F. Pavlds, and Govann De Mchel LSI - EPFL, CH-5, Swtzerland, {hu.xu,vasleos.pavlds,govann.demchel}@epfl.ch Abstract.

More information

TN348: Openlab Module - Colocalization

TN348: Openlab Module - Colocalization TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages

More information

Load-Balanced Anycast Routing

Load-Balanced Anycast Routing Load-Balanced Anycast Routng Chng-Yu Ln, Jung-Hua Lo, and Sy-Yen Kuo Department of Electrcal Engneerng atonal Tawan Unversty, Tape, Tawan sykuo@cc.ee.ntu.edu.tw Abstract For fault-tolerance and load-balance

More information

Verification by testing

Verification by testing Real-Tme Systems Specfcaton Implementaton System models Executon-tme analyss Verfcaton Verfcaton by testng Dad? How do they know how much weght a brdge can handle? They drve bgger and bgger trucks over

More information

A Facet Generation Procedure. for solving 0/1 integer programs

A Facet Generation Procedure. for solving 0/1 integer programs A Facet Generaton Procedure for solvng 0/ nteger programs by Gyana R. Parja IBM Corporaton, Poughkeepse, NY 260 Radu Gaddov Emery Worldwde Arlnes, Vandala, Oho 45377 and Wlbert E. Wlhelm Teas A&M Unversty,

More information

Array transposition in CUDA shared memory

Array transposition in CUDA shared memory Array transposton n CUDA shared memory Mke Gles February 19, 2014 Abstract Ths short note s nspred by some code wrtten by Jeremy Appleyard for the transposton of data through shared memory. I had some

More information

Design and Analysis of Algorithms

Design and Analysis of Algorithms Desgn and Analyss of Algorthms Heaps and Heapsort Reference: CLRS Chapter 6 Topcs: Heaps Heapsort Prorty queue Huo Hongwe Recap and overvew The story so far... Inserton sort runnng tme of Θ(n 2 ); sorts

More information

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices Internatonal Mathematcal Forum, Vol 7, 2012, no 52, 2549-2554 An Applcaton of the Dulmage-Mendelsohn Decomposton to Sparse Null Space Bases of Full Row Rank Matrces Mostafa Khorramzadeh Department of Mathematcal

More information

MODULE DESIGN BASED ON INTERFACE INTEGRATION TO MAXIMIZE PRODUCT VARIETY AND MINIMIZE FAMILY COST

MODULE DESIGN BASED ON INTERFACE INTEGRATION TO MAXIMIZE PRODUCT VARIETY AND MINIMIZE FAMILY COST INTERNATIONAL CONFERENCE ON ENGINEERING DESIGN, ICED 07 28-31 AUGUST 2007, CITE DES SCIENCES ET DE L'INDUSTRIE, PARIS, FRANCE MODULE DESIGN BASED ON INTERFACE INTEGRATION TO MAIMIZE PRODUCT VARIETY AND

More information

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters Proper Choce of Data Used for the Estmaton of Datum Transformaton Parameters Hakan S. KUTOGLU, Turkey Key words: Coordnate systems; transformaton; estmaton, relablty. SUMMARY Advances n technologes and

More information

Active Contours/Snakes

Active Contours/Snakes Actve Contours/Snakes Erkut Erdem Acknowledgement: The sldes are adapted from the sldes prepared by K. Grauman of Unversty of Texas at Austn Fttng: Edges vs. boundares Edges useful sgnal to ndcate occludng

More information

ELEC 377 Operating Systems. Week 6 Class 3

ELEC 377 Operating Systems. Week 6 Class 3 ELEC 377 Operatng Systems Week 6 Class 3 Last Class Memory Management Memory Pagng Pagng Structure ELEC 377 Operatng Systems Today Pagng Szes Vrtual Memory Concept Demand Pagng ELEC 377 Operatng Systems

More information

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z.

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z. TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS Muradalyev AZ Azerbajan Scentfc-Research and Desgn-Prospectng Insttute of Energetc AZ1012, Ave HZardab-94 E-mal:aydn_murad@yahoocom Importance of

More information

Network Coding as a Dynamical System

Network Coding as a Dynamical System Network Codng as a Dynamcal System Narayan B. Mandayam IEEE Dstngushed Lecture (jont work wth Dan Zhang and a Su) Department of Electrcal and Computer Engneerng Rutgers Unversty Outlne. Introducton 2.

More information

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Some materal adapted from Mohamed Youns, UMBC CMSC 611 Spr 2003 course sldes Some materal adapted from Hennessy & Patterson / 2003 Elsever Scence Performance = 1 Executon tme Speedup = Performance (B)

More information

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Proceedngs of the Wnter Smulaton Conference M E Kuhl, N M Steger, F B Armstrong, and J A Jones, eds A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Mark W Brantley Chun-Hung

More information

Maintaining temporal validity of real-time data on non-continuously executing resources

Maintaining temporal validity of real-time data on non-continuously executing resources Mantanng temporal valdty of real-tme data on non-contnuously executng resources Tan Ba, Hong Lu and Juan Yang Hunan Insttute of Scence and Technology, College of Computer Scence, 44, Yueyang, Chna Wuhan

More information

CSE 326: Data Structures Quicksort Comparison Sorting Bound

CSE 326: Data Structures Quicksort Comparison Sorting Bound CSE 326: Data Structures Qucksort Comparson Sortng Bound Steve Setz Wnter 2009 Qucksort Qucksort uses a dvde and conquer strategy, but does not requre the O(N) extra space that MergeSort does. Here s the

More information

Lecture 15: Memory Hierarchy Optimizations. I. Caches: A Quick Review II. Iteration Space & Loop Transformations III.

Lecture 15: Memory Hierarchy Optimizations. I. Caches: A Quick Review II. Iteration Space & Loop Transformations III. Lecture 15: Memory Herarchy Optmzatons I. Caches: A Quck Revew II. Iteraton Space & Loop Transformatons III. Types of Reuse ALSU 7.4.2-7.4.3, 11.2-11.5.1 15-745: Memory Herarchy Optmzatons Phllp B. Gbbons

More information

Performance Evaluation of Information Retrieval Systems

Performance Evaluation of Information Retrieval Systems Why System Evaluaton? Performance Evaluaton of Informaton Retreval Systems Many sldes n ths secton are adapted from Prof. Joydeep Ghosh (UT ECE) who n turn adapted them from Prof. Dk Lee (Unv. of Scence

More information

ARTICLE IN PRESS. Signal Processing: Image Communication

ARTICLE IN PRESS. Signal Processing: Image Communication Sgnal Processng: Image Communcaton 23 (2008) 754 768 Contents lsts avalable at ScenceDrect Sgnal Processng: Image Communcaton journal homepage: www.elsever.com/locate/mage Dstrbuted meda rate allocaton

More information

Run-Time Operator State Spilling for Memory Intensive Long-Running Queries

Run-Time Operator State Spilling for Memory Intensive Long-Running Queries Run-Tme Operator State Spllng for Memory Intensve Long-Runnng Queres Bn Lu, Yal Zhu, and lke A. Rundenstener epartment of Computer Scence, Worcester Polytechnc Insttute Worcester, Massachusetts, USA {bnlu,

More information

Concurrent Apriori Data Mining Algorithms

Concurrent Apriori Data Mining Algorithms Concurrent Apror Data Mnng Algorthms Vassl Halatchev Department of Electrcal Engneerng and Computer Scence York Unversty, Toronto October 8, 2015 Outlne Why t s mportant Introducton to Assocaton Rule Mnng

More information

An Integer Linear Programming Approach for Identifying Instruction-Set Extensions

An Integer Linear Programming Approach for Identifying Instruction-Set Extensions An Integer Lnear Programmng Approach for Identfyng Instructon-Set Extensons Kublay Atasu Department of Computer Engneerng Bogazc Unversty, Turkey atasu@boun.edu.tr Günhan Dündar Department of Electrcal

More information

Memory Modeling in ESL-RTL Equivalence Checking

Memory Modeling in ESL-RTL Equivalence Checking 11.4 Memory Modelng n ESL-RTL Equvalence Checkng Alfred Koelbl 2025 NW Cornelus Pass Rd. Hllsboro, OR 97124 koelbl@synopsys.com Jerry R. Burch 2025 NW Cornelus Pass Rd. Hllsboro, OR 97124 burch@synopsys.com

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation Intellgent Informaton Management, 013, 5, 191-195 Publshed Onlne November 013 (http://www.scrp.org/journal/m) http://dx.do.org/10.36/m.013.5601 Qualty Improvement Algorthm for Tetrahedral Mesh Based on

More information

SENSITIVITY ANALYSIS IN LINEAR PROGRAMMING USING A CALCULATOR

SENSITIVITY ANALYSIS IN LINEAR PROGRAMMING USING A CALCULATOR SENSITIVITY ANALYSIS IN LINEAR PROGRAMMING USING A CALCULATOR Judth Aronow Rchard Jarvnen Independent Consultant Dept of Math/Stat 559 Frost Wnona State Unversty Beaumont, TX 7776 Wnona, MN 55987 aronowju@hal.lamar.edu

More information

Sorting: The Big Picture. The steps of QuickSort. QuickSort Example. QuickSort Example. QuickSort Example. Recursive Quicksort

Sorting: The Big Picture. The steps of QuickSort. QuickSort Example. QuickSort Example. QuickSort Example. Recursive Quicksort Sortng: The Bg Pcture Gven n comparable elements n an array, sort them n an ncreasng (or decreasng) order. Smple algorthms: O(n ) Inserton sort Selecton sort Bubble sort Shell sort Fancer algorthms: O(n

More information

Storage Binding in RTL synthesis

Storage Binding in RTL synthesis Storage Bndng n RTL synthess Pe Zhang Danel D. Gajsk Techncal Report ICS-0-37 August 0th, 200 Center for Embedded Computer Systems Department of Informaton and Computer Scence Unersty of Calforna, Irne

More information

An Efficient Garbage Collection for Flash Memory-Based Virtual Memory Systems

An Efficient Garbage Collection for Flash Memory-Based Virtual Memory Systems S. J and D. Shn: An Effcent Garbage Collecton for Flash Memory-Based Vrtual Memory Systems 2355 An Effcent Garbage Collecton for Flash Memory-Based Vrtual Memory Systems Seunggu J and Dongkun Shn, Member,

More information

VISUAL SELECTION OF SURFACE FEATURES DURING THEIR GEOMETRIC SIMULATION WITH THE HELP OF COMPUTER TECHNOLOGIES

VISUAL SELECTION OF SURFACE FEATURES DURING THEIR GEOMETRIC SIMULATION WITH THE HELP OF COMPUTER TECHNOLOGIES UbCC 2011, Volume 6, 5002981-x manuscrpts OPEN ACCES UbCC Journal ISSN 1992-8424 www.ubcc.org VISUAL SELECTION OF SURFACE FEATURES DURING THEIR GEOMETRIC SIMULATION WITH THE HELP OF COMPUTER TECHNOLOGIES

More information

3. CR parameters and Multi-Objective Fitness Function

3. CR parameters and Multi-Objective Fitness Function 3 CR parameters and Mult-objectve Ftness Functon 41 3. CR parameters and Mult-Objectve Ftness Functon 3.1. Introducton Cogntve rados dynamcally confgure the wreless communcaton system, whch takes beneft

More information

3D vector computer graphics

3D vector computer graphics 3D vector computer graphcs Paolo Varagnolo: freelance engneer Padova Aprl 2016 Prvate Practce ----------------------------------- 1. Introducton Vector 3D model representaton n computer graphcs requres

More information

Performance Study of Parallel Programming on Cloud Computing Environments Using MapReduce

Performance Study of Parallel Programming on Cloud Computing Environments Using MapReduce Performance Study of Parallel Programmng on Cloud Computng Envronments Usng MapReduce Wen-Chung Shh, Shan-Shyong Tseng Department of Informaton Scence and Applcatons Asa Unversty Tachung, 41354, Tawan

More information

Hybrid Heuristics for the Maximum Diversity Problem

Hybrid Heuristics for the Maximum Diversity Problem Hybrd Heurstcs for the Maxmum Dversty Problem MICAEL GALLEGO Departamento de Informátca, Estadístca y Telemátca, Unversdad Rey Juan Carlos, Span. Mcael.Gallego@urjc.es ABRAHAM DUARTE Departamento de Informátca,

More information