Fast Searches for Effective Optimization Phase Sequences


Prasad Kulkarni 1, Stephen Hines 1, Jason Hiser 2, David Whalley 1, Jack Davidson 2, Douglas Jones 3

1 Computer Science Dept., Florida State University, Tallahassee, FL; whalley@cs.fsu.edu
2 Computer Science Dept., University of Virginia, Charlottesville, VA 22904; jwd@virginia.edu
3 Electrical and Computer Eng. Dept., University of Illinois, Urbana, IL 61801; dl-jones@uiuc.edu

ABSTRACT

It has long been known that a fixed ordering of optimization phases will not produce the best code for every application. One approach for addressing this phase ordering problem is to use an evolutionary algorithm to search for a specific sequence of phases for each module or function. While such searches have been shown to produce more efficient code, the approach can be extremely slow because the application is compiled and executed to evaluate each sequence's effectiveness. Consequently, evolutionary or iterative compilation schemes have been promoted for compilation systems targeting embedded applications where longer compilation times may be tolerated in the final stage of development. In this paper we describe two complementary general approaches for achieving faster searches for effective optimization sequences when using a genetic algorithm. The first approach reduces the search time by avoiding unnecessary executions of the application when possible. Results indicate search time reductions of 65% on average, often reducing searches from hours to minutes. The second approach modifies the search so fewer generations are required to achieve the same results. Measurements show that the average number of required generations decreased by 68%. These improvements have the potential for making evolutionary compilation a viable choice for tuning embedded applications.

Categories and Subject Descriptors
D.3.4 [Programming Languages]: Processors - compilers, optimization; D.4.7 [Operating Systems]: Organization and Design - real-time systems and embedded systems.
General Terms
Measurement, Performance, Experimentation, Algorithms.

Keywords
Phase ordering, interactive compilation, genetic algorithms.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. PLDI '04, June 9-11, 2004, Washington, DC, USA. Copyright 2004 ACM /04/0006$

1. INTRODUCTION

The phase ordering problem has long been known to be a difficult dilemma for compiler writers [17, 19]. One sequence of optimization phases is highly unlikely to be the most effective sequence for every application (or even for each function within a single application) on a given machine. Whether or not a particular optimization enables or disables opportunities for subsequent optimizations is difficult to predict since it depends on the application being compiled, the previously applied optimizations, and the target architecture [19].

One approach to deal with this problem is to search for effective optimization phase sequences using genetic algorithms [5, 11]. When the fitness criteria for such searches involve dynamic measures (e.g., cycle counts or power consumption), thousands of direct executions of an application may be required. The search time can be significant, often needing hours or days when finding effective sequences for a single application, making it less attractive for developers.

There are application areas where long compilation times are acceptable. For example, long compilation times may be tolerated in application areas where the problem size is directly related to the execution time to solve the problem. In fact, the size of many computational chemistry and high-energy physics problems is limited by the elapsed time to reach a solution (typically a few days or a week).
Long compilation times may be acceptable if the resulting code allows larger problem instances to be solved in the same amount of time.

Evolutionary compilation systems have also been proposed for compilation systems targeting embedded systems where meeting strict constraints on execution time, code size, and power consumption is paramount. Here long compilation times are acceptable because in the final stages of development an application is compiled and embedded in a product where millions of units may be shipped. For embedded systems, the problem is further exacerbated because the software development environment is often different from the target environment. Obtaining performance measures on cross-platform development environments often requires simulation which can be orders of magnitude slower than native execution. Even when it is possible to use the target machine to gather performance data directly, the embedded processor may be significantly slower (slower clock rate, less memory, etc.) than available general-purpose processors. We have found that searching for an effective optimization sequence can easily require hours or days even when using direct execution on a general-purpose processor. For example, using a conventional genetic algorithm to search for effective optimization sequences for the jpeg application on an Ultra SPARC III processor required over 20 hours to

complete. Thus, finding effective sequences to tune an embedded application may result in an intolerably long search time.

In this paper we describe approaches for achieving faster searches for effective optimization sequences using a genetic algorithm. We performed our experiments using the VISTA (VPO Interactive System for Tuning Applications) framework [20]. One feature of VISTA is that it can automatically obtain performance feedback information which can be presented to the user and can be used to make phase ordering decisions [11]. We use this performance information to drive the genetic algorithm searches for effective optimization sequences.

The remainder of the paper is structured as follows. First, we review other aggressive compilation techniques that have been used to tune applications. Second, we give an overview of the VISTA framework in which our experiments are performed. Third, we describe methods for reducing the overhead of the searches for effective sequences. Fourth, we discuss techniques for finding effective sequences in fewer generations. Fifth, we show results that indicate the effectiveness of using our techniques to perform faster searches for optimization sequences. Finally, we outline future work and present the conclusions of the paper.

2. RELATED WORK

Prior work has used aggressive compilation techniques to improve performance. Superoptimizers have been developed that use an exhaustive search for instruction selection [12] or to eliminate branches [7]. Selecting the best combination of optimizations by turning on or off optimization flags, as opposed to varying the order of optimizations, has also been investigated [4].

Some systems perform transformations and use performance feedback information to tune applications. Iterative techniques using performance feedback information after each compilation have been applied to determine good optimization parameters (e.g., blocking sizes) for specific programs or library routines [10, 18]. Another technique uses compile-time performance estimation [16].
All of these systems are limited in the set of optimizations they apply.

Specifications of code-improving transformations have been automatically analyzed to determine if one type of transformation can enable or disable another [19]. This information can provide insight into how to specify an effective optimization phase ordering for a conventional optimizing compiler.

A number of systems have been developed that use evolutionary algorithms to improve compiler optimizations. A neural network has been used to tune static branch predictions [3]. Genetic algorithms have been used to better parallelize loop nests [13]. Another system used genetic algorithms to derive improved compiler heuristics for hyperblock formation, register allocation, and data prefetching [15]. A low-level compilation system developed at Rice University uses a genetic algorithm to reduce code size by finding efficient optimization phase sequences [5, 6]. The Rice system uses a genetic algorithm similar to the one in VISTA for finding phase sequences. However, the Rice system is batch oriented instead of interactive and applies the same optimization phase order for all of the functions within a file. Some aspects of the approaches described in our paper may be useful for obtaining faster searches in all of these systems.

3. THE VISTA FRAMEWORK

This section provides a brief overview of the framework used for the experiments reported in this paper. A more detailed description of VISTA's architecture can be found in prior publications [20, 11]. Figure 1 illustrates the flow of information in VISTA, which consists of a compiler and a viewer. The programmer initially indicates a file to be compiled and then specifies requests through the viewer, which include sequences of optimization phases, manually specified transformations, and queries. The compiler performs the specified actions and sends program representation information back to the viewer.
Each time an optimization sequence is selected for the function being tuned, the compiler instruments the code, produces assembly code, links and executes the program, and gets performance measures from the execution. When the user chooses to terminate the session, VISTA writes the sequence of transformations to a file so they can be reapplied at a later time, enabling future updates.

Figure 1: Interactive Code Improvement Process
(Diagram labels: User, Viewer, Compiler, EASE, Source File, Assembly File, Linked File, Executable, Saved State, Requests, Selections, Display, Program Representation Info., Transformation Info., Measure Request, Performance Measures, New Instructions.)

The compiler used in VISTA is based on VPO (Very Portable Optimizer), which is a compiler back end that performs all of its optimizations on a single low-level representation called RTLs (register transfer lists) [1, 2]. Because VPO uses a single representation, it can apply most analyses and optimization phases repeatedly and in an arbitrary order. This feature facilitates finding more effective sequences of optimization phases.

Figure 2 shows a snapshot of the viewer with the history of a sequence of optimization phases displayed. Note that not only is the number of transformations associated with each optimization phase displayed, but also the improvements in instructions executed and code size are shown. This information allows a user to quickly gauge the progress that has been made in improving the function. The frequency of each basic block relative to the function is also shown in each block header line, which allows a user to identify the critical regions of a function.

VISTA allows a user to specify a set of distinct optimization phases and have the compiler attempt to find the best sequence for applying these phases. Figure 3 shows the different options that we provide the user to control the search. The user specifies the sequence length, which is the total number of phases applied in each sequence.
Our experiments use the biased sampling search, which applies a genetic algorithm in an attempt to find the most effective sequence within a limited amount of time since in many cases the search space is too large to evaluate all possible sequences [9]. A population is the set of solutions (sequences)

that are under consideration. The number of generations indicates how many sets of populations are to be evaluated. The population size and the number of generations limit the total number of sequences evaluated. VISTA also allows the user to choose dynamic and static weight factors, where the relative improvement of each is used to determine the overall fitness.

Figure 2: Main Window of VISTA Showing History of Optimization Phases
Figure 3: Selecting Options to Search for Possible Sequences
Figure 4: Window Showing the Search Status

Performing these searches is time consuming, typically requiring tens of minutes for a single function, and hours or days for an entire application even when using direct execution. Thus, VISTA provides a window showing the current search status. Figure 4 shows a snapshot of the status of the search selected in Figure 3. The percentage of sequences completed, the best sequence, and its effect on performance are given. The user can terminate the search at any point and accept the best sequence found so far.

4. REDUCING THE SEARCH OVERHEAD

Performing a search for an effective optimization phase sequence can be quite expensive, perhaps requiring hours or days for an entire application even when using direct execution. One obvious benefit of speeding up these searches is that the technique is more likely to be used. Another benefit is that the search can be made more aggressive, such as increasing the number of generations, in an attempt to produce a better tuned application.

VISTA performs the following tasks to obtain dynamic performance measurements for a single sequence.

(1) The compiler applies the optimization phases in the order specified by the sequence.
(2) The generated code for the function is instrumented if required to obtain performance measurements, and the assembly code for that function and the remaining assembly code for the functions in the current source file are written to a file.
(3) The newly generated assembly file is assembled.
(4) The object files comprising the entire program are linked together into an executable by a command supplied in a configuration file.
(5) The program is executed using a command in a configuration file, which may involve direct execution or simulation. As a side effect of the execution, performance measurements are produced.
(6) The output of the execution is compared to the desired output to provide assurance that the new sequence did not cause the generated code to become invalid.

Tasks 2-6 often dominate the search time, which is probably due to these tasks requiring I/O and task 1 being performed in memory. The following subsections describe methods to reduce the search overhead by inferring the outcome of a sequence. Figure 5 illustrates the order in which the different methods are attempted. The methods are ordered according to cost. Each method handles

a superset of the sequences handled by the methods applied before it, but the later methods are more expensive.

Figure 5: Methods for Reducing Search Overhead
(Diagram stages: Genetic Algorithm, which takes the candidate phases and emits the next sequence and the best sequence; Check Attempted Sequences; apply phases; Check Active Sequences; calculate unmapped checksum; Check for Identical Function; calculate mapped checksum; Check for Equivalent Function; generate executable; Execute Application. When a check reports "found", the previous measure is returned; otherwise execution produces a new measure.)

4.1 Finding Redundant Attempted Sequences

Sometimes the same optimization phase sequence is reattempted during the search. Consider Figure 6, where each optimization phase in a sequence is represented by a letter. The same sequence can be reattempted due to mutation not occurring on any of the phases in the sequence (e.g. sequence i remaining the same in Figure 6). Likewise, a crossover operation or mutation changing some individual phases can produce a previously attempted sequence (e.g. sequence k mutates to be the same as sequence j before mutation in Figure 6). A hash table of attempted sequences along with the performance result for each sequence is maintained. If a sequence is found to be previously attempted, then the evaluation of the sequence is not performed and the previous result is used. This technique of using a hash table to capture previously attempted solutions has been previously used to reduce search time [5, 15, 11].

Figure 6: Example of Redundant Attempted Sequences
(Diagram: sequences i, j, and k shown before and after mutation.)

4.2 Finding Redundant Active Sequences

A transformation is a sequence of changes to the program representation, where the semantic behavior is preserved. A phase is a sequence of transformations caused by a single type of optimization. Borrowing from biological terminology, an active optimization phase (gene) is one that applies transformations, while a dormant optimization phase (gene) is one that has no effect.
An optimization phase is dormant when the enabling conditions for the optimization to be applied are not satisfied. As one would expect, only a subset of the attempted phases in a sequence will typically be active. It is common that a dormant phase may be mutated to another dormant phase, but it would not affect the compilation. Figure 7 illustrates how different attempted sequences can map to the same active sequence, where the bold boxes represent active phases and the nonbold boxes represent dormant phases. A second hash table is used to record sequences where only the active phases are represented.

Figure 7: Example of a Redundant Active Sequence
(Diagram: the attempted sequences i and j differ, but their active sequences are the same.)

4.3 Detecting Identical Code

Sometimes identical code can be generated from different active sequences. Often different optimization phases can be applied and can have the same effect. Consider the two different ways that the pair of instructions in Figure 8 can be merged together. Instruction selection symbolically merges the instructions and checks to see if the resulting instruction is legal. The same effect in this case can be produced by constant propagation followed by dead assignment elimination. We also found that performing some optimization phases in a different order will have no effect on the final code that is generated. For instance, consider applying branch chaining before and after register allocation. Both branch chaining and register allocation will neither inhibit nor enable the other phase.

Figure 8: Different Optimizations Having the Same Effect

    Instruction selection:
        original code segment               r[2]=1; r[3]=r[4]+r[2];
        after instruction selection         r[3]=r[4]+1;

    Constant propagation followed by dead assignment elimination:
        original code segment               r[2]=1; r[3]=r[4]+r[2];
        after constant propagation          r[2]=1; r[3]=r[4]+1;
        after dead assignment elimination   r[3]=r[4]+1;

VISTA has to efficiently detect when different active sequences generate identical code to be able to reduce the search overhead.
A search may result in thousands of unique function instances, which may be too large to store in memory and very expensive to access on disk. The key realization in addressing this issue was that while we need to detect when function instances are identical, we can tolerate occasionally treating different instances as being identical since the sequences within a population are sorted and the best sequence found by the genetic algorithm must be completely evaluated. Thus, we calculate a CRC (cyclic redundancy code) checksum on the bytes of the RTLs and keep a hash table of these checksums. CRCs are commonly used to check the validity of data transmitted over a network and have an advantage over conventional checksums in that the order of the bytes of data does affect the result [14]. If the checksum has been generated for a previous function instance, then we use the performance results of that instance. We have verified it is rare that we generate the same checksum for different function instances and that the best fitness value found is never affected in our experiments.
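
The lookups of Sections 4.1 to 4.3 can be pictured as a cascade of hash tables that is consulted from cheapest to most expensive. The following is a minimal Python sketch under stated assumptions, not VISTA's implementation: `SearchCache`, `toy_apply_phases`, and `toy_run` are hypothetical names, the toy compiler only knows the three rewrites of Figure 8, and `zlib.crc32` stands in for whichever CRC the authors used.

```python
import zlib


class SearchCache:
    """Hash tables consulted in order of increasing cost (illustrative only)."""

    def __init__(self, apply_phases, run_and_measure):
        self.apply_phases = apply_phases        # hypothetical: seq -> (active phases, code bytes)
        self.run_and_measure = run_and_measure  # hypothetical: code bytes -> fitness
        self.by_attempted = {}  # Section 4.1: attempted sequence -> result
        self.by_active = {}     # Section 4.2: active sequence -> result
        self.by_checksum = {}   # Section 4.3: CRC of generated code -> result
        self.executions = 0

    def evaluate(self, seq):
        seq = tuple(seq)
        if seq in self.by_attempted:            # no compilation needed at all
            return self.by_attempted[seq]
        active, code = self.apply_phases(seq)   # compile, but do not execute yet
        active = tuple(active)
        crc = zlib.crc32(code)                  # byte order affects the CRC
        if active in self.by_active:
            result = self.by_active[active]
        elif crc in self.by_checksum:
            result = self.by_checksum[crc]
        else:
            result = self.run_and_measure(code)  # the expensive step
            self.executions += 1
        self.by_attempted[seq] = result
        self.by_active[active] = result
        self.by_checksum[crc] = result
        return result


# Toy stand-in for the compiler, modeled on Figure 8:
# 's' = instruction selection, 'c' = constant propagation,
# 'd' = dead assignment elimination; any other phase is dormant.
ORIGINAL = "r[2]=1;r[3]=r[4]+r[2];"
PROPAGATED = "r[2]=1;r[3]=r[4]+1;"
MERGED = "r[3]=r[4]+1;"
RULES = {("s", ORIGINAL): MERGED, ("c", ORIGINAL): PROPAGATED, ("d", PROPAGATED): MERGED}


def toy_apply_phases(seq):
    code, active = ORIGINAL, []
    for phase in seq:
        new_code = RULES.get((phase, code))
        if new_code is not None:   # the phase changed the code, so it was active
            code = new_code
            active.append(phase)
    return active, code.encode()


def toy_run(code):
    return len(code)  # stand-in fitness: code size in bytes


cache = SearchCache(toy_apply_phases, toy_run)
results = [cache.evaluate(s) for s in ("sb", "sb", "bs", "cd")]
```

In this toy run, "sb" is executed once; the repeat of "sb" hits the attempted-sequence table, "bs" hits the active-sequence table, and "cd" hits the checksum table because it produces the same bytes. A checksum collision between different function instances would reuse a wrong result, which the text above argues is tolerable because the best sequence is always fully evaluated.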

4.4 Detecting Equivalent Code

Sometimes the code generated by different optimization sequences is equivalent, in regard to speed and size, but not identical. Consider two function instances that have the same sequence of instruction types, but use different registers. This can occur since different optimization phases compete for registers. For instance, consider the source code in Figure 9(a). Figures 9(b) and 9(c) show two possible translations given two different orderings of optimization phases that consume registers. To detect this situation, we identify the live ranges of all of the registers in the function and map each live range to a distinct pseudo register. Equivalent function instances become identical after mapping, which is illustrated for the example in Figure 9(d). We compute the CRC checksum for the mapped function instance and check in a separate hash table of CRC checksums to see if the mapped function had been previously generated.

Figure 9: Different Functions with Equivalent Code

    (a) Source Code
        sum = 0;
        for (i = 0; i < 1000; i++)
            sum += a[i];

    (b) Register Allocation before Code Motion
        r[10]=0;
        r[12]=hi[a];
        r[12]=r[12]+lo[a];
        r[1]=r[12];
        r[9]=4000+r[12];
    L3
        r[8]=m[r[1]];
        r[10]=r[10]+r[8];
        r[1]=r[1]+4;
        IC=r[1]?r[9];
        PC=IC<0,L3;

    (c) Code Motion before Register Allocation
        r[11]=0;
        r[10]=hi[a];
        r[10]=r[10]+lo[a];
        r[1]=r[10];
        r[9]=4000+r[10];
    L3
        r[8]=m[r[1]];
        r[11]=r[11]+r[8];
        r[1]=r[1]+4;
        IC=r[1]?r[9];
        PC=IC<0,L3;

    (d) After Mapping Registers
        r[32]=0;
        r[33]=hi[a];
        r[33]=r[33]+lo[a];
        r[34]=r[33];
        r[35]=4000+r[33];
    L3
        r[36]=m[r[34]];
        r[32]=r[32]+r[36];
        r[34]=r[34]+4;
        IC=r[34]?r[35];
        PC=IC<0,L3;

On most machines there is a uniform access time for each register in the register file. Likewise, most statically scheduled processors do not generate stalls due to anti (write after read) and output (write after write) dependences. However, these dependences could inhibit future optimizations. Thus, comparing register mapped functions to avoid executions in the search should only be performed after all remaining optimizations (e.g.
filling delay slots) have been applied. Given that these assumptions are true, if we find that the current mapped function is equivalent to a previous mapped instance of the function, then we can assume the two are equivalent and will produce the same result after execution.

5. PRODUCING SIMILAR RESULTS IN FEWER GENERATIONS

Another approach that can be used to reduce the search time for finding effective optimization sequences is to produce the same results in fewer generations of the genetic algorithm. If this approach is feasible, then users can either specify fewer generations to be performed in their searches or they can stop the search sooner once the desired results have been achieved. The following subsections describe the different techniques that we use to obtain effective sequences of optimization phases in fewer generations. All of these techniques identify phases that are likely to be active or dormant at a given point in the compilation process.

5.1 Using the Batch Sequence

The traditional or batch version of our compiler always attempts the same order of optimization phases for each function. We obtain the sequence of active phases (those phases that were able to apply one or more transformations) from the batch compilation of the function. We have used the length of the active batch sequence to establish the length of the sequences attempted by the genetic algorithm in previous experiments [11]. We propose to use the active batch sequence for the function as one of the sequences in the initial population. The premise is that if we initialize a sequence in the population with optimization phases that are likely to be active, then this may allow the genetic algorithm to converge faster on the best sequence it can find. This approach is similar to including in the initial population the compiler writer's manually specified priority function when attempting to tune a compiler heuristic [15].
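
Seeding the initial population with the active batch sequence might look like the sketch below. It is a rough illustration rather than the authors' code: `seeded_population` is a hypothetical helper, the population size of twenty and the 1.25 length factor are the values reported in the experiments section, and two details are pure assumptions because the paper does not state them: the length is rounded up, and the seeded chromosome is padded to full length with random candidate phases.

```python
import math
import random


def seeded_population(candidates, active_batch, pop_size=20, rng=None):
    """Build an initial population whose first member starts with the
    active batch sequence (hypothetical helper, see assumptions above)."""
    rng = rng or random.Random()
    # Chromosome length: 1.25 times the number of active batch phases.
    # Rounding up is an assumption.
    length = math.ceil(1.25 * len(active_batch))

    def random_seq(n):
        return [rng.choice(candidates) for _ in range(n)]

    # Assumption: pad the seeded chromosome with random phases.
    seeded = list(active_batch) + random_seq(length - len(active_batch))
    # The remaining members are randomly initialized, as in the baseline search.
    return [seeded] + [random_seq(length) for _ in range(pop_size - 1)]


population = seeded_population(list("abcdefgh"), list("bace"), rng=random.Random(0))
```

With four active batch phases the chromosomes have five genes, so the seeded member is "bace" followed by one random phase.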
5.2 Prohibiting Specific Phases

While many different optimization phases can be specified as candidate phases for the genetic algorithm, sometimes specific phases can never be active for a given function. If the genetic algorithm only attempts phases that have an opportunity to be active, then the algorithm may converge on the best sequence it can find in fewer attempts. There are several situations when specific optimizations should not be attempted. Loop optimization phases cannot be active for a function that does not contain any loops. Register allocation in VPO cannot be active for a function that does not contain any local variables or parameters. Branch optimizations and unreachable code elimination cannot be active for a function that contains a single basic block. Detecting that a specific set of optimization phases can never be active for a given function requires simple analysis that only needs to be performed once at the beginning of the genetic algorithm.

5.3 Prohibiting Prior Dormant Phases

When compiling a function, we find certain optimization phases will be dormant given that a specific prefix of active phases has been performed. Given that the same prefix of phases is attempted again, there is no benefit from attempting the same dormant phase in the same situation since it will remain dormant. To avoid repeating these dormant phases, we represent the active phases as nodes in a tree, where each child corresponds to the next phase in an active sequence. We also store at each node the set of phases

that were found to be dormant for that prefix of active phases. Figure 10 shows an example tree where the bold portions represent active prefixes and the nonbold boxes represent dormant phases given that prefix. For instance, a and [...] are dormant phases for the prefix bac. To prohibit applying a prior dormant phase, we force a phase to change during mutation until we find a phase that has either been active with the specified prefix or has not yet been attempted.

Figure 10: A Tree Representing Active Prefixes

5.4 Prohibiting Unenabled Phases

Certain optimization phases when performed cannot become active again until enabled. For instance, register allocation replaces references to variables in live ranges with registers. A live range is assigned to a register when a register is available at that point in the coloring process. After the compiler applies register allocation, this optimization phase will not have an opportunity to be active again until the register pressure has changed. Unreachable code elimination and a variety of branch optimizations will not affect the register pressure and thus will not enable register allocation. Figure 11 illustrates that a specific phase, the nonbold box of the sequence on the right, will at times be unenabled and cannot be active. Again the premise is that if the genetic algorithm concentrates on the phases that have an opportunity to be active, then it will be able to apply more active phases in a sequence and converge to the best sequence it can find in fewer attempts. Note that determining which optimization phases can enable another phase requires careful consideration by the compiler writer.

Figure 11: Enabling Previously Applied Phases
(Diagram note: c enables a; b and [...] do not enable a.)

We implemented this technique by forcing a phase to mutate if the same phase has already been performed and there are no intervening phases that can enable it. We realize that a specific phase can become unenabled after an attempted phase is found to be active or dormant.
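
The tree of active prefixes from Section 5.3 can be sketched as a trie whose nodes carry a set of dormant phases. This is an illustrative Python sketch, not VISTA's data structure: `PrefixNode`, `ActivePrefixTree`, and `pick_phase` are hypothetical names, phases are single letters as in Figure 10, and the phase letter "d" in the example is made up for illustration.

```python
import random


class PrefixNode:
    def __init__(self):
        self.children = {}    # active phase -> PrefixNode for the longer prefix
        self.dormant = set()  # phases found dormant after this active prefix


class ActivePrefixTree:
    def __init__(self):
        self.root = PrefixNode()

    def record(self, attempted, was_active):
        """Record one compiled sequence; attempted[i] was active iff was_active[i]."""
        node = self.root
        for phase, active in zip(attempted, was_active):
            if active:
                node = node.children.setdefault(phase, PrefixNode())
            else:
                node.dormant.add(phase)

    def known_dormant(self, active_prefix):
        """Phases already known to be dormant after this prefix of active phases."""
        node = self.root
        for phase in active_prefix:
            node = node.children.get(phase)
            if node is None:      # prefix never seen, so nothing is known
                return set()
        return set(node.dormant)


def pick_phase(tree, active_prefix, candidates, rng):
    """Choose a phase for mutation, skipping phases known to be dormant here."""
    dormant = tree.known_dormant(active_prefix)
    allowed = [p for p in candidates if p not in dormant]
    return rng.choice(allowed or list(candidates))


tree = ActivePrefixTree()
tree.record("bacad", [True, True, True, False, False])  # a and d dormant after bac
tree.record("bab", [True, True, False])                 # b dormant after ba
choice = pick_phase(tree, "bac", "abcd", random.Random(0))
```

Here `tree.known_dormant("bac")` is `{"a", "d"}`, so the mutation can only pick b or c after the prefix bac. A phase that was active with the prefix, or that has never been attempted with it, is never in the dormant set, which matches the mutation rule described above.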
We first follow the tree of active prefixes, which was described in the previous subsection, to determine which phases are currently enabled. For example, consider again Figure 10. Assume that b can be enabled by a, but cannot be enabled by c. Given the prefix bac, we know that b cannot be active at this point since b was dormant after the prefix ba and c cannot reenable it. After reaching a leaf of the tree we track which phases cannot be enabled by just examining the subsequently attempted phases.

6. EXPERIMENTS

This section describes the results of a set of experiments to illustrate the effectiveness of the previously described techniques for obtaining fast searches for effective optimization phase sequences. We first perform experiments on an Ultra SPARC III processor so that the results could be obtained in a reasonable time. After ensuring ourselves that the techniques were sound, we use these techniques when obtaining results for the Intel StrongARM SA-110 processor, which has a clock rate that is more than 5 times slower than the Ultra SPARC III. We used a subset of the mibench benchmarks, which are C applications targeting specific areas of the embedded market [8]. We used one benchmark from each of the six categories of applications. When executing each of the benchmarks, we used the sample input data that was provided with the benchmark. Table 1 contains descriptions of these programs.

Table 1: MiBench Benchmarks Used in the Experiments

    Category          Program        Description
    auto/industrial   bitcount       test bit manipulation abilities
    network           dijkstra       calculates shortest path between nodes using Dijkstra's algorithm
    telecomm          fft            performs fast fourier transform
    consumer          jpeg           image compression & decompression
    security          sha            secure hash algorithm
    office            stringsearch   searches for words in phrases

Table 2 shows each of the candidate code-improving phases that we used in the experiments when compiling each function. In addition, register assignment, which is a compulsory phase that assigns pseudo registers to hardware registers, has to be performed.
VISTA implicitly performs register assignment before the first code-improving phase in a sequence that requires it. After applying the last code-improving phase in a sequence, we perform another compulsory phase which inserts instructions at the entry and exit of the function to manage the activation record on the run-time stack. Finally, we also perform additional code-improving phases afterwards, such as filling delay slots.

Our genetic algorithm search for obtaining the baseline measurements was accomplished in the following manner. Unlike past studies using genetic algorithms to generate better code [13, 5, 15], we perform a search on each function (a total of 106 functions in our test suite), which requires longer compilations but results in better overall improvements [11]. In fact, most of the techniques we are evaluating would be much less effective if we searched for a single sequence to be applied on an entire application. We set the sequence (chromosome) length to be 1.25 times the number of active phases that were applied for the function by the batch compiler. We felt this length was a reasonable limit and gives us an opportunity to apply more active phases than what the batch compiler could accomplish, which is much less than the number of phases attempted during the batch compilation. The sequence lengths used in these experiments varied between 4 and 48 with an average of [...]. We set the population size (fixed number of sequences or chromosomes) to twenty and each of these initial sequences is randomly initialized with candidate optimization phases. We performed 100 generations when searching for the best sequence for each function. We sort the sequences in

the population by a fitness value calculated using 50% weight on speed and 50% weight on code size.

Table 2: Candidate Optimization Phases in the Genetic Algorithm Experiments

    Optimization Phase
        Description

    branch chaining
        Replaces a branch or jump target with the target of the last jump in a jump chain.
    common subexpression elimination
        Eliminates fully redundant calculations, which also includes constant and copy propagation.
    remove unreachable code
        Removes basic blocks that cannot be reached from the entry block of the function.
    remove useless blocks
        Removes empty blocks from the control-flow graph.
    dead assignment elimination
        Removes assignments when the assigned value is never used.
    block reordering
        Removes a jump by reordering basic blocks when the target of the jump has only a single predecessor.
    minimize loop jumps
        Removes a jump associated with a loop by duplicating a portion of the loop.
    register allocation
        Replaces references to a variable within a specific live range with a register.
    loop transformations
        Performs loop-invariant code motion, recurrence elimination, loop strength reduction, and induction variable elimination on each loop ordered by loop nesting level. Each of these transformations can also be individually selected by the user.
    merge basic blocks
        Merges two consecutive basic blocks a and b when a is only followed by b and b is only preceded by a.
    evaluation order determination
        Reorders RTLs in an attempt to use fewer registers.
    strength reduction
        Replaces an expensive instruction with one or more cheaper ones.
    reverse jumps
        Eliminates an unconditional jump by reversing a conditional branch when it branches over the jump.
    instruction selection
        Combines instructions together and performs constant folding when the combined effect is a legal instruction.
    remove useless jumps
        Removes jumps and branches whose target is the following block.
The speed factor we used was the number of instructions executed since this was a measure that could be consistently obtained, it has been used in similar studies [5, 11], and allowed us to obtain baseline measurements within a reasonable period of time. We could obtain a more accurate measure of speed by using a cycle-accurate simulator. However, the main point of our experiments was to evaluate the effectiveness of techniques for obtaining faster searches, which can be applied with any type of fitness evaluation criteria.

At each generation (time step) we remove the worst sequence and three others from the lower (poorer performing) half of the population chosen at random. Each of the removed sequences is replaced by randomly selecting a pair of the remaining sequences from the upper half of the population and performing a crossover (mating) operation to create a pair of new sequences. The crossover operation combines the lower half of one sequence with the upper half of the other sequence and vice versa to create two new sequences. Fifteen sequences are then changed (mutated) by considering each optimization phase (gene) in the sequence. Mutation of each phase in a sequence occurs with a probability of 10% and 5% for the lower and upper halves of the population, respectively. When an optimization phase is mutated, it is randomly replaced with another phase. The four sequences subjected to crossover and the best performing sequence are not mutated. Finally, if we find identical sequences in the same population, then we replace the redundant sequences with ones that are randomly generated.

Figures 12, 13, and 14 show the percentage improvement that we obtained for the SPARC when optimizing for speed only, size only, and 50% for each factor, respectively. Performance results for the ARM, a widely used embedded processor, are presented later in this section. The baseline measures were obtained using the batch VPO compiler, which iteratively applies optimization phases until no more improvements can be obtained.
This baseline is much more aggressive than always using a fixed-length sequence of phases [11]. The average benefits shown in the figures are slightly improved from previously published results [11] since the searches now include additional optimization phases that were not previously exploited by the genetic algorithm. Note that the contribution of our paper is that the search for these benefits is more efficient, rather than the actual benefits obtained.

Figure 12: Speed Only Improvements for the SPARC

Figure 13: Size Only Improvements for the SPARC
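The generation step of the genetic algorithm described earlier in this section (removal, crossover, mutation, and replacement of duplicates) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the VISTA implementation: the population size of 20 is inferred from the counts in the text (15 mutated sequences, 4 created by crossover, and 1 best sequence), fitness is assumed to be a cost where lower is better, and `PHASES`, `crossover`, `mutate`, and `next_generation` are hypothetical names.

```python
import random

PHASES = "bcdeghijklnoqrsu"  # hypothetical one-letter codes for candidate phases


def crossover(a, b):
    """Swap the halves of two sequences to create two children."""
    mid = len(a) // 2
    return a[:mid] + b[mid:], b[:mid] + a[mid:]


def mutate(seq, rate, rng):
    """Replace each phase by a different random phase with probability `rate`."""
    out = []
    for phase in seq:
        if rng.random() < rate:
            phase = rng.choice([p for p in PHASES if p != phase])
        out.append(phase)
    return "".join(out)


def next_generation(population, fitness, rng):
    """One generation; `fitness` maps a sequence to a cost (lower is better)."""
    ranked = sorted(population, key=fitness)          # best sequence first
    half = len(ranked) // 2
    upper, lower = ranked[:half], ranked[half:]
    # Remove the worst sequence and three random others from the lower half.
    doomed = {len(lower) - 1} | set(rng.sample(range(len(lower) - 1), 3))
    lower = [s for i, s in enumerate(lower) if i not in doomed]
    # Two crossovers between upper-half parents create four replacements.
    children = []
    for _ in range(2):
        children.extend(crossover(*rng.sample(upper, 2)))
    # Mutate everything except the best sequence and the new children:
    # 5% per phase in the upper half, 10% per phase in the lower half.
    mutated = [mutate(s, 0.05, rng) for s in upper[1:]]
    mutated += [mutate(s, 0.10, rng) for s in lower]
    # Replace duplicate sequences with randomly generated ones.
    seen, result = set(), []
    for s in [upper[0]] + mutated + children:
        while s in seen:
            s = "".join(rng.choice(PHASES) for _ in range(len(s)))
        seen.add(s)
        result.append(s)
    return result
```

With a population of 20, this step keeps the best sequence, mutates 9 upper-half and 6 lower-half sequences, and adds 4 children, which matches the counts given in the text.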

Figure 14: Size and Speed Improvements for the SPARC

Figure 15 shows the average number of sequences whose executions were avoided for each benchmark using the methods described in Section 4. These results do not include the functions in the benchmarks that were not executed when using the sample input data, since these functions were evaluated on code size only and did not require execution of the application. Consider for now only the top bar for each benchmark, which represents the results without applying any of the techniques in Section 5. As mentioned previously, each method in Section 4 is able to find a superset of the sequences handled by the methods applied before it. On average 41.3% of the sequences were detected as redundantly attempted, 27.0% were caught as redundant active sequences, 14.9% were discovered to produce code identical to that generated by a previous sequence, and 1.0% were found to produce unique, but equivalent code. Thus, over 84% of the executions were avoided. We found that we could avoid a higher percentage of the executions when tuning smaller functions since we used shorter sequence lengths, which were established by the batch compilation due to fewer optimization phases being active. A shorter sequence length results in more redundant sequences. For instance, the likelihood of mutation is less when there are fewer phases in a sequence to mutate. Also, identical or equivalent code is more likely when fewer phases could be applied.

Figure 15: Number of Avoided Executions

Figure 16 shows the search time required when applying the methods described in Section 4 relative to not applying these methods. On average, the search required 0.35 of the time needed when no executions were avoided and 0.51 of the time needed when redundantly attempted sequences were avoided. The average time required to evaluate each of the six benchmarks improved from 5.57 hours to 2.27 hours. The reduction appears to be affected not only by the percentage of the avoided executions, but also by the size of the functions. The larger functions tended to have fewer avoided executions and also had longer compilations. While the average search time was significantly reduced for these experiments using direct execution on a SPARC processor, the savings would only increase when using simulation since the executions of the application would comprise a larger portion of the search time.

Figure 16: Relative Total Search Time

Figures 17-21 show the average number of generations that were evaluated for each of the functions before finding the best fitness value in the search. The baseline result is without using any of the techniques described in Section 5. The other results indicate the generation when the first sequence was found whose performance equaled the best sequence found in the baseline search. To ensure a fair comparison, we did not include the results for the functions when the best fitness value found was not identical to the best fitness value in the baseline, which occurred on about 18% of the functions. This caused the baseline results to vary slightly since the functions with different fitness values were not always the same when applying each of the techniques. About 11.3% of the functions had improved fitness values and about 6.6% of the functions had worse fitness values when all of the techniques were applied. On average the best fitness values improved by 0.24% (by 1.33% for only the differing functions). The maximum number of generations before finding the best fitness value for any function was 91 out of a possible 100 when not applying any of the four techniques. The maximum was 56 when all four techniques were used. The techniques occasionally caused the best fitness value to be found later, which we believe is due to the inherent randomness of using a genetic algorithm. However, all of the techniques were beneficial on average.

Figure 17 shows the effect of using the batch sequence in the initial population, which in general was quite beneficial.
We found that this technique worked well for the smaller functions in the applications since it was often the case that the batch compiler produced code that was as good as the code generated by the best sequence found in the search. However, the smaller functions tended to converge on the best sequence in the search in fewer generations anyway since the sequence lengths were typically shorter. In fact, it is likely that performing a search for an effective optimization sequence is in general less beneficial for smaller functions since there is less interplay between phases. Using the batch sequence for the larger functions often resulted in finding the best sequence in fewer generations even though the batch compiler typically did not produce code that was as good as that produced by the best sequence found in the baseline results. Thus, simply initializing the population with one sequence containing phases that are likely to be active is quite beneficial.

Figure 17: Number of Generations before Finding the Best Fitness Value When Using the Batch Sequence

The effect of prohibiting specific phases throughout the search was less beneficial, as shown in Figure 18. Specific phases can only be safely prohibited when the function is relatively simple and a specific condition (such as no loops, no variables, or no unconditional jumps) can be detected. Several applications, such as stringsearch, had no or very few functions that met these criteria. The simpler functions also tended to converge faster to the best sequence found in the search since the sequence length established by the length of the batch compilation was typically shorter. Likewise, the simpler functions also have little impact on the size of the entire application and have little impact on speed when they are not frequently executed.

Figure 18: Number of Generations before Finding the Best Fitness Value When Prohibiting Specific Phases

In contrast, prohibiting prior dormant and unenabled phases, which are depicted in Figures 19 and 20, had a more significant impact since these techniques could be applied to all functions. Without using these two techniques, it was often the case that many phases were reattempted when there was no opportunity for them to be active.

Figure 19: Number of Generations before Finding the Best Fitness Value When Prohibiting Prior Dormant Phases

Figure 20: Number of Generations before Finding the Best Fitness Value When Prohibiting Unenabled Phases

Applying all the techniques produced the best overall results, as shown in Figure 21. In fact, only about 32% of the generations on average (from to 8.24) were required to find the best sequence in the search as compared to the baseline. As expected, applying all of the techniques did not result in the sum of the benefits of the individual techniques since some of the phases that were prohibited would be caught by multiple techniques.
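Two of the techniques evaluated above, seeding the initial population with the batch sequence and prohibiting phases that cannot be active, can be sketched together in Python. This is a hypothetical illustration rather than the authors' code: `initial_population` and `mutate_allowed` are invented names, the phase codes are placeholders, every sequence is assumed to have the same length as the batch sequence, and the set of prohibited phases is assumed to come from an earlier analysis of the function.

```python
import random

PHASES = "bcdeghijklnoqrsu"  # placeholder one-letter codes for candidate phases


def initial_population(batch_seq, size, rng, prohibited=frozenset()):
    """Seed the population with the batch sequence; fill the rest randomly
    using only phases that are not prohibited for this function."""
    allowed = [p for p in PHASES if p not in prohibited]
    population = [batch_seq]
    while len(population) < size:
        population.append(
            "".join(rng.choice(allowed) for _ in range(len(batch_seq))))
    return population


def mutate_allowed(seq, rate, rng, prohibited=frozenset()):
    """Mutate each phase with probability `rate`, never choosing a
    prohibited phase as the replacement."""
    allowed = [p for p in PHASES if p not in prohibited]
    return "".join(
        rng.choice(allowed) if rng.random() < rate else phase
        for phase in seq)
```

Restricting the replacement pool in this way means that mutation spends no attempts on phases that were already judged unable to be active, which is the intuition behind the faster convergence reported above.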
Figure 21: Number of Generations before Finding the Best Fitness Value When Applying All Techniques

Consider again Figure 15, which depicts the number of avoided executions. The bottom bar for each benchmark shows the number of executions that are avoided when all of the techniques described in Section 5 are applied. One can see that while the number of redundantly attempted sequences decreased, the number of sequences caught by the three other techniques increased. The remaining redundantly attempted sequences were the sequences created by the crossover operation and the best sequence in the population, which were not subject to mutation, and the redundant sequences with only active phases. The average number of avoided executions decreases by about 10%, which means a greater number of functions with unique code were generated. However, the decrease in avoided executions is much less than the average decrease in generations required to reach the best sequence found in the search, as shown in Figure 21.

Figure 22 shows the relative time for finding the best fitness value when all of the techniques in Section 5 were applied. The actual times are shown in minutes since finding the best sequence is accomplished in a fraction of the total generations performed in the search. Note that the baseline for finding the best fitness value includes all of the methods described in Section 4 to avoid unnecessary executions. The best fitness value was found in 53.0% of the time on average as compared to the baseline.

Figure 22: Relative Search Time before Finding the Best Fitness Value

After ensuring that the techniques we developed to improve the search time for effective sequences were sound, we obtained results on the Intel StrongARM SA-110 processor. Figures 23, 24, and 25 show the percentage improvement when optimizing for speed only, size only, and 50% for each factor, respectively. The average time required to obtain results for each of the benchmarks when optimizing for both speed and size on the ARM required hours. Using the average ratio shown in Figure 16, we estimate it would have taken over hours without applying the techniques in Section 4.

Figure 23: Speed Only Improvements for the ARM

7. IMPLEMENTATION ISSUES

During the process of this investigation, we encountered several implementation issues that made this work challenging. First, producing code that always generates the correct output for different optimization phase sequences is difficult. Even implementing a conventional compiler that always generates code that produces correct output when applying one predefined sequence of optimization phases is not an easy task. In contrast, generating code that always correctly executes for thousands of different optimization phase sequences is a severe stress test.
Ensuring that all sequences in the experiments produced valid code required tracking down many errors that had not yet been discovered in the VISTA system.

Second, the techniques presented in Sections 5.2 and 5.4 required analysis and judgement by the compiler writer to determine when optimization phases will be enabled. We inserted sanity checks when running experiments without using these methods to ensure that our assertions concerning the enabling of optimization phases were accurate. After inspecting the situations uncovered by these sanity checks, we found several cases where our reasoning was faulty, and we were able to correct our enabling assertions.

Third, we sometimes found that dormant optimization phases did have unexpected side effects by changing the analysis information, which could enable or disable a subsequent optimization phase. These side effects can affect the results of the methods described in Sections 4.2, 5.3, and 5.4. We also inserted sanity checks to ensure that different dormant phases did not cause different effects on subsequent phases. We detected when these situations occurred, properly set the information about what analysis is required and invalidated by each optimization phase, and now rarely encounter these problems.

Figure 24: Size Only Improvements for the ARM

Figure 25: Size and Speed Improvements for the ARM

8. FUTURE WORK

There is much future research that can be accomplished on providing fast searches for effective optimization sequences. We have shown that detecting when a particular optimization phase will be dormant can result in fewer generations to converge on the best sequence in the search. We believe it is possible to estimate the likelihood that a particular optimization phase will be active, given the active phases that precede it, by empirically collecting this information. This information could be exploited by adjusting the mutation operation to more likely mutate to phases that have a better chance of being active, with the goal of converging to a better fitness value in fewer generations.

Another area of future work is to vary the characteristics of the search. It would be interesting to see the effect on a search as one changes aspects of the genetic algorithm, such as the sequence length, population size, number of generations, etc. We may find that certain search characteristics are better for one class of functions, while other characteristics are better for other functions. In addition, it would be interesting to perform searches involving more compiler optimizations and benchmarks.

Finally, the use of a cluster of processors can reduce the search time. Certainly different sequences within a population can be evaluated in parallel [15]. Likewise, functions within the same application can be evaluated independently. Even with the use of a cluster, the techniques we have presented in our paper would still be useful since they will further reduce the search time. In addition, not every developer has access to a cluster.

9. CONCLUSIONS

There are several contributions that we have presented in this paper. First, we have shown there are effective methods to reduce the search overhead for finding effective optimization phase sequences by avoiding expensive executions or simulations. Detecting when a phase was active or dormant by instrumenting the compiler was very useful since many sequences can be detected as redundant by memoizing the results of active phase sequences. We also discovered that the same code is often generated by different sequences. We demonstrated that using efficient mechanisms, such as a CRC checksum, to check for identical or equivalent functions can also significantly reduce the number of required executions of an application.
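The execution-avoidance checks summarized above can be viewed as successive cache lookups made before paying for an execution. The Python sketch below illustrates three of them (redundant attempted sequence, redundant active sequence, identical code) under stated assumptions; it is not the paper's implementation. `ExecutionAvoider`, `compile_fn`, and `run_fn` are hypothetical names, code is modelled as a list of instruction strings, and the fingerprint pairs the instruction count with the standard `zlib.crc32` checksum as one possible cheap identity check. Detecting equivalent but non-identical code is left out.

```python
import zlib


class ExecutionAvoider:
    """Try three cache lookups in order before executing the application."""

    def __init__(self, compile_fn, run_fn):
        self.compile_fn = compile_fn  # attempted phases -> (active phases, code)
        self.run_fn = run_fn          # code -> fitness (the expensive step)
        self.by_attempted = {}
        self.by_active = {}
        self.by_code = {}
        self.executions = 0

    @staticmethod
    def code_key(code):
        """Cheap fingerprint: instruction count plus CRC-32 of the text."""
        data = "\n".join(code).encode("utf-8")
        return len(code), zlib.crc32(data)

    def fitness(self, attempted):
        # 1. The same attempted sequence was evaluated before.
        if attempted in self.by_attempted:
            return self.by_attempted[attempted]
        active, code = self.compile_fn(attempted)
        # 2. The same active sequence was evaluated before.
        if active in self.by_active:
            value = self.by_active[active]
        else:
            # 3. A different active sequence produced identical code.
            key = self.code_key(code)
            if key not in self.by_code:
                self.executions += 1
                self.by_code[key] = self.run_fn(code)
            value = self.by_code[key]
            self.by_active[active] = value
        self.by_attempted[attempted] = value
        return value
```

Because two different instruction lists can in principle share a checksum, a production version would likely compare more than one property of the code before treating two function instances as identical.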
Second, we have shown that on average the number of generations required to find the best sequence can be reduced by over two thirds. One simple, but effective technique is to insert the active sequence of phases from the batch compilation as one of the sequences in the initial population. We also found that we could often use analysis and empirical data to determine when phases could not be active. These techniques result in faster convergence to more effective sequences, which can allow equally effective searches to be performed with fewer generations of the genetic algorithm.

An environment to tune the sequence of optimization phases for each function in an embedded application can be very beneficial. However, the overhead of performing searches for effective sequences using a genetic algorithm can be quite significant, and this problem is exacerbated when performance measurements for an application are obtained by simulation or on a slower embedded processor. Many developers are willing to wait for tasks to run overnight to improve a product, but are unwilling to wait longer. We have shown that the search overhead can be significantly reduced, perhaps to a tolerable level, by using methods to avoid redundant executions and techniques to converge to the best sequence it can find in fewer generations.

ACKNOWLEDGEMENTS

Clark Coleman and the anonymous reviewers provided helpful suggestions that improved the quality of the paper. This research was supported in part by National Science Foundation grants EIA, ACI, CCR, ACI, and CCR.

REFERENCES

[1] M. E. Benitez and J. W. Davidson, A Portable Global Optimizer and Linker, Proceedings of the SIGPLAN '88 Symposium on Programming Language Design and Implementation (June 1988).
[2] M. E. Benitez and J. W. Davidson, The Advantages of Machine-Dependent Global Optimization, Proceedings of the Conference on Programming Languages and Systems Architectures (March 1994).
[3] B. Calder, D. Grunwald, and D. Lindsay, Corpus-based Static Branch Prediction, Proceedings of the SIGPLAN '95 Conference on Programming Language Design and Implementation (June 1995).
[4] K. Chow and Y. Wu, Feedback-Directed Selection and Characterization of Compiler Optimizations, Workshop on Feedback-Directed Optimization (November 1999).
[5] K. Cooper, P. Schielke, and D. Subramanian, Optimizing for Reduced Code Space Using Genetic Algorithms, ACM SIGPLAN Workshop on Languages, Compilers, and Tools for Embedded Systems, pp. 1-9 (May 1999).
[6] K. Cooper, D. Subramanian, and L. Torczon, Adaptive Optimizing Compilers for the 21st Century, Journal of Supercomputing 23(1).
[7] T. Granlund and R. Kenner, Eliminating Branches using a Superoptimizer and the GNU C Compiler, Proceedings of the SIGPLAN '92 Conference on Programming Language Design and Implementation (June 1992).
[8] M. Guthaus, J. Ringenberg, D. Ernst, T. Austin, T. Mudge, and R. Brown, MiBench: A Free, Commercially Representative Embedded Benchmark Suite, IEEE Workshop on Workload Characterization (December 2001).
[9] J. Holland, Adaptation in Natural and Artificial Systems, Addison-Wesley (1989).
[10] T. Kisuki, P. Knijnenburg, and M. O'Boyle, Combined Selection of Tile Sizes and Unroll Factors Using Iterative Compilation, Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques (October 2000).
[11] P. Kulkarni, W. Zhao, H. Moon, K. Cho, D. Whalley, J. Davidson, M. Bailey, Y. Paek, and K. Gallivan, Finding Effective Optimization Phase Sequences, ACM SIGPLAN Conference on Languages, Compilers, and Tools for Embedded Systems (June 2003).
[12] H. Massalin, Superoptimizer - A Look at the Smallest Program, Proceedings of the 2nd International Conference on Architectural Support for Programming Languages and Operating Systems (October 1987).
[13] A. Nisbet, Genetic Algorithm Optimized Parallelization, Workshop on Profile and Feedback Directed Compilation (1998).


More information

filtering LETTER An Improved Neighbor Selection Algorithm in Collaborative Taek-Hun KIM a), Student Member and Sung-Bong YANG b), Nonmember

filtering LETTER An Improved Neighbor Selection Algorithm in Collaborative Taek-Hun KIM a), Student Member and Sung-Bong YANG b), Nonmember 107 IEICE TRANS INF & SYST, VOLE88 D, NO5 MAY 005 LETTER An Improve Neighbor Selection Algorithm in Collaborative Filtering Taek-Hun KIM a), Stuent Member an Sung-Bong YANG b), Nonmember SUMMARY Nowaays,

More information

EDOVE: Energy and Depth Variance-Based Opportunistic Void Avoidance Scheme for Underwater Acoustic Sensor Networks

EDOVE: Energy and Depth Variance-Based Opportunistic Void Avoidance Scheme for Underwater Acoustic Sensor Networks sensors Article EDOVE: Energy an Depth Variance-Base Opportunistic Voi Avoiance Scheme for Unerwater Acoustic Sensor Networks Safar Hussain Bouk 1, *, Sye Hassan Ahme 2, Kyung-Joon Park 1 an Yongsoon Eun

More information

On Effectively Determining the Downlink-to-uplink Sub-frame Width Ratio for Mobile WiMAX Networks Using Spline Extrapolation

On Effectively Determining the Downlink-to-uplink Sub-frame Width Ratio for Mobile WiMAX Networks Using Spline Extrapolation On Effectively Determining the Downlink-to-uplink Sub-frame With Ratio for Mobile WiMAX Networks Using Spline Extrapolation Panagiotis Sarigianniis, Member, IEEE, Member Malamati Louta, Member, IEEE, Member

More information

Evolutionary Optimisation Methods for Template Based Image Registration

Evolutionary Optimisation Methods for Template Based Image Registration Evolutionary Optimisation Methos for Template Base Image Registration Lukasz A Machowski, Tshilizi Marwala School of Electrical an Information Engineering University of Witwatersran, Johannesburg, South

More information

Dual Arm Robot Research Report

Dual Arm Robot Research Report Dual Arm Robot Research Report Analytical Inverse Kinematics Solution for Moularize Dual-Arm Robot With offset at shouler an wrist Motivation an Abstract Generally, an inustrial manipulator such as PUMA

More information
