Portable Compiler Optimisation Across Embedded Programs and Microarchitectures using Machine Learning

Size: px
Start display at page:

Download "Portable Compiler Optimisation Across Embedded Programs and Microarchitectures using Machine Learning"

Transcription

1 Portabe Compier Optimisation Across Embedded Programs and Microarchitectures using Machine Learning Christophe Dubach, Timothy M. Jones, Edwin V. Bonia Members of HiPEAC Schoo of Informatics University of Edinburgh, UK Grigori Fursin Member of HiPEAC INRIA Sacay, France Michae F.P. O Boye Member of HiPEAC Schoo of Informatics University of Edinburgh, UK ABSTRACT Buiding an optimising compier is a difficut and time consuming task which must be repeated for each generation of a microprocessor. As the underying microarchitecture changes from one generation to the next, the compier must be retuned to optimise specificay for that new system. It may take severa reeases of the compier to effectivey expoit a processor s performance potentia, by which time a new generation has appeared and the process starts again. We address this chaenge by deveoping a portabe optimising compier. Our approach empoys machine earning to automaticay earn the best optimisations to appy for any new program on a new microarchitectura configuration. It achieves this by earning a mode off-ine which maps a microarchitecture description pus the hardware counters from a singe run of the program to the best compier optimisation passes. Our compier gains 67% of the maximum speedup obtainabe by an iterative compier using 000 evauations. We obtain, on average, a.6x speedup over the highest defaut optimisation eve across an entire microarchitecture configuration space, achieving a 4.3x speedup in the best case. We demonstrate the robustness of this technique by appying it to an extended microarchitectura space where we achieve comparabe performance. Categories and Subject Descriptors D.3.4 [Programming anguages]: Processors Compiers; Optimization; Retargetabe compiers; C.0 [Computer Systems Organization]: Genera Hardware/software interfaces; C.4 [Computer Systems Organization]: Performance of systems Design studies; Modeing techniques; I.2.6 [Artificia inteigence]: Learning. Genera Terms Design, Experimentation, Performance. Permission to make digita or hard copies of a or part of this work for persona or cassroom use is granted without fee provided that copies are not made or distributed for profit or commercia advantage and that copies bear this notice and the fu citation on the first page. To copy otherwise, to repubish, to post on servers or to redistribute to ists, requires prior specific permission and/or a fee. MICRO 09, December 2 6, 2009, New York, NY, USA. Copyright 2009 ACM /09/2...$0.00. Keywords architecture/compier co-design, design-space exporation, machine earning.. INTRODUCTION Creating an optimising compier for a new microprocessor is a time consuming and aborious process. For each new microarchitecture generation the compier has to be retuned and speciaised to the particuar characteristics of the new machine. Severa reeases of a compier might be needed to effectivey expoit the processor s performance potentia, by which time the next microarchitecture generation has been deveoped and the process starts again. This never-ending game of catch-up means that we rarey expoit a shipped processor to the fu and this inevitaby deays the time to market. Athough this is a genera issue for a processor domains, it is particuary acute for embedded systems. Ideay, we woud ike a portabe compier technoogy that provides retargetabe optimisation whie fuy expoiting the characteristics of the new microarchitecture. In other words, given any new processor generation, deiver a compier that automaticay optimises for that new target and achieves high performance. Buiding such a compier is, however, extremey chaenging. This is primariy due to the compexity of the underying machine s behaviour and the varying structure of the programs being compied. Iterative compiation, which tunes each new program on a specific architecture [6, 6, 24, 30], has provided a methodoogy to find good optimisations.techniques such as genetic agorithms [24], hi cimbing [2] or optimisation orchestration [30] have been expored, a showing impressive performance improvements. Athough usefu, these approaches a suffer from the arge number of compiations and executions required to optimise each program. Every time the program or architecture changes, this time-consuming process must be repeated. In order to overcome these chaenges, reers have deveoped compiers that earn optimisation strategies using prior knowedge of other programs behaviour. Stephenson et a. [34] showed that genetic programming can earn good individua compier optimisations on a fixed architecture, eiminating the need for any iterative compiations of the new program. Cavazos et a. [3] showed that this coud be used to earning the best set of compier options on a fixed architecture. Athough these approaches dramaticay reduce or even eiminate the need for extra compiations and executions of the target program, they suffer from the need to entirey retrain the compier whenever the patform changes. In this paper we deveop, to the best of our knowedge, the first compier that can automaticay adapt to underying microarchitectura changes. This enabes portabe performance across different

2 generations of a microprocessor. This represents the first step towards the deveopment of a universa compier that can automaticay optimise appications for any patform without requiring extensive tuning. Given a new microarchitecture, our approach automaticay determines the right optimisation passes for any new program. Our scheme earns a machine earning mode off-ine which maps a microarchitecture description pus the hardware counters from a singe run of the program to the best compier optimisation passes. The earning process is a one-off activity whose cost is amortised across a future users of the compier on subsequent variations of the processor s microarchitecture. Using this approach we can, on average, achieve a.6x speedup over the highest defaut compier optimisation across 200 microarchitectura configurations. In addition, we show that this approach achieves 67% of the maximum performance improvement gained by standard iterative compiation using 000 evauations. Given our approach, a new compier does not need to be tuned whenever the processor microarchitecture changes or a new program needs compiing. This aows compiers to become fuy integrated into the design space exporation of a new processor generation, heping designers to fuy evauate the potentia of any new microarchitecture. Overtime, designers may wish to add new microarchitectura features not originay envisaged. We show that our approach adapts to new microarchitectura configurations and is abe to deiver the same eve of performance. In summary, this paper makes the foowing contributions: We deveop a machine earning mode that can predict the best optimisation passes to use for any new program when compiing for a new microarchitecture configuration; We show how our scheme accuratey deivers the performance improvements avaiabe across the MiBench benchmark suite and an embedded microarchitectura design space; We demonstrate the robustness of our scheme showing that it deivers comparabe performance on an extended microarchitecture space. The next section provides a short exampe demonstrating the difficuty of achieving portabe optimisation. Section 3 provides a description of how our compier is trained and depoyed using machine earning. Section 4 describes the experimenta setup. Section 5 then evauates the technique and anayses the resuts. Section 7 evauates this approach on a new extended space. This is foowed by a description of reated work in section 8. Finay, section 9 concudes the paper. 2. EXAMPLE In this paper we imit our study to seecting the right passes within an existing compier framework for varying programs and microarchitectures. Athough this may seem ike a restricted setting, the best optimisation passes to appy vary significanty between programs and microarchitectura configurations. Finding the best set of optimisation passes across programs and microarchitectures is highy non-trivia. To iustrate this point, consider figure which shows segment diagrams for three programs (rijndae_s, un and madpay) on three microarchitectures from our design space (described in section 4). For these three programs and microarchitectures, we found the best optimisation passes to appy (described in section 4.3). These optimisations ead to significant speedups, ranging from.6x to 2.62x speedup over the highest defaut optimisation eve. In this exampe we show ony five significant optimisation passes: bock reordering, oop unroing, function inining, instruction scheduing and goba common sub-expression eimination, abeed Microarchitecture A B C Program rijndae_e un madpay XScae Xscae with sma insn cache Xscae with sma insn cache sma data cache Bock reordering Loop unroing Function inining Goba common subexpression eimination Instruction scheduing Figure : Segment diagrams representing the optimisation passes to enabe in order to achieve the highest performance for three programs executed on three microarchitectures. A fied segment means that the optimisation shoud be enabed, an empty one means it shoud be disabed. on the right of figure. For each program/microarchitecture pair there is a circe of five segments representing the five passes. If the segment is fied, then the corresponding optimisation shoud be enabed for the given program and microarchitecture. If empty, it shoud be disabed. What is immediatey cear is that the best set of optimisation passes to appy changes across programs and microarchitectures. If we consider madpay for instance, three optimisations shoud be enabed for microarchitecture A, a different set of three for B and four enabed for configuration C. If we now consider microarchitecture B, two optimisations shoud be turned on for rijndae_e, four when compiing un and ony three for madpay. Given the arge number of optimisations avaiabe in a typica compier, providing a portabe optimising compier for programs and microarchitectures is non-trivia. However, there are simiarities between programs and microarchitectures that we can expoit. The best set of optimisations for rijndae_e on configuration C and madpay on microarchitecture A are exacty the same. This is aso true for un on microarchitectures B and C and madpay on configuration C. If we can somehow characterise the program madpay on configuration A and reate it to the characteristics of rijndae_e on microarchitecture C, then we can appy the same optimisation passes to madpay as we did to rijndae_e. This wi aow us to obtain the best speedups on this new program/microarchitecture pair without having ever seen madpay or configuration A before. In the next section we deveop a machine earning mode that automaticay identifies these simiarities. We then use and evauate it in section 5 to provide portabe optimisation across programs and microarchitectures. 3. ENABLING PORTABLE OPTIMISATION As seen in the previous section, the best performance is achieved by appying different optimisations depending on the program and the underying microarchitecture. This means that with current approaches to tuning an optimising compier [, 3, 29, 34] a new compier needs to be deveoped for each new generation or variation of the microarchitecture. To overcome this probem, we deveop a machine earning mode that automaticay adapts the compier s optimisation strategy for any program and any microarchitecture variation. 3. Overview Figure 2 gives an overview of our compier s structure. The too works ike any other compier, taking as an input the source code of a program and producing an optimised binary. However, in addition to the source code our compier has two other inputs which

3 Tabe : Performance counters used as a representation of program/microarchitecture pairs. Performance Counter Instructions per cyce Insn cache access rate ALU usage Decoder access rate Insn cache miss rate Mac usage Register fie access rate Data cache access rate Shifter usage Branch pred. access rate Data cache miss rate Figure 2: Overview of our portabe optimising compier. The compier takes in a program source, some performance counters and a microarchitecture description and outputs an optimised program binary for the microarchitecture. At the heart of the compier is a machine earning mode that predicts the best passes to run, controing the optimisations appied. it uses internay to optimise the program specificay for the microarchitecture it wi run on. Firsty, our compier takes in a description of the microarchitecture to target. This is simiar to standard compiers where this description is hard-coded in a machine description fie; here it is just an input. Secondy, it takes in performance counters derived from a previous run of the program. This is simiar to feedback-directed compiers that typicay use profiing information from a previous run to generate an optimised version of the program. However, unike any existing technique, our compier generates an optimised binary specificay for the target microarchitecture even when it has never seen the program or the microarchitecture before. Therefore, the compier does not have to be modified or regenerated whenever a new program or microarchitecture is encountered. At the heart of our compier is a mode that correates the behaviour of the new input program and microarchitecture with programs and microarchitectures that it has previousy seen. Such a mode is buit using machine earning and our approach can be considered as a three stage process: generating training data, buiding a mode and depoying it. The next three subsections describe each of these activities in detai, aowing us to create the overa compier shown in figure Generating Training Data In order to buid a mode that predicts good optimisation passes, we need exampes of various optimisation passes on different programs and microarchitectures as we as a description of each program and microarchitecture. We generate this training data by evauating N different sets of optimisation passes, y, on a set of training program/microarchitecture pairs, X,..., X M, and recording their execution times, t. We can characterise a program/microarchitecture pair using a vector of features x...,x M. Therefore, for each program/microarchitecture pair X j we have an associated dataset D j = {(y i, t i ) j } N i=, with j =,... M. Our goa is to predict the best set of optimisation passes y whenever a new program/microarchitecture X is encountered. Athough the generated dataset may be arge, it is ony a oneoff cost incurred by our mode. Furthermore, techniques such as custering [3] are abe to reduce this and is the subject of future work. Features We characterise program interaction with the processor using performance counters, c, and with the microarchitectures using 8 descriptors, d. The performance counters are shown in tabe and are simiar to those typicay found in processor anaytic modes [2, 22] To capture the features of the microarchitecture we simpy record its static description shown in tabe 2. The performance counters from a program running on a microarchitecture, c, are concatenated together with the microarchitecture description, d, to form a singe feature vector for the program/microarchitecture pair, x = (c,d). 3.3 Buiding a Mode Our aim is to buid a mode, M(x,y), that provides the mapping from any set of program/microarchitecture features to a set of good optimisation passes: M : x y. We approach this probem by earning the mapping from the features, x, to a probabiity distribution over good optimisation passes, q(y x). Once this distribution has been earnt (see next section), prediction on a new program and on a new microarchitecture is achieved by samping at the mode of the distribution. Thus we obtain the predicted set of optimisations by computing: y = argmax q(y x ). () y In other words, we find the vaue of y that gives the greatest probabiity of being a good optimisation Fitting Individua Distributions In order to earn the mode we need to fit a probabiity distribution over good optimisation passes to each training program/microarchitecture. Let g(y X) be a parametric distribution specific to a program/microarchitecture pair X. Note that whereas g(y X) is specific to the identity of a program/microarchitecture pair, q(y x) aows generaisation across programs and microarchitectures by being conditioned on a set of features x. Let Y e be a set of good optimisation passes and p(y X) be the empirica distribution over these passes for program and microarchitecture pair X. We wish to fit the parametric distribution g(y X) for each program/microarchitecture pair to be as cose as possibe to the empirica distribution p(y X). To do this we can minimise the Kuback-Leiber (KL) divergence: 2 fi KL( p(y), g(y)) = og p(y) f = constant + H( p(y), g(y)), g(y) p(y) (2) where H( p(y), g(y)) is the cross-entropy of p(y) and g(y). Thus, we can maximise the objective function: L = H( p(y), g(y)) = X y ey p(y) og g(y). (3) In our experiments we have chosen the set of good optimisations e Y to be those combinations of passes that are within the top 5% of a training optimizations for the respective program/microarchitecture pair. We have then weighted these optimizations uniformy. 2 For the foowing derivation, to simpify the notation, we wi omit the conditiona dependency of these distributions on a specific program/microarchitecture pair X.

4 In principe, our mode s probabiity distribution g(y) can beong to any parametric famiy. However, we have seected a very simpe IID (independent and identicay distributed) mode, where the impact of each pass (y ) is considered to be independent of a others, i.e. g(y) = Q L = g(y ), where L is the number of avaiabe passes. If g(y ) is a mutinomia distribution we have that: where S = {s () S LY LY Y g(y) = g(y ) = = = j= (θ j )I[y =s (j) ], (4),..., s ( S ) } is the set of possibe vaues that the pass y can take (i.e. on, off or a parameter vaue); I[y = s (j) ] is an indicator function that is ony when the particuar optimisation pass y takes on the vaue s (j) y = s (j) and zero otherwise); and θ j (i.e. I[y = s (j) ] = when is the probabiity of the optimisation pass y taking on the particuar vaue s j (i.e. θj = p(y = s (j) )); and, by definition, we have that P j θj =. By using equation (4) in equation (3) and maximising the atter with respect to each parameter θ j subject to the constraints = we obtain: P j θj θ j = X y ey p(y)i[y = s (j) ]. (5) This resut is known as the maximum ikeihood estimator. Since we have used the uniform distribution as p(y), this means that the estimation of the parameters of the IID distribution θ j is the number of (seected) optimisation passes in which y = s (j) divided by the tota number of (seected) passes. Our assumption of statistica independence between compier optimisations may seem simpistic because it is we known that optimisations interact with each other. However, as we see in section 5, this IID mode generaises we across programs and microarchitectures. Our mode works on the assumption that athough compier optimisations do interact, these interactions are ess important across good sets of optimisations. Additionay, more compicated distributions, e.g. a Markov mode, coud be considered without modifying our approach Learning a Predictive Distribution Across Programs and Microarchitectures Once the individua training distributions for each program/microarchitecture pair g(y X) have been obtained, we can earn a predictive distribution q(y x). This wi predict the probabiity of each optimisation pass vaue being good from the features, x, of a program/microarchitecture pair (performance counters, c, and microarchitecture descriptors, d). One possibe way of earning this distribution is to use memorybased methods such as K-nearest neighbours. In other words, we can set the predictive distribution q(y x) to be a convex combination of the K distributions corresponding to the training programs and microarchitectures that are cosest in the feature space to the new (test) program and microarchitecture. The coefficients of the combination are obtained by using: w k = exp{ βd(x (k),x ( ) )} P K k= exp{ βd(x(k),x ( ) )}, (6) where β is a constant and d(, ) is our evauation function, i.e. the eucidean distance of each corresponding nearest training point to the test point, so that the distributions of the cosest training points are assigned arger weights. We have set β = and K = 7 different neighbour programs, athough we have found experimentay that the technique is not sensitive to simiar vaues of K. Tabe 2: Microarchitectura parameters and their vaues. Each parameter varies as a power of 2, meaning 288,000 tota configurations. Aso shown are the vaues for the XScae processor. Parameter Vaues XScae Parameter Vaues XScae IL size 4K... 28K 32K DL size 4K... 28K 32K IL assoc DL assoc IL bock DL bock BTB entries BTB assoc Depoyment Once the mode is buit, it can be used to predict the best optimisation passes for any new program on any new microarchitecture, as shown in figure 2. It does this using just one run of the new program compied with the defaut optimisation eve, O3, on the new microarchitecture. Thus, given a new program/microarchitecture pair, X, we extract its features by using the microarchitecture description, d, and the performance counters from a run of this program (compied with O3) on this microarchitecture, c, so that we form x = (c,d ). We then use equation () above to give the predicted-best optimisation passes, y, compie and execute the program with this new optimisation. 4. DESIGN SPACE The previous section has deveoped a machine earning mode that automaticay predicts the correct optimisation passes to appy for any new program on any microarchitectura configuration. This section describes our experimenta setup ater used to evauate this mode. It aso provides a brief characterisation of the compier and microarchitectura design spaces. 4. Benchmarks We chose to use MiBench [5], a common embedded benchmark suite, we suited to the microarchitecture space of this paper. It contains a mix of programs, from signa processing agorithms to fu office appications. We used a 35 programs, running each to competion using an input set requiring at east 00 miion executed instructions, wherever possibe. Therefore,,, djpeg, and were run with the arge input set, a others were run with the sma inputs. 4.2 Microarchitecture Space In this paper we consider a typica embedded microarchitectura design space based on the XScae processor shown in tabe 2. We show the microarchitecture parameters of the XScae that we varied aong with the vaues each parameter can take. To generate our design space we varied the cache and branch predictor configurations because they are important components of an embedded processor. Other significant parameters such as pipeine depth or votage scaing were not considered, but coud be easiy added to our experimenta setup. We varied the parameters over a wide range of vaues, beyond those in current systems, to fuy expore the design space of this processor s microarchitecture. In tota there are 288,000 different configurations some of which give increased performance over the origina processor (up to 9%), others give significanty reduced power consumption (up to 2%) which is equay significant for embedded processors. In our experiments we used a sampe space of 200 configurations seected with uniform random samping. Each configuration was impemented in the Xtrem simuator [5] which has been vaidated for performance against the XScae processor. We aso used Cacti [35] to accuratey mode the cache access atencies, ensuring our experiments were as reaistic as possibe.

5 Figure 3: Compier optimisations and their parameters. Each is a pass within gcc and can be varied independenty. In tota there are 642 miion combinations. Speedup gs djpeg out fft susan_s ispe ame madpay un rijndae_d rijndae_e Figure 4: Distribution of the maximum speedup avaiabe across a microarchitectures on a per-program basis. The x- axis represents the program and the y-axis the speedup reative to gcc s defaut optimisation eve O3. The centra ine denotes the median speedup. The box represents the 25 and 75 percentie area whie the outer whiskers denote the extreme points of the distribution. Appication to other spaces We have considered an embedded microarchitectura space and benchmark suite because this represents a rea-word chaenge for portabe compier optimisation, based around an existing processor configuration. However, the machine earning schemes we deveop in this paper are independent of this and can equay be appied to other, more compex spaces. In section 7 we extend the microarchitectura space by varying frequency and issue width and show that our approach adapts to the new space. 4.3 Compier Optimisation Space Finding the best optimisation for a specific program on a specific microarchitecture is intractabe. To do this, a equivaent programs, of which there are infinitey many, must be evauated and the best seected. Therefore, in this paper we imit ourseves to finding the best optimisation within a finite space of optimisations. The space we have considered consists of a combinations of the compier passes and their parameters shown in figure 3. These passes are appied within gcc 4.2, an industry standard for the XScae processor. They were found to have a performance impact on the XScae microarchitecture configuration and other reers have expored a simiar space [38], aowing independent comparisons with existing work. Turning passes on or off eads to a design space of 642 miion different optimisations. Varying the parameters controing some of the optimisations, (e.g. gcse has five further options) eads to a tota of unique optimisation passes. AVERAGE Ceary, it is not feasibe to exhaustivey enumerate this entire space to find the best optimisations for each program on each microarchitecture. However, iterative compiation can be used to quicky find an approximation of the best and has been shown to out-perform other approaches [30]. Hence, to find the best optimisations, we used iterative compiation which evauated a 000 different optimisations. These optimisations were seected using uniform random samping. In our experimenta setup we saw amost no additiona improvement after 000 evauations, showing that it is a usefu indicator of the upper bound on reaistic performance achievabe by a compier. 4.4 Characterising the Compier Space Before trying to buid a compier that optimises across microarchitectures, it is important to examine whether there is any performance to gain. For this purpose, we evauated the impact of the compier optimisations on the 35 MiBench programs compied with the 000 random optimisation passes, each of which was executed on the 200 different architectura configurations, as described earier. This corresponds to a sampe space of 7 miion simuations and shoud provide some evidence of the potentia benefits of optimisation passes seection across microarchitectures and programs. We then record the best performance achieved on a per program per architecture basis. Figure 4 shows the sampe space s distribution of maximum speedups for each program across the microarchitectura configurations when compiing with the best set of optimisations per program per microarchitecture. What is immediatey cear is that there is significant variation across the programs. For some the performance improvement is modest; seecting the best optimisations does not hep the ibrary-bound benchmarks or for instance. For rijnadae_e there is significant performance improvement to be gained, ranging from a.2x speedup to 4.8x in the best case,.8x being the average. In the case of the extremes are much ess but on average seecting the best optimisation gives a 2.2x speedup across a configurations. In programs such as, madpay and un, there are modest speedups to be gained on average but significant improvements avaiabe on certain microarchitectures (up to 2.4x for madpay as the top whisker shows). The right-most entry shows that there is an average speedup of.23x avaiabe across the design space if we were abe to seect the best optimisations per program per microarchitecture. The chaenge is to deveop a compier that can automaticay obtain this speedup without having seen the program or target microarchitectura configuration before. Furthermore, it shoud be abe to capture the high performance avaiabe on certain microarchitectures and avoid the potentia sowdowns found by picking the wrong optimisations. We found that choosing the wrong set of passes can ead to an average speedup of 0.7 across programs and 0.2 in the worst

6 Speedup Speedup Microarchitecture fft susan_s out gs djpeg ame ispe rijndae_e rijndae_d un madpay Program Microarchitecture fft susan_s out gs djpeg ame ispe rijndae_e rijndae_d un madpay Program (a) Best passes (b) Our compier Figure 5: Speedup over O3 for each program/microarchitecture pair. Figure (a) shows the best improvement possibe over the programs and microarchitectures. Figure (b) shows the performance of the passes predicted by our compier scheme. Our portabe optimising compier can accuratey predict the best passes to enabe across this space to achieve the speedups avaiabe across a programs and microarchitectures. case (i.e. 5 times sower). The next section evauates experimentay the performance of the machine-earning compier deveoped in section EXPERIMENTAL METHODOLOGY AND RESULTS We first describe our evauation methodoogy and then evauate our approach across programs and microarchitectures. This is foowed by an anaysis of the resuts. 5. Evauation Methodoogy This section describes how we perform our experiment and determine the best performance achievabe in our space. 5.. Cross-Vaidation To evauate the accuracy of our approach we use eave-one-out cross-vaidation. This means that we remove a program and a microarchitecture from our training set, buid a mode based on the remaining data and then predict the best optimisations for the removed program and microarchitecture. Foowing this, the program is compied and executed with the predicted optimisations on that microarchitecture and its performance recorded. We then repeat this for each program/microarchitecture. This is a standard evauation methodoogy for machine earning techniques and means that we never train using the program or microarchitecture for which we wi optimise. Hence, this is a fair evauation methodoogy Best Performance Achievabe In addition to comparison with O3, we want to evauate our approach by assessing how cose its performance is to the maximum achievabe. Athough it is intractabe to determine the best performance that can be achieved by any set of optimisation passes, as stated in section 4.3, we consider the performance of an iterative compier with 000 evauations as being an appropriate upper bound for a compier using a singe profie run. 5.2 Program/Microarchitecture Optimisation Space We first consider the performance of our compier compared to the maximum speedups avaiabe. Figure 5(a) shows the maximum speedups achievabe, when seecting the best optimisations, reative to the defaut optimisation eve, O3, across the program and microarchitectura spaces. The microarchitectures are ordered so that those with arge speedups avaiabe over O3 are on the eft. The benchmarks are ordered so that those with arge performance increases (such as ) are on the right (as in figure 4). In the back corner, the maximum speedup achievabe with the best compier passes is obtained by rijndae_e. This benchmark achieves a 4.85x speedup on a microarchitecture with a sma instruction cache size. The optimisations eading to this resut do not incude any oop optimisations (apart from moving oop-invariant code out of the oops). In particuar, no oop unroing is performed because there is aready extensive, optimised software oop unroing programmed into the source code. The performance of our compier across the programs and microarchitectures is shown in figure 5(b). As is immediatey apparent, it is amost identica to the performance achieved when using the best optimisations, figure 5(a). Our mode is highy accurate at predicting very good compier passes across the programs and microarchitecture space. The coefficient of correation between the performance of the predicted optimisations and the best ones is 0.93 when evauated across the joint program/microarchitecture space. For a program/microarchitecture pairs with arge performance avaiabe, our approach is abe to achieve significant speedups, as is shown by the peaks for programs ispe, madpay, rijndae_d and rijndae_e. These graphs ceary demonstrate that our mode is abe to capture the variation in speedups avaiabe across the program and microarchitecture spaces. The next two sections conduct further evauations of our mode. 5.3 Evauation Across Programs This section focuses on the performance of our compier on each program rather than examining the microarchitectura space. Figure 6 shows the performance of each program when optimised with our portabe optimising compier, reative to compiing with O3,

7 Our Mode Best.5.4 Best Our Mode Speedup Speedup gs djpeg out fft susan_s ispe ame madpay un rijndae_d rijndae_e AVERAGE Microarchitecture Figure 6: The performance of our approach and the best optimisations achieved by iterative compiation for each program, normaised to O3 and averaged over a microarchitectures. Figure 7: The performance of our approach and the best optimisations achieved by iterative compiation for each microarchitecture, normaised to O3 and averaged across programs. averaged across a microarchitecture configurations. The second bar, abeed Best, is the maximum speedup achievabe for each program. On average, our technique obtains a.6x performance improvement across a programs and microarchitectures with just one profie run, achieving a.94x speedup for on average. For three benchmarks in particuar (, rijndae_e and rijndae_d), our scheme achieves significant speedups, approaching the best performance avaiabe. Figure 6 shows that our mode is abe to correcty identify good optimisations, aowing these programs to expoit the arge performance gains when avaiabe. However, figure 6 aso shows that some programs experience minor sowdowns compared with O3. Considering for exampe, our approach achieves ony a 0.97x speedup. This can be expained if we refer back to figure 4 where we can see that there is negigibe performance improvement to be gained over O3 for this program, even when picking the best optimisations per microarchitecture. Unfortunatey, for this benchmark, the majority of optimisations are detrimenta to performance and being ess than 00% accurate in picking optimizations means that compiing with our scheme causes a sma amount of performance oss. Considering our technique compared to the maximum speedup achievabe, we approach Best in most cases. For some programs, such as, we obtain over 95% of the maximum performance. However, for we achieve ony 30%. The reason for this shortfa is due to a subtety in the source code of. The main oop within this benchmark updates a pointer on every iteration, resuting in a arge number of oads and stores. By performing function inining and aowing a arge growth factor (parameter max-inine-insns-auto), this pointer increment is reduced to a simpe register addition which in turn reduces the number of data cache accesses. The performance counters are not sufficienty informative to enabe our machine earning mode to capture this behaviour. This prevents our mode from seecting the best passes. However, the addition of extra features, in particuar code features [9], woud enabe us to pick this up and wi be considered in future work. By way of comparison, standard iterative compiation woud require approximatey 50 iterations on average to achieve simiar performance. For some programs more than 00 iterations woud even be needed to match our mode. This ceary shows the benefit of using machine-earning modes to buid a portabe optimising compier. 5.4 Evauation Across Microarchitectures We now turn our attention to the performance of our compier across the microarchitecture space rather than across programs. Figure 7 shows the performance of our compier compared the best performance avaiabe for each microarchitecture, abeed Best. The microarchitectura configurations are ordered in terms of increasing speedup avaiabe over O3 (i.e. the Best ine). Those on the eft have itte speedup avaiabe whereas those on the right can gain significanty. For our portabe optimising compier we see that the amount of improvement over O3 varies from.08x to.35x. This gives an average speedup of.6x across a programs and microarchitectures. It is important to see that our scheme cosey foows the trend of the Best optimisations, showing how our approach captures the variation between configurations, expoiting architectura features when performance improvements can be achieved. Looking at figure 7 in more detai we can see that it is divided into roughy three regions. On the eft, up to configuration 32, the first region has itte performance improvement avaiabe. A microarchitectures in this area have a sma data cache. Unfortunatey gcc has very few data access optimisations, meaning the avaiabe speedups are reativey sma. Foowing this is the second region where the Best optimisations gain an average.2x speedup and our scheme manages to capture a respectabe.6x. Finay in the third section, after configuration 85, the avaiabe performance improvement increases dramaticay. These microarchitectures on the right have a sma instruction cache, meaning that it is important to prevent code dupication wherever possibe. This is typica of embedded systems where code size is frequenty an important optimisation goa. The performance counter specifying the instruction cache miss rate enabes our mode to earn this from the training programs. In particuar, our compier earns that instruction scheduing (schedue-insns) and function inining (inine-functions) must be disabed to prevent code size increases. In the case of instruction scheduing, this increase is due to a subsequent register aocation pass which emits more spi code for certain schedues. Here we can see an effect of the compex reationships between passes within the compier. Nonetheess, our mode is abe to cope with these interactions and achieve the majority of the speedups avaiabe in this area. 5.5 Summary We have shown that our portabe optimising compier achieves an average.6x speedup over O3 across the entire microarchitecture space for the MiBench benchmark suite. This is equivaent to 67% of the speedup achieved by the Best optimisation passes and is roughy consistent across the architecture configuration space. In addition, our approach is abe to achieve higher eves of performance whenever they are avaiabe, accuratey foowing the trends in the optimisation space across programs and microarchitectures.

8 fthread_jumps fcrossjumping foptimize_sibing_cas fcse_foow_jumps fcse_skip_bocks fexpensive_optimizations fstrength_reduce fre_run_cse_after_oop frerun_oop_opt fcaer_saves fpeephoe2 fregmove freorder_bocks faign_functions faign_jumps faign_oops faign_abes ftree_vrp ftree_pre funswitch_oops fgcse fno_gcse_m fgcse_sm fgcse_as fgcse_after_reoad param_max_gcse_passes fschedue_insns fno_sched_interbock fno_sched_spec finine_functions param_max_inine_insns_auto param_arge_function_insns param_arge_function_growth param_arge_unit_insns param_inine_unit_growth param_inine_ca_cost funro_oops param_max_unro_times param_max_unroed_insns fthread_jumps fcrossjumping foptimize_sibing_cas fcse_foow_jumps fcse_skip_bocks fexpensive_optimizations fstrength_reduce fre_run_cse_after_oop frerun_oop_opt fcaer_saves fpeephoe2 fregmove freorder_bocks faign_functions faign_jumps faign_oops faign_abes ftree_vrp ftree_pre funswitch_oops fgcse fno_gcse_m fgcse_sm fgcse_as fgcse_after_reoad param_max_gcse_passes fschedue_insns fno_sched_interbock fno_sched_spec finine_functions param_max_inine_insns_auto param_arge_function_insns param_arge_function_growth param_arge_unit_insns param_inine_unit_growth param_inine_ca_cost funro_oops param_max_unro_times param_max_unroed_insns gs djpeg out fft susan_s ispe ame madpay un rijndae_d rijndae_e btb_size btb_assoc i_size i_assoc i_bock d_size d_assoc d_bock IPC dec_acc_rate reg_acc_rate bpred_acc_rate icache_acc_rate icache_miss_rate dcache_acc_rate dcache_miss_rate ALU_usg MAC_usg Shft_usg Figure 8: A Hinton diagram showing the optimisations that our mode considers most ikey to affect performance for each benchmark. The arger the box, the more ikey an optimisation affects the performance of the respective program. Figure 9: A Hinton diagram showing the reationship between optimisations and features. The arger the box, the more informative a feature is in predicting whether to appy the optimisation. The next sections anayses our resuts, describing the passes that are important in our space and how our mode seects good optimisation passes for new programs and microarchitectures. 6. ANALYSIS OF RESULTS This section anayses the resuts of our experiments. It first describes how programs affect the choice of optimisation to appy. Then it shows how microarchitecture infuence the optimisations choice. 6. Program Impact on Optimisations Section 5.2 showed that our compier s performance cosey foows the speedups achieved by the best optimisations for each program/microarchitecture pair. We now consider how it achieves this by focusing on those optimisations that are most ikey to affect performance. Note that this is a post-hoc anaysis and, in genera, we cannot know in advance whether an optimisation wi be ikey to affect performance for a specific program and microarchitecture. Figure 8 shows a Hinton diagram of the normaised mutua information between each optimisation and the speedups obtained on each program. Intuitivey, mutua information gives an indication of the impact (good or bad) of a specific compier pass on each program. The arger the box, the greater the impact of the pass. However, as this is a summary across a architectures an optimisation may be important for just a few microarchitectures but not for the others, eading to a sma box being drawn. It is cear from figure 8 that some optimisations are important across a programs, whereas others are ony important to a few benchmarks. For exampe, instruction scheduing (schedue-insns) is important for amost a benchmarks. As discussed in section 5.4, in some cases this optimisation has a negative impact on microarchitectures with a sma instruction cache. Loop unroing (unrooops) is aso an important optimisation for many programs. For programs such as, which contains oops with a known number of iterations, it is important to consider this optimisation to achieve good performance. However, for others, such as rijndae_e, this optimisation does not pay a crucia roe in achieving good performance because extensive unroing is aready impemented in the source code. The optimisation passes affecting function inining (ininefunctions to param-inine-ca-cost) have itte impact on most programs. However, for four programs, ispe,, and, these are the most important passes. By using the mutua information shown in figure 8 our mode focuses on those optimisations that are most ikey to affect performance on a per program/architecture basis. 6.2 Microarchitecture Impact on Optimisations Having anaysed the optimisations that have most impact on different programs, we now turn our attention to the reationship between the microarchitecture and compier optimisations. Figure 9 presents another Hinton diagram showing the microarchitectura impact on the best optimisations to appy. These resuts are averaged over a programs. The features are separated into two groups: the first contains the eight architectura parameters d whist the second contains the performance counters c. Of a the micro architectura parameters, the size of the instruction cache (denoted i_size) has the biggest impact on compier optimisation. In particuar, it strongy infuences the optimisations that contro function inining (inine_functions) and oop unroing (unro_oops) It is therefore critica to predict these optimisations correcty (based on i_size) to avoid increasing the cache miss rate on sma cache configurations. Furthermore, on arger cache configurations, it is important to perform aggressive inining and unroing to expoit the fu potentia of the cache. Now considering the performance counters, d, we can see that IPC has significant impact. This is used by the mode in conjunc-

9 Speedup Our Mode Best gs djpeg out fft susan_s ispe ame madpay un rijndae_d rijndae_e AVERAGE Figure 0: The performance of our approach and the best optimisations achieved by iterative compiation for each program, normaised to O3 and averaged over a microarchitectures on an extended space. tion with the other features to predict the most important optimisations to appy, such as bock reordering (reorder_bocks), goba common subexpression eimination (gcse), instruction scheduing (schedue_insns), function inining (inine_functions) and oop unroing(unro_oops). The performance counters that record cache and branch predictor access/miss rates aso have significant impact on choosing the best optimisation fags. Surprisingy, knowedge of register and functiona unit usage has itte importance in determining the correct compier optimisations to appy. Whie some of these observations may seem rather intuitive, current production compiers, such as gcc, aways use the same strategy when appying optimisation passes, independenty of the architectura parameters. One immediate recommendation woud be to make gcc s unroing and inining optimisations sensitive to the instruction cache size and to make use of branch predictor and cache access performance counters. However, this is just a post hoc anaysis based on the resuts of this space. The technique deveoped in this paper is micro-architectura space neutra enabing a compier to automaticay adapt to any underying microarchitecture, as shown in the next section. 7. EXTENDING THE MICROARCHITEC- TURAL SPACE Whie our approach works we on a predefined architecture space, it is reasonabe to ask how woud it perform if the architecture space was changed at a ater date. We therefore extended our space by varying two microarchitectura parameters not considered in section 4.3, namey frequency and processor width. Frequency ranges from 200 to 600 MHz whie issue width is either or 2. As a reference, the corresponding XScae vaues are 400 MHz and issue width. Given this extended space, we then appied our approach, predicting the best optimisation passes for it. Figure 0 shows the resuting performance across programs, compared to the best performance avaiabe. In this new space, seecting the correct compier optimisation passes has a simiar impact as before. The Best optimisations give an average.24x improvement over O3 compared to.23x in the previous space. Our approach is abe to achieve an improved average of.4x speedup. This is comparabe to the performance achieved on the previous space without any modification to our approach. If we were to incude new features that capture the behaviour of the additiona architectura parameters, the performance of our mode woud be further improved. 8. RELATED WORK There is a significant voume of prior work reated to this paper which we discuss in the foowing seven sections. Domain-Specific Optimisations Yotov et a. [40] investigated a mode-driven approach for the ATLAS sef-tuning inear agebra ibrary that uses the machine description to compute the optima parameters of the optimisations. SPIRAL [32] is another sef-tuned ibrary. It automaticay generates high-performance code for digita signa processing by expoiting domain-specific knowedge to the parameter space at compie-time. These two systems both required domain-specific knowedge and the use of iterative compiation to optimise themseves on the target system. They have to be retuned for each new patform. This contrasts with our work where the compier is buit ony once and optimises across a range of microarchitectures using just one profie run for any new program. Iterative Compiation Iterative compiation optimises a singe program on a specific microarchitecture by ing the optimisation space. Cooper et a. [7] were amongst the first to use a genetic agorithm to sove the phase ordering probem, achieving impressive code size reductions. Later, an extensive study of this probem was conducted, advocating the use of mutipe hi-cimber runs [2]. Vuduc et a. [39] ooked at the probem of optimising a matrix mutipication ibrary using a statistica criterion to stop. Kukarni et a. [24] used their previousy deveoped VISTA compier infrastructure [42] to for effective optimisation phases at a function eve. They buid a tree of effective transformation sequences and use it to imit the of the optimisation space with a genetic agorithm. Orthogonay to this, other reers have focused on finding the best optimisations settings to appy. Triantafyis et a. [36] concentrated on a sma set of optimisations that perform we on a given set of code segments. These are paced in a tree which is traversed to for good optimisation combinations for a new appication. Finay, Pan and Eigenmann [30] compared these techniques with their own agorithm that iterativey eiminates settings with the most negative effect from the space. Compared to our approach, a these techniques specificay tune each program on a per-program, per-architecture basis by ing its optimisation space. Conversey, our technique avoids and recompiation by directy predicting the correct set of compier optimisations to appy on a new micro-architecture. Anaytic Modes for Compiation The use of anaytic modes has aso been investigated to speedup iterative compiation. Triantafyis et a. [36] used an anaytic mode to reduce the required time to evauate different compier optimisations for different code segments. Zhao et a. [4] deveoped an approach named FPO to estimate the impact of different oop transformations. To overcome the high cost of iterative compiation, Cooper et a. [6] deveoped ACME which uses the concept of virtua execution; a simpe anaytic mode that estimates the execution time of basic bocks. Anaytic modes have proved to be usefu for ing the optimisation space quicky. However, since our mode does not perform any but directy predicts the best optimisation passes to appy, they are not appicabe in this context. Machine-Learning Compiers Some of the first reers to incorporate machine earning into optimising compiers were McGovern and Moss [29] who used reinforcement earning for the scheduing of straight-ine code.

Portable Compiler Optimisation Across Embedded Programs and Microarchitectures using Machine Learning

Portable Compiler Optimisation Across Embedded Programs and Microarchitectures using Machine Learning Portabe Compier Optimisation Across Embedded Programs and Microarchitectures using Machine Learning Christophe Dubach, Timothy M. Jones, Edwin V. Bonia Members of HiPEAC Schoo of Informatics University

More information

Mobile App Recommendation: Maximize the Total App Downloads

Mobile App Recommendation: Maximize the Total App Downloads Mobie App Recommendation: Maximize the Tota App Downoads Zhuohua Chen Schoo of Economics and Management Tsinghua University chenzhh3.12@sem.tsinghua.edu.cn Yinghui (Catherine) Yang Graduate Schoo of Management

More information

Nearest Neighbor Learning

Nearest Neighbor Learning Nearest Neighbor Learning Cassify based on oca simiarity Ranges from simpe nearest neighbor to case-based and anaogica reasoning Use oca information near the current query instance to decide the cassification

More information

Language Identification for Texts Written in Transliteration

Language Identification for Texts Written in Transliteration Language Identification for Texts Written in Transiteration Andrey Chepovskiy, Sergey Gusev, Margarita Kurbatova Higher Schoo of Economics, Data Anaysis and Artificia Inteigence Department, Pokrovskiy

More information

Automatic Grouping for Social Networks CS229 Project Report

Automatic Grouping for Social Networks CS229 Project Report Automatic Grouping for Socia Networks CS229 Project Report Xiaoying Tian Ya Le Yangru Fang Abstract Socia networking sites aow users to manuay categorize their friends, but it is aborious to construct

More information

file://j:\macmillancomputerpublishing\chapters\in073.html 3/22/01

file://j:\macmillancomputerpublishing\chapters\in073.html 3/22/01 Page 1 of 15 Chapter 9 Chapter 9: Deveoping the Logica Data Mode The information requirements and business rues provide the information to produce the entities, attributes, and reationships in ogica mode.

More information

MCSE Training Guide: Windows Architecture and Memory

MCSE Training Guide: Windows Architecture and Memory MCSE Training Guide: Windows 95 -- Ch 2 -- Architecture and Memory Page 1 of 13 MCSE Training Guide: Windows 95-2 - Architecture and Memory This chapter wi hep you prepare for the exam by covering the

More information

As Michi Henning and Steve Vinoski showed 1, calling a remote

As Michi Henning and Steve Vinoski showed 1, calling a remote Reducing CORBA Ca Latency by Caching and Prefetching Bernd Brügge and Christoph Vismeier Technische Universität München Method ca atency is a major probem in approaches based on object-oriented middeware

More information

ECEn 528 Prof. Archibald Lab: Dynamic Scheduling Part A: due Nov. 6, 2018 Part B: due Nov. 13, 2018

ECEn 528 Prof. Archibald Lab: Dynamic Scheduling Part A: due Nov. 6, 2018 Part B: due Nov. 13, 2018 ECEn 528 Prof. Archibad Lab: Dynamic Scheduing Part A: due Nov. 6, 2018 Part B: due Nov. 13, 2018 Overview This ab's purpose is to expore issues invoved in the design of out-of-order issue processors.

More information

Neural Network Enhancement of the Los Alamos Force Deployment Estimator

Neural Network Enhancement of the Los Alamos Force Deployment Estimator Missouri University of Science and Technoogy Schoars' Mine Eectrica and Computer Engineering Facuty Research & Creative Works Eectrica and Computer Engineering 1-1-1994 Neura Network Enhancement of the

More information

Quality of Service Evaluations of Multicast Streaming Protocols *

Quality of Service Evaluations of Multicast Streaming Protocols * Quaity of Service Evauations of Muticast Streaming Protocos Haonan Tan Derek L. Eager Mary. Vernon Hongfei Guo omputer Sciences Department University of Wisconsin-Madison, USA {haonan, vernon, guo}@cs.wisc.edu

More information

Space-Time Trade-offs.

Space-Time Trade-offs. Space-Time Trade-offs. Chethan Kamath 03.07.2017 1 Motivation An important question in the study of computation is how to best use the registers in a CPU. In most cases, the amount of registers avaiabe

More information

Special Edition Using Microsoft Excel Selecting and Naming Cells and Ranges

Special Edition Using Microsoft Excel Selecting and Naming Cells and Ranges Specia Edition Using Microsoft Exce 2000 - Lesson 3 - Seecting and Naming Ces and.. Page 1 of 8 [Figures are not incuded in this sampe chapter] Specia Edition Using Microsoft Exce 2000-3 - Seecting and

More information

Sample of a training manual for a software tool

Sample of a training manual for a software tool Sampe of a training manua for a software too We use FogBugz for tracking bugs discovered in RAPPID. I wrote this manua as a training too for instructing the programmers and engineers in the use of FogBugz.

More information

UnixWare 7 System Administration UnixWare 7 System Configuration

UnixWare 7 System Administration UnixWare 7 System Configuration UnixWare 7 System Administration - CH 3 - UnixWare 7 System Configuration Page 1 of 8 [Figures are not incuded in this sampe chapter] UnixWare 7 System Administration - 3 - UnixWare 7 System Configuration

More information

ACTIVE LEARNING ON WEIGHTED GRAPHS USING ADAPTIVE AND NON-ADAPTIVE APPROACHES. Eyal En Gad, Akshay Gadde, A. Salman Avestimehr and Antonio Ortega

ACTIVE LEARNING ON WEIGHTED GRAPHS USING ADAPTIVE AND NON-ADAPTIVE APPROACHES. Eyal En Gad, Akshay Gadde, A. Salman Avestimehr and Antonio Ortega ACTIVE LEARNING ON WEIGHTED GRAPHS USING ADAPTIVE AND NON-ADAPTIVE APPROACHES Eya En Gad, Akshay Gadde, A. Saman Avestimehr and Antonio Ortega Department of Eectrica Engineering University of Southern

More information

A Memory Grouping Method for Sharing Memory BIST Logic

A Memory Grouping Method for Sharing Memory BIST Logic A Memory Grouping Method for Sharing Memory BIST Logic Masahide Miyazai, Tomoazu Yoneda, and Hideo Fuiwara Graduate Schoo of Information Science, Nara Institute of Science and Technoogy (NAIST), 8916-5

More information

Resource Optimization to Provision a Virtual Private Network Using the Hose Model

Resource Optimization to Provision a Virtual Private Network Using the Hose Model Resource Optimization to Provision a Virtua Private Network Using the Hose Mode Monia Ghobadi, Sudhakar Ganti, Ghoamai C. Shoja University of Victoria, Victoria C, Canada V8W 3P6 e-mai: {monia, sganti,

More information

Hiding secrete data in compressed images using histogram analysis

Hiding secrete data in compressed images using histogram analysis University of Woongong Research Onine University of Woongong in Dubai - Papers University of Woongong in Dubai 2 iding secrete data in compressed images using histogram anaysis Farhad Keissarian University

More information

A Petrel Plugin for Surface Modeling

A Petrel Plugin for Surface Modeling A Petre Pugin for Surface Modeing R. M. Hassanpour, S. H. Derakhshan and C. V. Deutsch Structure and thickness uncertainty are important components of any uncertainty study. The exact ocations of the geoogica

More information

AN EVOLUTIONARY APPROACH TO OPTIMIZATION OF A LAYOUT CHART

AN EVOLUTIONARY APPROACH TO OPTIMIZATION OF A LAYOUT CHART 13 AN EVOLUTIONARY APPROACH TO OPTIMIZATION OF A LAYOUT CHART Eva Vona University of Ostrava, 30th dubna st. 22, Ostrava, Czech Repubic e-mai: Eva.Vona@osu.cz Abstract: This artice presents the use of

More information

A Comparison of a Second-Order versus a Fourth- Order Laplacian Operator in the Multigrid Algorithm

A Comparison of a Second-Order versus a Fourth- Order Laplacian Operator in the Multigrid Algorithm A Comparison of a Second-Order versus a Fourth- Order Lapacian Operator in the Mutigrid Agorithm Kaushik Datta (kdatta@cs.berkeey.edu Math Project May 9, 003 Abstract In this paper, the mutigrid agorithm

More information

The Big Picture WELCOME TO ESIGNAL

The Big Picture WELCOME TO ESIGNAL 2 The Big Picture HERE S SOME GOOD NEWS. You don t have to be a rocket scientist to harness the power of esigna. That s exciting because we re certain that most of you view your PC and esigna as toos for

More information

Distance Weighted Discrimination and Second Order Cone Programming

Distance Weighted Discrimination and Second Order Cone Programming Distance Weighted Discrimination and Second Order Cone Programming Hanwen Huang, Xiaosun Lu, Yufeng Liu, J. S. Marron, Perry Haaand Apri 3, 2012 1 Introduction This vignette demonstrates the utiity and

More information

A New Supervised Clustering Algorithm Based on Min-Max Modular Network with Gaussian-Zero-Crossing Functions

A New Supervised Clustering Algorithm Based on Min-Max Modular Network with Gaussian-Zero-Crossing Functions 2006 Internationa Joint Conference on Neura Networks Sheraton Vancouver Wa Centre Hote, Vancouver, BC, Canada Juy 16-21, 2006 A New Supervised Custering Agorithm Based on Min-Max Moduar Network with Gaussian-Zero-Crossing

More information

Layout Conscious Approach and Bus Architecture Synthesis for Hardware-Software Co-Design of Systems on Chip Optimized for Speed

Layout Conscious Approach and Bus Architecture Synthesis for Hardware-Software Co-Design of Systems on Chip Optimized for Speed Layout Conscious Approach and Bus Architecture Synthesis for Hardware-Software Co-Design of Systems on Chip Optimized for Speed Nattawut Thepayasuwan, Member, IEEE and Aex Doboi, Member, IEEE Abstract

More information

MACHINE learning techniques can, automatically,

MACHINE learning techniques can, automatically, Proceedings of Internationa Joint Conference on Neura Networks, Daas, Texas, USA, August 4-9, 203 High Leve Data Cassification Based on Network Entropy Fiipe Aves Neto and Liang Zhao Abstract Traditiona

More information

DETERMINING INTUITIONISTIC FUZZY DEGREE OF OVERLAPPING OF COMPUTATION AND COMMUNICATION IN PARALLEL APPLICATIONS USING GENERALIZED NETS

DETERMINING INTUITIONISTIC FUZZY DEGREE OF OVERLAPPING OF COMPUTATION AND COMMUNICATION IN PARALLEL APPLICATIONS USING GENERALIZED NETS DETERMINING INTUITIONISTIC FUZZY DEGREE OF OVERLAPPING OF COMPUTATION AND COMMUNICATION IN PARALLEL APPLICATIONS USING GENERALIZED NETS Pave Tchesmedjiev, Peter Vassiev Centre for Biomedica Engineering,

More information

Load Balancing by MPLS in Differentiated Services Networks

Load Balancing by MPLS in Differentiated Services Networks Load Baancing by MPLS in Differentiated Services Networks Riikka Susitaiva, Jorma Virtamo, and Samui Aato Networking Laboratory, Hesinki University of Technoogy P.O.Box 3000, FIN-02015 HUT, Finand {riikka.susitaiva,

More information

Infinity Connect Web App Customization Guide

Infinity Connect Web App Customization Guide Infinity Connect Web App Customization Guide Contents Introduction 1 Hosting the customized Web App 2 Customizing the appication 3 More information 8 Introduction The Infinity Connect Web App is incuded

More information

Lecture outline Graphics and Interaction Scan Converting Polygons and Lines. Inside or outside a polygon? Scan conversion.

Lecture outline Graphics and Interaction Scan Converting Polygons and Lines. Inside or outside a polygon? Scan conversion. Lecture outine 433-324 Graphics and Interaction Scan Converting Poygons and Lines Department of Computer Science and Software Engineering The Introduction Scan conversion Scan-ine agorithm Edge coherence

More information

Modelling and Performance Evaluation of Router Transparent Web cache Mode

Modelling and Performance Evaluation of Router Transparent Web cache Mode Emad Hassan A-Hemiary IJCSET Juy 2012 Vo 2, Issue 7,1316-1320 Modeing and Performance Evauation of Transparent cache Mode Emad Hassan A-Hemiary Network Engineering Department, Coege of Information Engineering,

More information

Introducing a Target-Based Approach to Rapid Prototyping of ECUs

Introducing a Target-Based Approach to Rapid Prototyping of ECUs Introducing a Target-Based Approach to Rapid Prototyping of ECUs FEBRUARY, 1997 Abstract This paper presents a target-based approach to Rapid Prototyping of Eectronic Contro Units (ECUs). With this approach,

More information

Dynamic Symbolic Execution of Distributed Concurrent Objects

Dynamic Symbolic Execution of Distributed Concurrent Objects Dynamic Symboic Execution of Distributed Concurrent Objects Andreas Griesmayer 1, Bernhard Aichernig 1,2, Einar Broch Johnsen 3, and Rudof Schatte 1,2 1 Internationa Institute for Software Technoogy, United

More information

May 13, Mark Lutz Boulder, Colorado (303) [work] (303) [home]

May 13, Mark Lutz Boulder, Colorado (303) [work] (303) [home] "Using Python": a Book Preview May 13, 1995 Mark Lutz Bouder, Coorado utz@kapre.com (303) 546-8848 [work] (303) 684-9565 [home] Introduction. This paper is a brief overview of the upcoming Python O'Reiy

More information

Endoscopic Motion Compensation of High Speed Videoendoscopy

Endoscopic Motion Compensation of High Speed Videoendoscopy Endoscopic Motion Compensation of High Speed Videoendoscopy Bharath avuri Department of Computer Science and Engineering, University of South Caroina, Coumbia, SC - 901. ravuri@cse.sc.edu Abstract. High

More information

A METHOD FOR GRIDLESS ROUTING OF PRINTED CIRCUIT BOARDS. A. C. Finch, K. J. Mackenzie, G. J. Balsdon, G. Symonds

A METHOD FOR GRIDLESS ROUTING OF PRINTED CIRCUIT BOARDS. A. C. Finch, K. J. Mackenzie, G. J. Balsdon, G. Symonds A METHOD FOR GRIDLESS ROUTING OF PRINTED CIRCUIT BOARDS A C Finch K J Mackenzie G J Basdon G Symonds Raca-Redac Ltd Newtown Tewkesbury Gos Engand ABSTRACT The introduction of fine-ine technoogies to printed

More information

Bridge Talk Release Notes for Meeting Exchange 5.0

Bridge Talk Release Notes for Meeting Exchange 5.0 Bridge Tak Reease Notes for Meeting Exchange 5.0 This document ists new product features, issues resoved since the previous reease, and current operationa issues. New Features This section provides a brief

More information

Topology-aware Key Management Schemes for Wireless Multicast

Topology-aware Key Management Schemes for Wireless Multicast Topoogy-aware Key Management Schemes for Wireess Muticast Yan Sun, Wade Trappe,andK.J.RayLiu Department of Eectrica and Computer Engineering, University of Maryand, Coege Park Emai: ysun, kjriu@gue.umd.edu

More information

MULTIGRID REDUCTION IN TIME FOR NONLINEAR PARABOLIC PROBLEMS: A CASE STUDY

MULTIGRID REDUCTION IN TIME FOR NONLINEAR PARABOLIC PROBLEMS: A CASE STUDY MULTIGRID REDUCTION IN TIME FOR NONLINEAR PARABOLIC PROBLEMS: A CASE STUDY R.D. FALGOUT, T.A. MANTEUFFEL, B. O NEILL, AND J.B. SCHRODER Abstract. The need for paraeism in the time dimension is being driven

More information

Solving Large Double Digestion Problems for DNA Restriction Mapping by Using Branch-and-Bound Integer Linear Programming

Solving Large Double Digestion Problems for DNA Restriction Mapping by Using Branch-and-Bound Integer Linear Programming The First Internationa Symposium on Optimization and Systems Bioogy (OSB 07) Beijing, China, August 8 10, 2007 Copyright 2007 ORSC & APORC pp. 267 279 Soving Large Doube Digestion Probems for DNA Restriction

More information

Further Optimization of the Decoding Method for Shortened Binary Cyclic Fire Code

Further Optimization of the Decoding Method for Shortened Binary Cyclic Fire Code Further Optimization of the Decoding Method for Shortened Binary Cycic Fire Code Ch. Nanda Kishore Heosoft (India) Private Limited 8-2-703, Road No-12 Banjara His, Hyderabad, INDIA Phone: +91-040-3378222

More information

Formulation of Loss minimization Problem Using Genetic Algorithm and Line-Flow-based Equations

Formulation of Loss minimization Problem Using Genetic Algorithm and Line-Flow-based Equations Formuation of Loss minimization Probem Using Genetic Agorithm and Line-Fow-based Equations Sharanya Jaganathan, Student Member, IEEE, Arun Sekar, Senior Member, IEEE, and Wenzhong Gao, Senior member, IEEE

More information

Windows NT, Terminal Server and Citrix MetaFrame Terminal Server Architecture

Windows NT, Terminal Server and Citrix MetaFrame Terminal Server Architecture Windows NT, Termina Server and Citrix MetaFrame - CH 3 - Termina Server Architect.. Page 1 of 13 [Figures are not incuded in this sampe chapter] Windows NT, Termina Server and Citrix MetaFrame - 3 - Termina

More information

A Top-to-Bottom View: Energy Analysis for Mobile Application Source Code

A Top-to-Bottom View: Energy Analysis for Mobile Application Source Code A Top-to-Bottom View: Energy Anaysis for Mobie Appication Source Code Xueiang Li John P. Gaagher Roskide University Emai: {xueiang, jpg}@ruc.dk arxiv:1510.04165v1 [cs.oh] 14 Oct 2015 Abstract Energy efficiency

More information

AgreeYa Solutions. Site Administrator for SharePoint User Guide

AgreeYa Solutions. Site Administrator for SharePoint User Guide AgreeYa Soutions Site Administrator for SharePoint 5.2.4 User Guide 2017 2017 AgreeYa Soutions Inc. A rights reserved. This product is protected by U.S. and internationa copyright and inteectua property

More information

Lecture Notes for Chapter 4 Part III. Introduction to Data Mining

Lecture Notes for Chapter 4 Part III. Introduction to Data Mining Data Mining Cassification: Basic Concepts, Decision Trees, and Mode Evauation Lecture Notes for Chapter 4 Part III Introduction to Data Mining by Tan, Steinbach, Kumar Adapted by Qiang Yang (2010) Tan,Steinbach,

More information

Outline. Parallel Numerical Algorithms. Forward Substitution. Triangular Matrices. Solving Triangular Systems. Back Substitution. Parallel Algorithm

Outline. Parallel Numerical Algorithms. Forward Substitution. Triangular Matrices. Solving Triangular Systems. Back Substitution. Parallel Algorithm Outine Parae Numerica Agorithms Chapter 8 Prof. Michae T. Heath Department of Computer Science University of Iinois at Urbana-Champaign CS 554 / CSE 512 1 2 3 4 Trianguar Matrices Michae T. Heath Parae

More information

Transformation Invariance in Pattern Recognition: Tangent Distance and Propagation

Transformation Invariance in Pattern Recognition: Tangent Distance and Propagation Transformation Invariance in Pattern Recognition: Tangent Distance and Propagation Patrice Y. Simard, 1 Yann A. Le Cun, 2 John S. Denker, 2 Bernard Victorri 3 1 Microsoft Research, 1 Microsoft Way, Redmond,

More information

CLOUD RADIO ACCESS NETWORK WITH OPTIMIZED BASE-STATION CACHING

CLOUD RADIO ACCESS NETWORK WITH OPTIMIZED BASE-STATION CACHING CLOUD RADIO ACCESS NETWORK WITH OPTIMIZED BASE-STATION CACHING Binbin Dai and Wei Yu Ya-Feng Liu Department of Eectrica and Computer Engineering University of Toronto, Toronto ON, Canada M5S 3G4 Emais:

More information

Self-Control Cyclic Access with Time Division - A MAC Proposal for The HFC System

Self-Control Cyclic Access with Time Division - A MAC Proposal for The HFC System Sef-Contro Cycic Access with Time Division - A MAC Proposa for The HFC System S.M. Jiang, Danny H.K. Tsang, Samue T. Chanson Hong Kong University of Science & Technoogy Cear Water Bay, Kowoon, Hong Kong

More information

Quality Assessment using Tone Mapping Algorithm

Quality Assessment using Tone Mapping Algorithm Quaity Assessment using Tone Mapping Agorithm Nandiki.pushpa atha, Kuriti.Rajendra Prasad Research Schoar, Assistant Professor, Vignan s institute of engineering for women, Visakhapatnam, Andhra Pradesh,

More information

NCH Software Express Delegate

NCH Software Express Delegate NCH Software Express Deegate This user guide has been created for use with Express Deegate Version 4.xx NCH Software Technica Support If you have difficuties using Express Deegate pease read the appicabe

More information

Performance of data networks with random links

Performance of data networks with random links Performance of data networks with random inks arxiv:adap-org/9909006 v2 4 Jan 2001 Henryk Fukś and Anna T. Lawniczak Department of Mathematics and Statistics, University of Gueph, Gueph, Ontario N1G 2W1,

More information

Image Segmentation Using Semi-Supervised k-means

Image Segmentation Using Semi-Supervised k-means I J C T A, 9(34) 2016, pp. 595-601 Internationa Science Press Image Segmentation Using Semi-Supervised k-means Reza Monsefi * and Saeed Zahedi * ABSTRACT Extracting the region of interest is a very chaenging

More information

Proceedings of the International Conference on Systolic Arrays, San Diego, California, U.S.A., May 25-27, 1988 AN EFFICIENT ASYNCHRONOUS MULTIPLIER!

Proceedings of the International Conference on Systolic Arrays, San Diego, California, U.S.A., May 25-27, 1988 AN EFFICIENT ASYNCHRONOUS MULTIPLIER! [1,2] have, in theory, revoutionized cryptography. Unfortunatey, athough offer many advantages over conventiona and authentication), such cock synchronization in this appication due to the arge operand

More information

Arithmetic Coding. Prof. Ja-Ling Wu. Department of Computer Science and Information Engineering National Taiwan University

Arithmetic Coding. Prof. Ja-Ling Wu. Department of Computer Science and Information Engineering National Taiwan University Arithmetic Coding Prof. Ja-Ling Wu Department of Computer Science and Information Engineering Nationa Taiwan University F(X) Shannon-Fano-Eias Coding W..o.g. we can take X={,,,m}. Assume p()>0 for a. The

More information

Register Allocation. Consider the following assignment statement: x = (a*b)+((c*d)+(e*f)); In posfix notation: ab*cd*ef*++x

Register Allocation. Consider the following assignment statement: x = (a*b)+((c*d)+(e*f)); In posfix notation: ab*cd*ef*++x Register Aocation Consider the foowing assignment statement: x = (a*b)+((c*d)+(e*f)); In posfix notation: ab*cd*ef*++x Assume that two registers are avaiabe. Starting from the eft a compier woud generate

More information

Dynamic History-Length Fitting: A third level of adaptivity for branch prediction

Dynamic History-Length Fitting: A third level of adaptivity for branch prediction Dynamic HistoryLength Fitting: A third eve of adaptivity for branch prediction Toni Juan Sanji Sanjeevan Juan J. Navarro Depart. of Computer Architecture Univ. Poitkcnica de Cataunya 08034 Barceona (Spain)

More information

Real-Time Feature Descriptor Matching via a Multi-Resolution Exhaustive Search Method

Real-Time Feature Descriptor Matching via a Multi-Resolution Exhaustive Search Method 297 Rea-Time Feature escriptor Matching via a Muti-Resoution Ehaustive Search Method Chi-Yi Tsai, An-Hung Tsao, and Chuan-Wei Wang epartment of Eectrica Engineering, Tamang University, New Taipei City,

More information

CylanceOPTICS. Frequently Asked Questions

CylanceOPTICS. Frequently Asked Questions CyanceOPTICS Frequenty Asked Questions Question What is CyanceOPTICS? CyanceOPTICS is an AI driven endpoint detection and response component providing consistent visibiity, root cause anaysis, scaabe threat

More information

THE PERCENTAGE OCCUPANCY HIT OR MISS TRANSFORM

THE PERCENTAGE OCCUPANCY HIT OR MISS TRANSFORM 17th European Signa Processing Conference (EUSIPCO 2009) Gasgow, Scotand, August 24-28, 2009 THE PERCENTAGE OCCUPANCY HIT OR MISS TRANSFORM P. Murray 1, S. Marsha 1, and E.Buinger 2 1 Dept. of Eectronic

More information

Special Edition Using Microsoft Office Sharing Documents Within a Workgroup

Special Edition Using Microsoft Office Sharing Documents Within a Workgroup Specia Edition Using Microsoft Office 2000 - Chapter 7 - Sharing Documents Within a.. Page 1 of 8 [Figures are not incuded in this sampe chapter] Specia Edition Using Microsoft Office 2000-7 - Sharing

More information

Layer-Specific Adaptive Learning Rates for Deep Networks

Layer-Specific Adaptive Learning Rates for Deep Networks Layer-Specific Adaptive Learning Rates for Deep Networks arxiv:1510.04609v1 [cs.cv] 15 Oct 2015 Bharat Singh, Soham De, Yangmuzi Zhang, Thomas Godstein, and Gavin Tayor Department of Computer Science Department

More information

An Introduction to Design Patterns

An Introduction to Design Patterns An Introduction to Design Patterns 1 Definitions A pattern is a recurring soution to a standard probem, in a context. Christopher Aexander, a professor of architecture Why woud what a prof of architecture

More information

understood as processors that match AST patterns of the source language and translate them into patterns in the target language.

understood as processors that match AST patterns of the source language and translate them into patterns in the target language. A Basic Compier At a fundamenta eve compiers can be understood as processors that match AST patterns of the source anguage and transate them into patterns in the target anguage. Here we wi ook at a basic

More information

Readme ORACLE HYPERION PROFITABILITY AND COST MANAGEMENT

Readme ORACLE HYPERION PROFITABILITY AND COST MANAGEMENT ORACLE HYPERION PROFITABILITY AND COST MANAGEMENT Reease 11.1.2.4.000 Readme CONTENTS IN BRIEF Purpose... 2 New Features in This Reease... 2 Instaation Information... 2 Supported Patforms... 2 Supported

More information

Microsoft Visual Studio 2005 Professional Tools. Advanced development tools designed for professional developers

Microsoft Visual Studio 2005 Professional Tools. Advanced development tools designed for professional developers Microsoft Visua Studio 2005 Professiona Toos Advanced deveopment toos designed for professiona deveopers If you re a professiona deveoper, Microsoft has two new ways to fue your deveopment efforts: Microsoft

More information

University of Illinois at Urbana-Champaign, Urbana, IL 61801, /11/$ IEEE 162

University of Illinois at Urbana-Champaign, Urbana, IL 61801, /11/$ IEEE 162 oward Efficient Spatia Variation Decomposition via Sparse Regression Wangyang Zhang, Karthik Baakrishnan, Xin Li, Duane Boning and Rob Rutenbar 3 Carnegie Meon University, Pittsburgh, PA 53, wangyan@ece.cmu.edu,

More information

Operating Avaya Aura Conferencing

Operating Avaya Aura Conferencing Operating Avaya Aura Conferencing Reease 6.0 June 2011 04-603510 Issue 1 2010 Avaya Inc. A Rights Reserved. Notice Whie reasonabe efforts were made to ensure that the information in this document was compete

More information

1682 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 22, NO. 6, DECEMBER Backward Fuzzy Rule Interpolation

1682 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 22, NO. 6, DECEMBER Backward Fuzzy Rule Interpolation 1682 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 22, NO. 6, DECEMBER 2014 Bacward Fuzzy Rue Interpoation Shangzhu Jin, Ren Diao, Chai Que, Senior Member, IEEE, and Qiang Shen Abstract Fuzzy rue interpoation

More information

Response Surface Model Updating for Nonlinear Structures

Response Surface Model Updating for Nonlinear Structures Response Surface Mode Updating for Noninear Structures Gonaz Shahidi a, Shamim Pakzad b a PhD Student, Department of Civi and Environmenta Engineering, Lehigh University, ATLSS Engineering Research Center,

More information

Utility-based Camera Assignment in a Video Network: A Game Theoretic Framework

Utility-based Camera Assignment in a Video Network: A Game Theoretic Framework This artice has been accepted for pubication in a future issue of this journa, but has not been fuy edited. Content may change prior to fina pubication. Y.LI AND B.BHANU CAMERA ASSIGNMENT: A GAME-THEORETIC

More information

A HIGH PERFORMANCE, LOW LATENCY, LOW POWER AUDIO PROCESSING SYSTEM FOR WIDEBAND SPEECH OVER WIRELESS LINKS

A HIGH PERFORMANCE, LOW LATENCY, LOW POWER AUDIO PROCESSING SYSTEM FOR WIDEBAND SPEECH OVER WIRELESS LINKS A HIGH PERFORMANCE, LOW LATENCY, LOW POWER AUDIO PROCESSING SYSTEM FOR WIDEBAND SPEECH OVER WIRELESS LINKS Etienne Cornu 1, Aain Dufaux 2, and David Hermann 1 1 AMI Semiconductor Canada, 611 Kumpf Drive,

More information

Delay Budget Partitioning to Maximize Network Resource Usage Efficiency

Delay Budget Partitioning to Maximize Network Resource Usage Efficiency Deay Budget Partitioning to Maximize Network Resource Usage Efficiency Kartik Gopaan Tzi-cker Chiueh Yow-Jian Lin Forida State University Stony Brook University Tecordia Technoogies kartik@cs.fsu.edu chiueh@cs.sunysb.edu

More information

On-Chip CNN Accelerator for Image Super-Resolution

On-Chip CNN Accelerator for Image Super-Resolution On-Chip CNN Acceerator for Image Super-Resoution Jung-Woo Chang and Suk-Ju Kang Dept. of Eectronic Engineering, Sogang University, Seou, South Korea {zwzang91, sjkang}@sogang.ac.kr ABSTRACT To impement

More information

Navigating and searching theweb

Navigating and searching theweb Navigating and searching theweb Contents Introduction 3 1 The Word Wide Web 3 2 Navigating the web 4 3 Hyperinks 5 4 Searching the web 7 5 Improving your searches 8 6 Activities 9 6.1 Navigating the web

More information

Understanding the Mixing Patterns of Social Networks: The Impact of Cores, Link Directions, and Dynamics

Understanding the Mixing Patterns of Social Networks: The Impact of Cores, Link Directions, and Dynamics Understanding the Mixing Patterns of Socia Networks: The Impact of Cores, Link Directions, and Dynamics [Last revised on May 22, 2011] Abedeaziz Mohaisen Huy Tran Nichoas Hopper Yongdae Kim University

More information

Application of Intelligence Based Genetic Algorithm for Job Sequencing Problem on Parallel Mixed-Model Assembly Line

Application of Intelligence Based Genetic Algorithm for Job Sequencing Problem on Parallel Mixed-Model Assembly Line American J. of Engineering and Appied Sciences 3 (): 5-24, 200 ISSN 94-7020 200 Science Pubications Appication of Inteigence Based Genetic Agorithm for Job Sequencing Probem on Parae Mixed-Mode Assemby

More information

Relative Positioning from Model Indexing

Relative Positioning from Model Indexing Reative Positioning from Mode Indexing Stefan Carsson Computationa Vision and Active Perception Laboratory (CVAP)* Roya Institute of Technoogy (KTH), Stockhom, Sweden Abstract We show how to determine

More information

Ad Hoc Networks 11 (2013) Contents lists available at SciVerse ScienceDirect. Ad Hoc Networks

Ad Hoc Networks 11 (2013) Contents lists available at SciVerse ScienceDirect. Ad Hoc Networks Ad Hoc Networks (3) 683 698 Contents ists avaiabe at SciVerse ScienceDirect Ad Hoc Networks journa homepage: www.esevier.com/ocate/adhoc Dynamic agent-based hierarchica muticast for wireess mesh networks

More information

For Review Only. CFP: Cooperative Fast Protection. Bin Wu, Pin-Han Ho, Kwan L. Yeung, János Tapolcai and Hussein T. Mouftah

For Review Only. CFP: Cooperative Fast Protection. Bin Wu, Pin-Han Ho, Kwan L. Yeung, János Tapolcai and Hussein T. Mouftah Journa of Lightwave Technoogy Page of CFP: Cooperative Fast Protection Bin Wu, Pin-Han Ho, Kwan L. Yeung, János Tapocai and Hussein T. Mouftah Abstract We introduce a nove protection scheme, caed Cooperative

More information

Chapter 3: Introduction to the Flash Workspace

Chapter 3: Introduction to the Flash Workspace Chapter 3: Introduction to the Fash Workspace Page 1 of 10 Chapter 3: Introduction to the Fash Workspace In This Chapter Features and Functionaity of the Timeine Features and Functionaity of the Stage

More information

Meeting Exchange 4.1 Service Pack 2 Release Notes for the S6200/S6800 Servers

Meeting Exchange 4.1 Service Pack 2 Release Notes for the S6200/S6800 Servers Meeting Exchange 4.1 Service Pack 2 Reease Notes for the S6200/S6800 Servers The Meeting Exchange S6200/S6800 Media Servers are SIP-based voice and web conferencing soutions that extend Avaya s conferencing

More information

Conflict graph-based Markovian model to estimate throughput in unsaturated IEEE networks

Conflict graph-based Markovian model to estimate throughput in unsaturated IEEE networks Confict graph-based Markovian mode to estimate throughput in unsaturated IEEE 802 networks Marija STOJANOVA Thomas BEGIN Anthony BUSSON marijastojanova@ens-yonfr thomasbegin@ens-yonfr anthonybusson@ens-yonfr

More information

Fuzzy Equivalence Relation Based Clustering and Its Use to Restructuring Websites Hyperlinks and Web Pages

Fuzzy Equivalence Relation Based Clustering and Its Use to Restructuring Websites Hyperlinks and Web Pages Fuzzy Equivaence Reation Based Custering and Its Use to Restructuring Websites Hyperinks and Web Pages Dimitris K. Kardaras,*, Xenia J. Mamakou, and Bi Karakostas 2 Business Informatics Laboratory, Dept.

More information

Design of IP Networks with End-to. to- End Performance Guarantees

Design of IP Networks with End-to. to- End Performance Guarantees Design of IP Networks with End-to to- End Performance Guarantees Irena Atov and Richard J. Harris* ( Swinburne University of Technoogy & *Massey University) Presentation Outine Introduction Mutiservice

More information

Community-Aware Opportunistic Routing in Mobile Social Networks

Community-Aware Opportunistic Routing in Mobile Social Networks IEEE TRANSACTIONS ON COMPUTERS VOL:PP NO:99 YEAR 213 Community-Aware Opportunistic Routing in Mobie Socia Networks Mingjun Xiao, Member, IEEE Jie Wu, Feow, IEEE, and Liusheng Huang, Member, IEEE Abstract

More information

Service Chain (SC) Mapping with Multiple SC Instances in a Wide Area Network

Service Chain (SC) Mapping with Multiple SC Instances in a Wide Area Network Service Chain (SC) Mapping with Mutipe SC Instances in a Wide Area Network This is a preprint eectronic version of the artice submitted to IEEE GobeCom 2017 Abhishek Gupta, Brigitte Jaumard, Massimo Tornatore,

More information

A Design Method for Optimal Truss Structures with Certain Redundancy Based on Combinatorial Rigidity Theory

A Design Method for Optimal Truss Structures with Certain Redundancy Based on Combinatorial Rigidity Theory 0 th Word Congress on Structura and Mutidiscipinary Optimization May 9 -, 03, Orando, Forida, USA A Design Method for Optima Truss Structures with Certain Redundancy Based on Combinatoria Rigidity Theory

More information

Insert the power cord into the AC input socket of your projector, as shown in Figure 1. Connect the other end of the power cord to an AC outlet.

Insert the power cord into the AC input socket of your projector, as shown in Figure 1. Connect the other end of the power cord to an AC outlet. Getting Started This chapter wi expain the set-up and connection procedures for your projector, incuding information pertaining to basic adjustments and interfacing with periphera equipment. Powering Up

More information

Reference trajectory tracking for a multi-dof robot arm

Reference trajectory tracking for a multi-dof robot arm Archives of Contro Sciences Voume 5LXI, 5 No. 4, pages 53 57 Reference trajectory tracking for a muti-dof robot arm RÓBERT KRASŇANSKÝ, PETER VALACH, DÁVID SOÓS, JAVAD ZARBAKHSH This paper presents the

More information

Privacy Preserving Subgraph Matching on Large Graphs in Cloud

Privacy Preserving Subgraph Matching on Large Graphs in Cloud Privacy Preserving Subgraph Matching on Large Graphs in Coud Zhao Chang,#, Lei Zou, Feifei Li # Peing University, China; # University of Utah, USA; {changzhao,zouei}@pu.edu.cn; {zchang,ifeifei}@cs.utah.edu

More information

Geometric clustering for line drawing simplification

Geometric clustering for line drawing simplification Eurographics Symposium on Rendering (2005) Kavita Baa, Phiip Dutré (Editors) Geometric custering for ine drawing simpification P. Bara, J.Thoot and F. X. Siion ε ARTIS GRAVIR/IMAG INRIA Figure 1: The two

More information

5940 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 13, NO. 11, NOVEMBER 2014

5940 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 13, NO. 11, NOVEMBER 2014 5940 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 13, NO. 11, NOVEMBER 014 Topoogy-Transparent Scheduing in Mobie Ad Hoc Networks With Mutipe Packet Reception Capabiity Yiming Liu, Member, IEEE,

More information

Telephony Trainers with Discovery Software

Telephony Trainers with Discovery Software Teephony Trainers 58 Series Teephony Trainers with Discovery Software 58-001 Teephony Training System 58-002 Digita Switching System 58-003 Digita Teephony Training System 58-004 Digita Trunk Network System

More information

Real-Time Image Generation with Simultaneous Video Memory Read/Write Access and Fast Physical Addressing

Real-Time Image Generation with Simultaneous Video Memory Read/Write Access and Fast Physical Addressing Rea-Time Image Generation with Simutaneous Video Memory Read/rite Access and Fast Physica Addressing Mountassar Maamoun 1, Bouaem Laichi 2, Abdehaim Benbekacem 3, Daoud Berkani 4 1 Department of Eectronic,

More information

Backing-up Fuzzy Control of a Truck-trailer Equipped with a Kingpin Sliding Mechanism

Backing-up Fuzzy Control of a Truck-trailer Equipped with a Kingpin Sliding Mechanism Backing-up Fuzzy Contro of a Truck-traier Equipped with a Kingpin Siding Mechanism G. Siamantas and S. Manesis Eectrica & Computer Engineering Dept., University of Patras, Patras, Greece gsiama@upatras.gr;stam.manesis@ece.upatras.gr

More information

power-saving mode for mobile computing in Wi-Fi hotspots: Limitations, enhancements and open issues

power-saving mode for mobile computing in Wi-Fi hotspots: Limitations, enhancements and open issues DOI 10.1007/s11276-006-0010-9 802.11 power-saving mode for mobie computing in Wi-Fi hotspots: Limitations, enhancements and open issues G. Anastasi M. Conti E. Gregori A. Passarea C Science + Business

More information

CSE120 Principles of Operating Systems. Architecture Support for OS

CSE120 Principles of Operating Systems. Architecture Support for OS CSE120 Principes of Operating Systems Architecture Support for OS Why are you sti here? You shoud run away from my CSE120! 2 CSE 120 Architectura Support Announcement Have you visited the web page? http://cseweb.ucsd.edu/casses/fa18/cse120-a/

More information