Impact of cache interferences on usual numerical dense loop. nests. O. Temam C. Fricker W. Jalby. University of Leiden INRIA University of Versailles

Size: px
Start display at page:

Download "Impact of cache interferences on usual numerical dense loop. nests. O. Temam C. Fricker W. Jalby. University of Leiden INRIA University of Versailles"

Transcription

1 Impact of cache interferences on usual numerical ense loop nests O. Temam C. Fricker W. Jalby University of Leien INRIA University of Versailles Niels Bohrweg 1 Domaine e Voluceau MASI 2333 CA Leien Le Chesnay Ceex Versailles The Netherlans France France Abstract In numerical coes, the regular interleave accesses that occur within o-loop nests inuce cache interference phenomena that can severely egrae program performance. Cache interferences can signicantly increase the volume of memory trac an the amount of communication in uniprocessors an multiprocessors. In this paper, we ientify cache interference phenomena, etermine their causes an the conitions uner which they occur. Base on these results, we erive a methoology for computing an analytical expression of cache misses for most classic loop nests, which can be use for precise performance analysis an preiction. We show that cache performance is unstable, because some unexpecte parameters such as arrays base aress can play a signicant role in interference phenomena. We also show that the impact of cache interferences can be so high, that the benets of current ata locality optimization techniques can be partially, if not totally, eraicate. Keywors: memory reference patterns, software optimization, ata locality, numerical coes, moeling cache interferences, performance analysis, performance preiction. 1 Introuction As CPU cycle time ecreases, main memory an network latencies rapily increase an cache misses become very costly. Furthermore, the increasing issue rate of processors worsen the buren on caches. Moreover, most CPU chips are now being esigne for integration into a massively parallel supercomputer or a parallel workstation, an therefore minimizing memory trac, i.e. optimizing memory hierarchy utilization, is becoming critical. For all these reasons, optimizing the cache behavior has become a major issue. To achieve such optimizations, many stuies have been performe to unerstan the workings of cache memories an erive proper optimizations. The rst category of stuies [?] relies on numerous simulations of representative coes, i.e. a collection of coes which correspons to the average workloa of a computer. Such simulations provie a goo summary of average cache performance an some hints at the relationships between the ierent cache parameters (cache size, line size, set-associativity). Furthermore they provie nearly exact inications on the behavior of cache memories for specic examples. A major problem inherent to such techniques is to n coes which are truly representative in terms of memory referencing. A secon category of stuies [?] aims at builing analytical moels for synthesizing the behavior of cache uner most circumstances. Such moels provie better insight on the relationship between the ierent parameters. These moels can also be use for performance preiction, This work was fune by the BRA Esprit III European Project APPARC, European Agency DGXIII. 1

2 avoiing numerous simulations. However, while such analytical moels are more representative than simulation base stuies, they are generally less accurate. An they are intrinsicly limite because they cannot an are not esigne for unerstaning specic phenomena which occur within a cache. Moeling the global behavior of cache may be sucient as far as trens are neee. However, for unerstaning the weaknesses of caches an eriving either software or harware optimizations more precise moeling is necessary. Therefore a goo unerstaning of a program reference pattern nees to be extracte. Some moels have aresse this problem [?]. Although they are valuable tools, they still lack accuracy because they aim at characterizing the behavior of most programs. Therefore, they again sacrice accuracy for generality an representativity. However, there is a category of coes, numerical coes, that are emerging as some of the most emaning programs in terms of execution time an memory usage. Many architectures are targete or at least tune for such programs. The wiesprea use of numerical coes as benchmarks is a clear sign of their growing inuence. Therefore, stuying the cache behavior uner numerical workloas is critical. However, numerical coes have specic properties in terms of memory aressing (spatial an temporal locality) which harly allow them to t in the classic framework of general moels. Fricker an al. [?] evelope a moel for irect-mappe an set-associative caches that is eicate to regular (an some irregular) numerical coes. This moel takes into account the fact that references within numerical coes correspon to chunks of consecutive aresses which recur perioically. The main asset of the moel is to show the behavior of cache uner numerical workloas, an to allow imensioning of cache parameters for such coes. So, if this eort allows a better unerstaning of cache behavior uner such workloas, it oes not help in unveiling hot-spot an irregular phenomena which are specic to numerical coes an alter the cache performance. Furthermore numerical coes are actually mae of a limite number of typical loop nests. Therefore, eorts shoul be concentrate on moeling an unerstaning the actual an most frequent types of loop nests. This problem is twofol: the rst step is ientifying these typical cases an consequently restricting problem hypotheses. The secon an main step is eriving a moel which encompasses the majority of such cases. Porterel [?] evelope a moel eicate to numerical coes, which oes examine specic pieces of coes an etermine their behavior on cache. However, mostly fully-associative caches have been consiere. Because such caches are unlikely to experience interference phenomena, conclusions can harly be erive for interferences in real caches. An important step towars accurate evaluation of cache interferences has been mae in [?], where blocke Matrix-Matrix multiply is carefully stuie. Cross an self-interference misses are evaluate, an a moel for this algorithm is provie. This paper clearly unveils that interferences can severely alter locality exploitation. However, the moel still lacks accuracy an is not capable of catching some of the specic phenomena an performance uctuations inuce by the mapping of irect-mappe caches. Furthermore, new parameters (such as arrays base aress) which can play an important role in interference phenomena are not taken into account. Moreover, this moel is eicate to one particular example, while a methoology suitable to many numerical algorithms woul be very useful. Inee, many powerful software optimization techniques for exploiting numerical coes locality have now been esigne [?,?,?]. Although they stress the possible impact of cache interferences, no real evaluation of these phenomena nor a stuy of their frequency of occurrence have been 2

3 performe yet. In the next sections, we will show that cache interferences can have a strong impact on performance an occur frequently. Ferrante an al. [?] propose a realistic approach to locality optimization by evaluating the number of cache lines use (as oppose to the number of elements as in most other methos). It is mentione that evaluating cache conicts woul help rening even more such realistic locality optimization techniques. In one example, it is briey shown how to etect self-interferences. To aress the issues relate to cache interferences, a moel name NUMODE (NUmerical MODEl) [?] is being evelope within the APPARC 1 project. The goal of NUMODE is to provie a framework for moeling cache interferences of a given loop nest, an then eriving an analytical expression for the number of cache misses. Therefore, NUMODE is both a moel an a methoology. It is no oubt that such a moel cannot be as exible as others for global performance preictions, or for extracting global trens on cache behavior. However, it is possible to precisely quantify the interference phenomena that occur in real caches, etermine their causes an conitions of occurrence. Many such phenomena can be ientie. Moreover, the cache performance is shown to be actually unstable in many situations. It namely appears that software optimization techniques lack accuracy an can consequently lose their eciency. It is also shown that the number of aitional memory references ue to cache interferences can be obtaine as an analytical function of problem parameters. This result allows a precise analysis of the behavior of most classic algorithms on caches, an can then be use to esign new optimization techniques or tune existing ones. Making such a function available to compilers can be protable to coe restructuring techniques. NUMODE is currently uner evelopment an will be implemente to perform extensive testings of its scope an accuracy. In section?? the problem is ene an hypotheses are given. In section??, the general metho for computing the number of misses is inicate. In section??, conclusions are rawn an further work is iscusse. 2 Problem statement The purpose of the paper is twofol. The rst goal is to show that cache interferences are not infrequent, that they can have a signicant impact on numerical loop nests performance, an that their conitions of occurrence can be etermine even though cache interferences are highly irregular. Unerstaning an then eliminating such interferences woul allow stable cache performance. Furthermore, cache line size is kept small because large line sizes are generally consiere to bring more interferences. This notion is not completely true: when line size is large, compulsory misses are much less important so that interference misses correspon to a larger portion of total misses. Therefore, numerical loop nests are more sensitive to cache interferences when line size is large, but such interferences are not necessarily more important. It erives that exploiting a larger line size requires a goo unerstaning of cache interferences. The secon an main goal of this paper is to introuce a metho for estimating these cache interferences. General principles an main steps of the technique are inicate an illustrate with examples. Cache architecture Direct-mappe caches have been chosen for several reasons: Direct-mappe caches are more sensitive to interferences than w-way associative caches. Therefore, they are more likely to benet from stuies an optimizations on that matter. 1 APPARC is a BRA Esprit III European project 3

4 Since the replacement policy of irect-mappe caches is straightforwar, computing interferences is easier in irect-mappe caches. Though, we strongly believe the technique can be extene to w-way associative caches with moerate moications. Among the three newest processor chips (DEC Alpha, MIPS R4000, SuperSPARC), two chips (DEC Alpha, MIPS R4000) inclue a small (8kbytes) irect-mappe on-chip ata cache. Since the frequency of such processors is very high, the cost of a cache miss is huge, making it critical to reuce the amount of interferences. The placement policy in the DEC Alpha, for example, is such that, a ata cache location can generally be etermine from the ata virtual aress. Therefore, a stuy base on virtual aresses woul accurately escribe real cache behavior. In the remainer of the paper, the cache size is inicate by C S an the line size by L S. The unit size is 8 bytes, i.e. the size of a ouble-precision oating point ata. In all experiments, cache size is equal to 8-kbyte an line size is equal to 32-byte (the characteristics of the DEC Alpha ata cache), so C S = 1024 an L S = 4. Coes Let us now iscuss which types of coes are consiere. In numerical coes most ata trac occurs in o-loops, only these coe constructs are examine. Only array references are consiere because it is probable that other variables woul be store in registers if they are frequently use, an otherwise they woul inuce minimal perturbations of cache behavior. Loop Nests A loop nest is compose of n istinct loops, j i being the loop inex of the i th loop, an j n being the loop inex of the innermost loop. Column-major storage is assume, so, for example, the virtual aress of array reference A(j 1 ; j 2 ) is a 0 + N j 1 + j 2 where N is the leaing imension of array A an of the starting aress of array A (cf gure??). 3 Moeling numerical coes behavior 3.1 Restrictive hypotheses DO j 1 = 0; N 1? 1 DO j 2 = 0; N 2? 1. DO j n = 0; N n? 1. A( A 1 j k1 + 1 A ; : : :; A p j kp + p A ) B( B 1 j k1 + 1 B ; : : : ; B p j kp + p B ).. Figure 1: An example of loop nest consiere. A number of restrictions are impose on the loop nests consiere. First, the array subscripts must all be of the form A( A 1 j k1 +A 1 ; : : : ; A p j kp + A p ) where (j ki ) 1ip (with p n) is any subset of 4

5 loop inices, an ( A) i 1ip an ( A) i 1ip are constants. For example, subscripts such as A(j 1 +j 2 ) are not consiere. A close analysis of benchmark suites such as the Perfect Club [?] or NAS [?] an stuies such as the one one by Yew an al. [?] show that such hypotheses encompass the subscripts foun within most numerical loop nests. Furthermore, in usty-eck coes, \irregular" subscripts often correspon to linearization, an therefore, in terms of memory reference patterns, are equivalent to those consiere within the scope of the moel. For all these reasons, these restrictions on array subscripts are consiere to be reasonable constraints. Moreover, further evelopments of the present moel may inclue more complex subscripts. Another moel hypothesis is that the bounaries of each loop inex must be constant (after normalization, 0 j i N i ) an the strie of all inices equal to 1. For any rectangular loop nest, the loop inices an array subscripts can be change so as to satisfy these hypotheses. However, the fact non-rectangular loops o not r these hypotheses. Since this point is relatively restrictive, further evelopments of the moel will mainly focus on incluing this kin of loop nests. 3.2 General principles If cache was fully-associative an replacement was optimal, cache misses woul occur in two cases only. First, when ata are loae in cache for the rst time; such misses are calle compulsory misses. Secon, when cache space is too small to store all loop nest ata. Then, an element is ushe from cache each time a new element nees to be loae; such misses are calle capacity misses. However, in irect-mappe caches (an in set-associative caches as well), cache misses can occur though cache space is sucient, because a ata element can only be mappe into one specic cache location (or w locations in w-way associative caches). Therefore, such cache misses o not occur because of capacity conicts, but because of mapping conicts; they are calle mapping misses [?]. The main eect of such unexpecte misses is to egrae the spatial an temporal reuse of ata. Interferences can either correspon to interferences of an array with itself (self-interferences), or with another array (cross-interferences). In the remainer of the paper, these two types of interferences are analyze. For self-interferences the principle is to stuy the mapping of the set of elements of an array to be reuse, an check whether these elements overlap with themselves. If so, self-interferences occur, an estimating the egree of overlapping yiels the number of aitional memory accesses brought by self-interferences. In general, self-interferences mostly correspon to temporal interferences. For cross-interferences, once the set of elements of an array to be reuse is ientie (taking into account self-interferences), the overlapping between these elements an elements of another array is etermine. Knowing the number of times the two sets of elements overlap, an the amount of overlapping each time, is sucient to compute the number of aitional memory accesses brought by cross-interferences. Cross-interferences can correspon to either temporal or spatial interferences. 3.3 Self-interferences Theoretical reuse set Thanks to the subscript types consiere, it is easy to ientify where reuse ue to self-epenences occurs. If coecient a i = 0 in the virtual aress expression a 0 + a 1 j 1 + : : : + a n j n of a reference to array A, then loop i carries reuse. Let us ene l as the lowest loop level where reuse occurs. On all loops k with k > l, no reuse occurs. So, on each iteration of these loops, the array elements reference are all istinct. This set of elements is calle the reuse set. During each execution of the sub loop nest j n ; : : : ; j l+1, all elements of the reuse set are reference. 5

6 Denition For array reference a 0 + a 1 j 1 + : : :+ a n j n, the reuse set can be ene if there exists a k such that a k = 0. The loop level of a reuse set is l = max fk=a k = 0g. The theoretical reuse set is equal to RS(A) l = fa 0 + a 1 j 1 + : : : + a n j n ; (0 j i N i? 1) i>l g. Reuse can occur on ierent loop levels. However, reuse that is carrie on loop levels higher than l is at least one orer of magnitue less important than the reuse on loop l. For example, let us consier the following loop nest: DO J1 = 0, N1-1 DO J2 = 0, N2-1 DO J3 = 0, N3-1 DO J4 = 0, N Y(J1,J3)..... For reference Y (j 1 ; j 3 ), reuse occurs on loop levels 4 an 2; here l = 4. On one execution of loop j 4, the array element Y (j 1 ; j 3 ) can be reuse N 4? 1 times (except for the rst iteration of loop j 4 ). So, uring execution of the whole loop nest, N 1 N 2 N 3 (N 4? 1) reuses of elements of Y can be achieve on loop j 4. On one execution of loop j 2, Y (j 1 ; 0); : : :; Y (j 1 ; N 3? 1) can be reuse N 2? 1 times. So, uring execution of the whole loop nest, N 1 (N 2? 1) N 3 reuses can be achieve on j 2. Consequently, the potential reuse on loop j 4 is approximately N 4 times more important than the potential reuse on loop j 2. Furthermore, the time interval between two reuses on loop j 4 is minimum, i.e. it is equal to one iteration of loop j 4, while N 4 N 3 iterations of loop j 4 are execute between two reuses on loop j 2. Therefore, not only reuse is more scarce on loop j 2, but the probability it can be exploite is also much smaller. That is why only the rst reuse level is consiere in general. However, when some coecients consecutive to a l (i.e. a l?1 ; a l?2 ; : : :) are also equal to 0, reuse on these loop levels is consiere to still be achieve (with respect to the reuse set, it is the same as if the bounary of loop l were N l N l?1 N l?2 : : : instea of N l ). Note that if a l = a l?1 = a l?2 = : : : = a k = 0, the reuse set on loop k is the same as on loop l. Size of the theoretical reuse set Let us compute the number of cache lines corresponing to the theoretical reuse set assuming no self-interferences. First, let us etermine the cache line strie of the reuse set of a reference A (where reuse occurs on loop l). It is equal to min(1; ), where = min l+1kn (a k ). This term is the ratio of the smallest coecient (only consiering loop levels ening the reuse set) to the line size. If this ratio is greater than 1, then it is necessarily equal to 1, since at most one new cache line is reference on each iteration. For example, the access 6

7 3 strie of reference A(3j 3 ) is min(1; ). Let us consier another example. The virtual aress of reference A(j 3 ; B j 2 ) is a 0 + N j 3 + Bj 2 where N is the leaing imension of A (here reuse occurs on loop level 1). Then the access strie of the reuse set is min(1; min(n;b) ). Then, the number of cache lines in the theoretical reuse set is equal to N n : : : N l+1 acces strie. Actual reuse set Because self-interferences occur, not all elements of the reuse set can actually be reuse. In irect-mappe caches, as soon as two elements of the reuse set compete for the same cache line, none of the two elements can be reuse: they are victim of self-interferences. The elements of the reuse set not victim of self-interferences belong the actual reuse set. The actual reuse set is the set of cache lines of the theoretical reuse set where no self-interferences occur. Characterizing self-interferences is equivalent to etermining the actual reuse set. An etermining the actual reuse set is equivalent to stuying the mapping of the theoretical reuse set in cache. To compute the overlapping within the theoretical reuse set of a reference A, the loops n to l + 1 are successively execute, starting with loop n. On each loop level k, the cache lines use by loops n; : : : ; k + 1, which are not victim of interferences, form a temporary reuse set calle RS(A) k. l On loop level k, the interferences between N k such temporary reuse sets RS(A) k+1 are etermine, l an the cache lines still not victim of interferences form the new temporary reuse set RS(A) k. l 1. Loop level n (reuse occurs on loop l): on this loop level, the reuse set is fa0 + a1j1 + :::anjn; 0 jn < Nng. Let a 0 n = a 0 +a 1 j 1 +: : :+a n?1 j n?1. A temporary reuse set RS(T ) n correspons to min(1; an l ) N n cache lines, starting at cache position a 0 n mo C S. If C S < min(1; an ) N n then capacity interferences occur, an the victim cache lines are remove from the temporary reuse set (cf example on Matrix-Vector multiply below). 2. Loop level n? 1: let a n?1 0 = a 0 + a 1 j 1 + : : : + a n?2 j n?2. The temporary reuse sets RS(A) n l start at cache positions a n?1 0 + a n?1 j n?1 mo C S. By checking whether two such temporary reuse sets interfere (epening on their relative cache positions) an evaluating the amount of overlapping, it is possible to compute the number of cache lines not victim of interferences. These cache lines correspon to reuse set RS(A) n?1 (cf example on Matrix-Matrix multiply l below). 3. All subsequent steps are ientical to step 2. The process stops on loop level l. 4. For each iteration of loop l, the number of cache lines of the theoretical reuse set victim of self-interferences is equal to Number of cache lines of the theoretical reuse set - Number of cache lines of the actual reuse set. So the number of aitional memory requests per iteration of loop l is equal to Number of cache lines of the theoretical reuse set - Number of cache lines of the actual reuse set. The total number of aitional memory requests is equal to N 1 : : : N l (Number of cache lines of the theoretical reuse set? Number of cache lines of the actual reuse set). This process is generally not too complex because few levels i have to be consiere (reuse sets of imension 1 or 2, i.e. epening on 1 or 2 loop levels, correspon to a majority of cases). Furthermore, very often the layout of sets in cache is straightforwar ue to the way array elements are reference (strie one access to arrays), making it relatively easy to evaluate self-interferences, an compute the actual reuse set. 7

8 Matrix-Vector multiply o-loop nest is the following Let us illustrate the previous notions with matrix-vector multiply. The DO J1 = 0, N-1 DO J2 = 0, N-1 Y(J1) += A(J1,J2) * X(J2) The linear function corresponing to each array reference is the following Y: 0j 2 + j 1 + y 0 X: j 2 + 0j 1 + x 0 A: j 2 + N j 1 + a 0 (where N is the leaing imension of array A) For array Y, only loop level 2 carries reuse, the theoretical reuse set is RS(Y ) 2 = fy (j 1 )g. For array X, the theoretical reuse set is RS(X) 1 = fx(0); : : :; X(N? 1)g. No reuse occurs for array A. N <= Cs Cs < N <= 2Cs Cache Elements of X Reusable elements of X 2Cs < N Figure 2: Mapping of elements of X into cache. The theoretical reuse set of Y is equal to the actual reuse set of Y an correspons to one cache line only; no self-interferences can occur. On the other han the theoretical reuse set of X contains N consecutive elements (the access strie is 1), or N cache lines. Therefore, if N C S, self-interferences occur. More precisely, if C S N 2C S, 2(N? C S ) elements of X are overlappe by other elements of X before they can be reuse. Therefore, only the reuse of 2C S? N elements of X can actually be exploite. So, the actual reuse set size is 2C S?N aitional memory requests are necessary on each iteration of loop j 1 because of self-interferences of X. If N > 2C S, no reuse occurs, the overlap between elements of X is total (cf gure??). This problem is well known in the omain of loop restructuring. It correspons to capacity interferences rather than mapping interferences. The most classic metho for ealing with it is to block the loop, so that the reuse set of X is smaller than cache size. However, a less obvious an. 2(N?C S) 8

9 known fact is that, even when the loop is blocke interferences can still occur. Let us consier the blocke version of Matrix-Matrix multiply. Blocke Matrix-Matrix multiply DO J1 = 0, N-1, Bs DO J2 = 0, N-1, Bs DO J3 = 0, N-1 DO J4 = J1, J1 + Bs-1 DO J5 = J2, J2 + Bs-1 C(J3,J5) += A(J3,J4) * B(J4,J5) where B S is the block size (for sake of simplicity it is assume B S ivies N). Matrix-Matrix multiply is generally blocke that way so as to keep in cache a submatrix B S B S of matrix B. Let us normalize the loop inices so that they t the moel hypotheses (constant bounaries an strie 1): j 5 is ene by j 5 = J5? J2 an similarly j 4 = J4? J1, j 1 = J1, BS j 2 = J2, an for all other i, j i = Ji. Then, the linear function corresponing to array B is BS j 5 + N B j 4 + 0j 3 + B S j 2 + B S N B j 1 + b 0, so its reuse set is RS(B) 3 = fj 5 + N B j 4 + j 2 + N B j 1 + b 0, 0 j 5 B S? 1; 0 j 4 B S? 1g, where N B is the leaing imension of array B. Let us consier what happens in cache when this block is reference. When j 5 varies between 0 an B S? 1, an interval of B S elements of B is mappe into consecutive cache locations, or B S cache lines. These cache lines correspon to the temporary reuse set RS(B) 5 3. Then, j 4 is increase by 1, an another interval of B S elements is loae into cache at a istance N B from the previous interval. Due to the nite size of the cache, the actual istance in cache between two consecutive intervals of size B S is N B mo C S (where mo is the moulo operator) instea of N B. Therefore epening on the value of N B mo C S, successive intervals can overlap in cache, thereby more or less egraing the potential reuse gaine from blocking (all the cache lines of the subintervals of size B S constitute the temporary reuse set RS(B) 4 3). For example, if 0 N B mo C S < B S, two consecutive intervals overlap by B S? (N B mo C S ) cache locations or B S?(NB mo CS) cache lines (cf gure??). Since an interval has two neighbors, each interval overlaps by 2 B S?(NB mo CS ) cache lines with its neighbors (except for the rst an last interval which have only one neighbor an consequently overlap by B S?(NB mo CS) cache lines only). So, in each interval, only B S?2(BS?(NB mo CS )) cache lines are not victim of self-interferences (except for the rst an last interval, where B S?(BS?NB mo CS ) cache lines are not victim of self-interferences). Therefore, the actual reuse set only correspons to (B S? 2) BS?2(BS?(NB mo CS)) +2 B S?(BS?NB mo CS ) = (B S?2)(2(NB mo CS )?BS)+2(NB mo CS ) lines, while the theoretical reuse set (assuming no self-interferences) correspons to B S B S cache cache lines. Therefore the self-interferences of array B bree B2 S? (B S?2)(2(NB mo CS )?BS)+2(NB mo CS) 9

10 aitional memory requests per iteration of loop 3, an N 1 N 2 N 3 (B S?2)(2(NB mo CS )?BS)+2(NB mo CS) aitional memory requests in total. Paing can be use for matrix B so that B S = N B mo C S (where N B is the leaing imension of matrix B). In that case, placement of intervals in cache is optimum (cf gure??). Conclusions For moeling purposes, such examples stress the fact that it is critical to properly evaluate mapping of array elements in cache To obtain accurate evaluation of self-interference misses (cf gure??). On a broaer scope, it shows that precise analysis of interferences can be valuable for coe optimizations such as loop restructuring since part or all the benets of classic loop restructuring can be lost because of cache interferences. Moreover, such phenomena can appear frequently an with varying importance (in our example, epening on the parameter N B mo C S, interferences can be maximum when N B mo C S = 0 or minimum when N B mo C S = B S ). N mo Cs = 0 N mo Cs = Bs Cache Elements of B (or D) Reusable elements of B (or D) N mo Cs = 3Bs/4 Actual Reuse Set of B (or D) Actual Interference Set of B (or D) Bs Figure 3: Mapping of elements of B into cache. 1 Hit ratio of array B NB (leaing imension of array B) Figure 4: Inuence of parameter N B on self-interferences. 3.4 Cross-interferences Two main cases of cross-interferences can occur between two references: either the ierence between the corresponing two virtual aresses is constant (inepenent of loop inices), or it 10

11 varies with the loop inices. These two cases must be istinguishe because such cross-interferences are very ierent. In the rst case, the two references are in translation, they always overlap an the amount of overlapping is constant, while in the secon case the two references overlap only perioically an the amount of overlapping varies. So, etecting an estimating cross-interferences in the rst case basically amounts to comparing the constant parameter of the two virtual aresses (it generally epens on arrays imensions an base aress), while in the secon case, the relative movement of the two references must be analyze. The rst type of interferences is calle internal cross-interferences (because a set of references in translation constitutes a kin of class of references, an such cross-interferences then occur within a class), an the secon type of interferences is calle external cross-interferences (interferences among two references not belonging to the same class) Estimating cross-interferences Computing the impact of cross-interferences between two references amounts to estimating how much of the reuse of one reference is lost because of cross-interferences with the other reference. So, let us consier two references R 1 ; R 2, an compute the impact of R 2 on the reuse of R 1. First, the set of elements R 1 can reuse must be estimate; it is the reuse set ene in section??. Secon, the set of elements of R 2 that can interfere with R 1 must be estimate as well; this set is calle the interference set. Because the reuse set is compute on a given loop level l, the interference set shoul be compute on the same loop level. Recall, that below loop l, no reuse can occur for R 1. Therefore, the number of aitional memory requests for R 1 ue to cross-interferences with R 2, on each reutilization of the reuse set (i.e. on each iteration of loop l), is exactly equal to the number of cache lines use by both the reuse set an the interference set. This notion is funamental in the computation of cross-interferences. Computing interferences this way allows to make abstraction of time consierations, i.e. when interferences occur. It is sucient to estimate the intersection between the set of cache lines corresponing to the reuse set an the interference set. It is important to note that, in the following sections, the reuse set consiere is the actual reuse set, otherwise cross-interferences woul be counte where self-interferences alreay occur, resulting in an overestimate of aitional memory requests. The interference set The enition of the theoretical interference set is the same as that of the theoretical reuse set. Denition For array reference a 0 + a 1 j 1 + : : : + a n j n, the theoretical interference set, on loop level l (this loop level is etermine by the victim reuse set), is equal to IS(A) l = fa 0 + a 1 j 1 + : : : + a n j n ; (0 j i N i? 1) i>l g. Moreover, etermining the actual interference set is one much the same way as for the actual reuse set. The actual reuse set is the subset of cache lines of the theoretical reuse set where no self-interferences occur, while the actual interference set is simply the set of cache lines use by the theoretical interference set. So if a cache line of the theoretical interference set is victim of self-interferences, this cache line is still counte in the actual interference set (while such a cache line is rejecte from the actual reuse set). Intuitively, the actual interference set correspons to the cache surface (the number of cache lines) use by the theoretical interference set. The amount of cross-interferences is irectly correlate to the size of the actual interference set (the larger the set, the higher the probability of overlapping with the reuse set). The process for etermining the actual reuse set is the following: 11

12 1. Loop level n (for the reuse set, reuse occurs on loop l): on this loop level, the interference set is fa 0 + a 1 j 1 + : : : a n j n ; 0 j n < N n g. Let a n 0 = a 0 + a 1 j 1 + : : : + a n?1 j n?1. A temporary interference set IS(T ) n correspons to min(1; an ) N l n cache lines, starting at cache position a n 0 mo C S. If C S < min(1; an ) N n then capacity interferences occur, an the cache lines use twice or more are only counte once. 2. Loop level n? 1: let a n?1 0 = a 0 + a 1 j 1 + : : : + a n?2 j n?2. The temporary interference sets IS(A) n starts at cache positions a n?1 l 0 + a n?1 j n?1 mo C S. So, by checking whether two such temporary interference sets interfere (epening on their relative cache positions) an evaluating the amount of overlapping, it is possible to compute the number of cache lines corresponing to the union of the two intervals. The union of the cache lines of all temporary interference sets IS(A) n is the temporary interference set IS(A) n?1. l l 3. All subsequent steps are ientical to step 2. The process stops on loop level l. Let us consier the same example use for illustrating the computation of the actual reuse set, i.e. Matrix-Matrix multiply, except that reference B(j 4 ; j 5 ) is replace by B(j 4 ; j 5 ) + D(j 4 ; j 5 ). Then, computing cross-interferences on B ue to D implies computing the interference set of D on loop 3 (since reuse occurs on loop 3 for B). Then, as for array B, if 0 N D mo C S < B S, each interval of size B S of D overlaps by 2(B S?(ND mo CS)) cache lines with its two neighbor intervals (except for the rst an the last interval). However, in opposition to the actual reuse set, these cache lines are still counte in the actual interference set. Because of this overlapping, the size of mo CS)+BS the actual interference set is (B S?1)(ND cache lines instea of B2 S (cf gure??). In opposition to the reuse set, it is preferable that overlapping occurs within the interference set, because the larger the overlapping within a theoretical interference set, the smaller the corresponing actual interference set, an the less likely the actual interference set overlaps with cache lines of the actual reuse set. Therefore, the optimal case is obtaine for N D mo C S = 0 (note that it correspons to the worse case for the reuse set of D; cf section??) Internal cross-interferences Internal cross-interferences occur between two references in translation. Thanks to that property the relative cache position between the reuse set of the victim reference an the interference set of the interfering reference is always the same. Therefore, if the reuse set is ene on loop level l (reuse occurs on loop l), the total number of aitional memory requests ue to cross-interferences between the two references, is equal to the total number of iterations of loop l times the number of cache lines use by both the actual interference set an the actual reuse set. The process for computing internal cross-interferences on a reference R 1 ue to reference R 2 is the following: 1. Compute the actual reuse set of R 1 ; the reuse occurs on loop l. Compute the actual interference set of R 2 on loop l. 2. Compute the number of cache lines CL(R 1 ; R 2 ) use by both the actual reuse set an the actual interference set. This is one by simply comparing the relative starting position of both sets. 3. The number of total aitional memory requests ue to internal cross-interferences between R 1 an R 2 is equal to N 1 : : : N l CL(R 1 ; R 2 ). 12

13 Elementary linear algebra operation Let us consier the matrix operation A = X t Y where A is an N N matrix an X; Y are two vectors of imension N. Often, vectors X; Y are linear combination of several vectors. For example, if X = X 1 + X 2, the operation is A = (X 1 + X 2 ): t Y. The corresponing loop is the following DO J1 = 0, N1-1 DO J2 = 0, N2-1 A(J1,J2) = (X1(J2) + X2(J2))*Y(J1) We want to compute cross-interferences on X 1 ue to X 2. The virtual aresses of X 1 (j 1 ); X 2 (j 2 ) are x j 1 an x j 2. x j 2? (x j 2) = x 1 0? x 20, the ierence is constant so the two references are in translation. The cross-interferences are internal cross-interferences. For array X 1, reuse occurs on loop j 1 an the reuse set of X 1 correspons to N 2 consecutive cache lines. On this loop level, the interference set of X 2 correspons to N 2 consecutive cache lines. The cache istance between the two sets is equal to x 1 0? x 20 mo C S. If x 1 0? x 20 mo C S 2 [0; N 2? 1] S [C S? (N 2? 1); C S ], the two sets overlap (cf gure??). In that case, the amount of overlapping is equal to CL(X 1 ; X 2 ) = min((x 10?x 20 mo C S);CS?(x10?x 20 mo C S )) cache lines (cf gure??). Therefore, internal cross-interferences between X 1 an X 2 bring CL(X 1 ; X 2 ) aitional memory requests per iteration of loop j 1, or N 1 : : :N l CL(X 1 ; X 2 ) aitional memory requests in total (cf gure??). > N2 N N < N 2 Cache Elements of X1 Elements of X2 Elements victim of internal cross-interferences = (x1 - x2 ) mo Cs 0 0 Figure 5: Internal cross-interferences between X 1 an X 2. Spatial interferences Note that internal cross-interferences correspon to spatial interferences, only if the relative cache positions of the actual reuse set an the actual interference set is smaller than the line size (in the above example, ((x 1 0? x 20 mo C S) < L S ). So, spatial interferences have a low probability to occur. On the other han, when such a case happens, little or no spatial reuse can be achieve an no temporal reuse can be achieve also. These cases are extremely costly, they correspon to "ping-pong", i.e. when two arrays translate in cache an constantly compete for the same cache location. 13

14 Hit ratio Total X1 X Cache istance between X1 an X2 Figure 6: Impact of internal cross-interferences on the hit ratio (N 1 = 200; N 2 = 40). Group-epenence reuse In this paper, mostly reuse ue to self-epenences (a reference reuses itself) is analyze. However, the reuse ue to group-epenences (a reference reuses elements of another reference) can also be signicant, though it is in general less important than reuse ue to self-epenences. However, since the way internal cross-interferences perturbate group-epenence reuse is original, the phenomenon is worth being illustrate with an example. Example from FLO52 Let us consier the following simple example extracte from a version of a Perfect Club coe FLO52 [?]. DO 10 J1=2,N1 DO 10 J2=1,N2 XY(J1,J2) = X(J1-1,J2,1) - X(J1,J2,1) Array XY is actually a temporary array use for scalar expansion. The leaing imension of arrays XY an X can be consiere equal to N 2. In this case, there is no self-epenence on array X, but there is a group-epenence between references X(j 1 ; j 2 ; 1) an X(j 1? 1; j 2 ; 1) which benets to array reference X(j 1? 1; j 2 ; 1). Let us call R 1 the reference X(j 1? 1; j 2 ; 1) an R 2 the reference X(j 1 ; j 2 ; 1). Since reuse occurs on a given loop level (loop 1), it is possible to exten the enition of the reuse set to group-epenences (except the reuse set is not reuse by the reference itself). The reuse set of R 1 is RS(X) 1 2 = fx(j 1; 1; 1); : : :; X(j 1 ; N 2 ; 1)g. The epenence istance between the two references is equal to N 2. It is assume N 2 < C S so that the reuse set of R 1 ( N 2 cache lines) ts in cache. Now, array XY can have internal crossinterferences with reference R 1. However, the way such cross-interferences occur is not straightfor- war. The amount of overlapping is not equal to the number of cache lines use by both the reuse set of R 1 an the interference set of XY. Inee, internal cross-interferences occur in a boolean way. Depening on the cache istance x 0? xy 0 mo C S between the interference set of XY an the reuse set of R 1 two cases can occur. Either x 0? xy 0 mo C S 2 [C S? (N 2? 1); C S ] an the 14

15 interference set of XY ushes no element of the reuse set of R 1 because a given cache line is use by R 1 after it has been use by XY (cf gures?? an??). There is no aitional memory request. Or x 0? xy 0 mo C S 2 [0; N 2? 1] an the interference set of XY ushes all elements of the reuse set of R 1 because a given cache line is use by R 1 before it is use by XY. Therefore, elements reference by X(j 1 ; j 2 ; 1) are ushe before they can be reuse by X(j 1? 1; j 2 ; 1) (cf gures?? an??; note also that for x 0?xy 0 mo C S = 0 an x 0?xy 0 mo C S = N 2, reference XY inuces ping-pong phenomenon with respectively reference X(j 1 ; j 2 ; 1) an reference X(j 1?1; j 2 ; 1)). The total number of aitional memory requests is equal to the number of cache lines of the reuse set of X(j 1 ; j 2 ; 1) that woul have been reuse by X(j 1? 1; j 2 ; 1). Since N 2? 1 array elements woul have been reuse, the number of aitional memory requests per iteration of loop 1 is N 2?1, an the total number of aitional memory requests ue to internal cross-interferences is equal to (N 1? 1) N 2?1. N2 0 < < N2 Cache Elements of X(J1-1,J2,1) Elements of X(J1,J2,1) Elements of XY(J1,J2) N2 -N2 < < 0 Figure 7: Perturbation of group-epenence reuse of array X. Hit ratio Total X(J1,J2) X(J1-1,J2) XY (XY0-X0) Figure 8: Inuence of initial positions of X an XY on the total hit ratio; N 1 = N 2 = 100. There again, by simply changing a base aress (the base aress of array XY ), it is possible to eliminate cache interferences. 15

16 Conclusions Internal cross-interferences can be very signicant because they occur on each reutilization of the reuse set. They are as frequent an can be as important as self-interferences. The above examples illustrate the fact that cross-interferences can vary signicantly, though apparently ranomly. Consiering arrays base aress is critical for etecting an estimating internal crossinterferences External cross-interferences If two references R 1 ; R 2 are not in translation, the cross-interferences on R 1 ue to R 2 are calle external cross-interferences. Computing such interferences amounts to estimating the relative position of R 1 an R 2. Each time the reuse set of R 1 an the interference set of R 2 overlap, the amount of overlapping is estimate (the time unit here is one iteration of the loop where reuse occurs for R 1, i.e. loop l). The virtual aresses of references of R 1 ; R 2 are r0 1 + r1 1 j 1 + : : :+ r 1 j n n an r0 2 + r2 1 j 1 + : : :+ r 2 j n n. The positions of the interference set an the reuse set are etermine by loop inices j i with i l (loop inices j i with i > l etermine the sets themselves). Therefore, the relative aress istance of the two sets is (r0 1? r2 0 ) + (r1 1? r2 1 )j 1 + : : : + (r 1? l r2)j l l. Let r i = r 1? i r2 for i > 0, i r 0 = r0 1? r2 0 mo C S, an D = gc 1il (r i ). Then for any value of (j i ) 1il, there exists 2 Z such that r 0 + r 1 j 1 + : : : + r l j l = r 0 + D. Let = gc (D; C S ). The possible relative cache positions of the two references are necessarily of the form r 0 + mo C S. Therefore, RS(R 1 ) l an IS(R 2 ) l have only C S possible relative cache positions. Approximately on each iteration of reuse loop l, a new relative position is reache. Therefore, after C S iterations of loop l, all possible relative positions have been reache. Consequently, the relative "movement" of the two references is perioic, of perio C S. The total number of iterations of loop l execute is N 1 : : : N l, so there are N 1:::Nl C S perios. Since the perio is C S, for any interval of C S values of, r 0 + mo C S escribes all possible relative cache positions. Let I be one such interval. Then for any value of, the number of cache lines use by both the reuse set an the interference set, i.e. CL(R 1 ; R 2 ; ), is compute the same way internal cross-interferences are evaluate, i.e. by computing the beginning an the en of each set an then eucing their overlap. Then, the total number of aitional memory requests for one perio is P 2I CL(R1 ; R 2 ; ) an the total number of aitional memory requests is N 1:::Nl C S P 2I CL(R1 ; R 2 ; ). So the process for etermining external cross-interferences on R 1 ue to R 2 is the following 1. Determine the reuse set of R Determine. The relative cache positions of the two sets are r 0 + mo C S. The perio of the movement is C S. Any interval I of C S values of is picke. Compute CL(R1 ; R 2 ; ) for any 2 I. 3. The total number of aitional memory requests is equal to N 1:::Nl C S P 2I CL(R1 ; R 2 ; ). Blocke Matrix-Vector multiply Vector multiply Let us consier the following blocke version of Matrix- 16

17 DO J1 = 0, N-1, Bs DO J2 = 0, N-1 DO J3 = J1, J1 + Bs-1 Y(J2) += A(J2,J3) * X(J3) where B S is assume to ivie N for simplicity's sake. As for the example of blocke Matrix-Matrix multiply, new inices are ene: j 3 = J3? J1, j 1 = J1 an j 2 = J2. The linear function corresponing to the subscripts of X an A are the BS following X: j 3 + 0j 2 + B S j 1 + x 0 A: j 3 + N j 2 + B S j 1 + a 0 (where N is the leaing imension of array A) Let us consier how array A perturbates the actual reuse of array X in Matrix-Vector multiply. The reuse set RS(X) 2 of X is an interval of size B S cache lines (reuse occurs on loop BS 2). The corresponing interference set of A is an interval of cache lines also (IS(A) 2 = f0 + N j 2 + B S j 1 + a 0 ; : : : ; B S? 1 + N j 2 + B S j 1 + a 0 ; 0 j 2 N? 1g). The relative position of the two sets change on each iteration of j 2. It is assume B S < C S 2 so that RS(X) 2 an IS(A) 2 o not necessarily overlap in cache. The relative positions of A an X are j 3 + N j 2 + B S j 1 + a 0? (j 3 + B S j 1 + x 0 ) = a 0? x 0 + N j 2. Let r 0 = a 0? x 0 mo C S. So D = gc (N) = N an = gc (N; C S ), an there are C S relative cache positions. Intuitively, stuying the relative cache positions of the two sets means that the reuse set is consiere to be xe, while the interference set is consiere to be moving in cache. General case Let us pick an interval I of C S values of such that? C S 2 r 0 + C S 2,? i.e. I = C S C S2 2?r0?r0 ;. Then, external cross-interferences occur when the beginning of the interference set, i.e. r 0 + is locate within the reuse set, i.e. within [0; B S ], or when the en of the interference set, i.e. r B mo C S is locate within the reuse set, i.e. [0; B S ]. This can be expresse by the constraints?b S r 0 + B S h. Therefore, it isi possible to ene I (B S ), the subinterval of I where overlapping occurs I (B S ) =. For any value of 2 I (B S ),?BS?r0 ; B S?r0 the overlapping is CL(X; A; ) = jb S?(r0+)j cache lines, an the number of aitional memory requests per perio is P 2I(BS) CL(X; A; ). The total number of aitional memory requests is N 2 P 2I(BS) CL(X; A; ). BS C S Spatial interferences Estimating spatial interferences is one exactly the same way as for temporal interferences, except that spatial interferences occur only if?l S r 0 + L S instea of?b S r 0 + B S for temporal interferences. The number of aitional memory requests ue to spatial interferences is equal to CL s (X; A; ) = B S (jl S? 1? (r 0 + )j) per value of (note that 17

18 L S?1 is use instea of L S because the rst reference to the line is consiere to be a temporal reuse, not a spatial reuse). As above, an interval I (L S ) of values of where spatial interferences occur can be ene. Then, the number of aitional memory requests per perio is P 2I() CL s(x; A; ). N The total number of aitional memory requests is 2 P BS BS C S 2I() CL s(x; A; ). Note that spatial interferences occur very few times (or none) over a perio. So, though there are numerous aitional memory requests for each value of where spatial interferences occur, there are few such values of so that the total number of spatial interferences is small in general. Particular cases Depening on the values of r 0 an many ierent situations can occur. If B S <, there are "holes" between the ierent possible cache positions of the interference set. One such hole has size? B S, so that if > 2B S, the reuse set can t entirely within such a hole. In that case, no external cross-interferences woul occur between the two arrays. Let us consier the case = C S, i.e. there is only one possible cache position for the interference set. Then if r 0 B S, no external cross-interference can occur, while if r 0 < B S, external cross-interferences occur on each iteration of the reuse loop j 2 (cf gure??). In that case, B S?r0 cache lines overlap, so aitional memory requests are ue to such external cross-interferences. The worse case correspons to r 0 = 0, total overlapping occurs, not only temporal locality but also spatial locality cannot be exploite. that a total of N 2 BS B S?r0 r0 small r0 r0 large Bs Cache Elements of A Elements of X Reusable elements of X r0 Figure 9: Cross-interferences between A an X (N mo C S = 0). Conclusions In opposition to internal cross-interferences, external cross-interferences occur perioically an with varying importance. Therefore etecting an estimating external crossinterferences is more icult. Still, a precise estimate can be erive by stuying such interferences over a perio. It is possible to compute the perio of interferences an the number of cross-interferences over a perio. Then, the total number of external cross-interferences is equal to the number of perios times the external cross-interferences over one perio. Though, external cross-interferences operate in a more irregular manner than internal cross-interferences, they can still be very amaging. 3.5 Interferences with multiple array references The previous section provies a metho for estimating how much an array can perturbate the potential reuse of another array. Now, when the reuse of an array is perturbate by several 18

Inuence of Cross-Interferences on Blocked Loops: to know the precise gain brought by blocking. It is even dicult to determine for which problem

Inuence of Cross-Interferences on Blocked Loops: to know the precise gain brought by blocking. It is even dicult to determine for which problem Inuence of Cross-Interferences on Blocke Loops A Case Stuy with Matrix-Vector Multiply CHRISTINE FRICKER INRIA, France an OLIVIER TEMAM an WILLIAM JALBY University of Versailles, France State-of-the art

More information

Learning Polynomial Functions. by Feature Construction

Learning Polynomial Functions. by Feature Construction I Proceeings of the Eighth International Workshop on Machine Learning Chicago, Illinois, June 27-29 1991 Learning Polynomial Functions by Feature Construction Richar S. Sutton GTE Laboratories Incorporate

More information

Almost Disjunct Codes in Large Scale Multihop Wireless Network Media Access Control

Almost Disjunct Codes in Large Scale Multihop Wireless Network Media Access Control Almost Disjunct Coes in Large Scale Multihop Wireless Network Meia Access Control D. Charles Engelhart Anan Sivasubramaniam Penn. State University University Park PA 682 engelhar,anan @cse.psu.eu Abstract

More information

Online Appendix to: Generalizing Database Forensics

Online Appendix to: Generalizing Database Forensics Online Appenix to: Generalizing Database Forensics KYRIACOS E. PAVLOU an RICHARD T. SNODGRASS, University of Arizona This appenix presents a step-by-step iscussion of the forensic analysis protocol that

More information

William S. Law. Erik K. Antonsson. Engineering Design Research Laboratory. California Institute of Technology. Abstract

William S. Law. Erik K. Antonsson. Engineering Design Research Laboratory. California Institute of Technology. Abstract Optimization Methos for Calculating Design Imprecision y William S. Law Eri K. Antonsson Engineering Design Research Laboratory Division of Engineering an Applie Science California Institute of Technology

More information

Computer Organization

Computer Organization Computer Organization Douglas Comer Computer Science Department Purue University 250 N. University Street West Lafayette, IN 47907-2066 http://www.cs.purue.eu/people/comer Copyright 2006. All rights reserve.

More information

Coupling the User Interfaces of a Multiuser Program

Coupling the User Interfaces of a Multiuser Program Coupling the User Interfaces of a Multiuser Program PRASUN DEWAN University of North Carolina at Chapel Hill RAJIV CHOUDHARY Intel Corporation We have evelope a new moel for coupling the user-interfaces

More information

Indexing the Edges A simple and yet efficient approach to high-dimensional indexing

Indexing the Edges A simple and yet efficient approach to high-dimensional indexing Inexing the Eges A simple an yet efficient approach to high-imensional inexing Beng Chin Ooi Kian-Lee Tan Cui Yu Stephane Bressan Department of Computer Science National University of Singapore 3 Science

More information

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed. Preface Here are my online notes for my Calculus I course that I teach here at Lamar University. Despite the fact that these are my class notes, they shoul be accessible to anyone wanting to learn Calculus

More information

Random Clustering for Multiple Sampling Units to Speed Up Run-time Sample Generation

Random Clustering for Multiple Sampling Units to Speed Up Run-time Sample Generation DEIM Forum 2018 I4-4 Abstract Ranom Clustering for Multiple Sampling Units to Spee Up Run-time Sample Generation uzuru OKAJIMA an Koichi MARUAMA NEC Solution Innovators, Lt. 1-18-7 Shinkiba, Koto-ku, Tokyo,

More information

Generalized Edge Coloring for Channel Assignment in Wireless Networks

Generalized Edge Coloring for Channel Assignment in Wireless Networks Generalize Ege Coloring for Channel Assignment in Wireless Networks Chun-Chen Hsu Institute of Information Science Acaemia Sinica Taipei, Taiwan Da-wei Wang Jan-Jan Wu Institute of Information Science

More information

Loop Scheduling and Partitions for Hiding Memory Latencies

Loop Scheduling and Partitions for Hiding Memory Latencies Loop Scheuling an Partitions for Hiing Memory Latencies Fei Chen Ewin Hsing-Mean Sha Dept. of Computer Science an Engineering University of Notre Dame Notre Dame, IN 46556 Email: fchen,esha @cse.n.eu Tel:

More information

Non-homogeneous Generalization in Privacy Preserving Data Publishing

Non-homogeneous Generalization in Privacy Preserving Data Publishing Non-homogeneous Generalization in Privacy Preserving Data Publishing W. K. Wong, Nios Mamoulis an Davi W. Cheung Department of Computer Science, The University of Hong Kong Pofulam Roa, Hong Kong {wwong2,nios,cheung}@cs.hu.h

More information

Skyline Community Search in Multi-valued Networks

Skyline Community Search in Multi-valued Networks Syline Community Search in Multi-value Networs Rong-Hua Li Beijing Institute of Technology Beijing, China lironghuascut@gmail.com Jeffrey Xu Yu Chinese University of Hong Kong Hong Kong, China yu@se.cuh.eu.h

More information

Improving Performance of Sparse Matrix-Vector Multiplication

Improving Performance of Sparse Matrix-Vector Multiplication Improving Performance of Sparse Matrix-Vector Multiplication Ali Pınar Michael T. Heath Department of Computer Science an Center of Simulation of Avance Rockets University of Illinois at Urbana-Champaign

More information

Particle Swarm Optimization Based on Smoothing Approach for Solving a Class of Bi-Level Multiobjective Programming Problem

Particle Swarm Optimization Based on Smoothing Approach for Solving a Class of Bi-Level Multiobjective Programming Problem BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 17, No 3 Sofia 017 Print ISSN: 1311-970; Online ISSN: 1314-4081 DOI: 10.1515/cait-017-0030 Particle Swarm Optimization Base

More information

Transient analysis of wave propagation in 3D soil by using the scaled boundary finite element method

Transient analysis of wave propagation in 3D soil by using the scaled boundary finite element method Southern Cross University epublications@scu 23r Australasian Conference on the Mechanics of Structures an Materials 214 Transient analysis of wave propagation in 3D soil by using the scale bounary finite

More information

Learning Subproblem Complexities in Distributed Branch and Bound

Learning Subproblem Complexities in Distributed Branch and Bound Learning Subproblem Complexities in Distribute Branch an Boun Lars Otten Department of Computer Science University of California, Irvine lotten@ics.uci.eu Rina Dechter Department of Computer Science University

More information

Offloading Cellular Traffic through Opportunistic Communications: Analysis and Optimization

Offloading Cellular Traffic through Opportunistic Communications: Analysis and Optimization 1 Offloaing Cellular Traffic through Opportunistic Communications: Analysis an Optimization Vincenzo Sciancalepore, Domenico Giustiniano, Albert Banchs, Anreea Picu arxiv:1405.3548v1 [cs.ni] 14 May 24

More information

A Plane Tracker for AEC-automation Applications

A Plane Tracker for AEC-automation Applications A Plane Tracker for AEC-automation Applications Chen Feng *, an Vineet R. Kamat Department of Civil an Environmental Engineering, University of Michigan, Ann Arbor, USA * Corresponing author (cforrest@umich.eu)

More information

Preamble. Singly linked lists. Collaboration policy and academic integrity. Getting help

Preamble. Singly linked lists. Collaboration policy and academic integrity. Getting help CS2110 Spring 2016 Assignment A. Linke Lists Due on the CMS by: See the CMS 1 Preamble Linke Lists This assignment begins our iscussions of structures. In this assignment, you will implement a structure

More information

Queueing Model and Optimization of Packet Dropping in Real-Time Wireless Sensor Networks

Queueing Model and Optimization of Packet Dropping in Real-Time Wireless Sensor Networks Queueing Moel an Optimization of Packet Dropping in Real-Time Wireless Sensor Networks Marc Aoun, Antonios Argyriou, Philips Research, Einhoven, 66AE, The Netherlans Department of Computer an Communication

More information

d 3 d 4 d d d d d d d d d d d 1 d d d d d d

d 3 d 4 d d d d d d d d d d d 1 d d d d d d Proceeings of the IASTED International Conference Software Engineering an Applications (SEA') October 6-, 1, Scottsale, Arizona, USA AN OBJECT-ORIENTED APPROACH FOR MANAGING A NETWORK OF DATABASES Shu-Ching

More information

Intensive Hypercube Communication: Prearranged Communication in Link-Bound Machines 1 2

Intensive Hypercube Communication: Prearranged Communication in Link-Bound Machines 1 2 This paper appears in J. of Parallel an Distribute Computing 10 (1990), pp. 167 181. Intensive Hypercube Communication: Prearrange Communication in Link-Boun Machines 1 2 Quentin F. Stout an Bruce Wagar

More information

Generalized Edge Coloring for Channel Assignment in Wireless Networks

Generalized Edge Coloring for Channel Assignment in Wireless Networks TR-IIS-05-021 Generalize Ege Coloring for Channel Assignment in Wireless Networks Chun-Chen Hsu, Pangfeng Liu, Da-Wei Wang, Jan-Jan Wu December 2005 Technical Report No. TR-IIS-05-021 http://www.iis.sinica.eu.tw/lib/techreport/tr2005/tr05.html

More information

Shift-map Image Registration

Shift-map Image Registration Shift-map Image Registration Linus Svärm Petter Stranmark Centre for Mathematical Sciences, Lun University {linus,petter}@maths.lth.se Abstract Shift-map image processing is a new framework base on energy

More information

Figure 1: Schematic of an SEM [source: ]

Figure 1: Schematic of an SEM [source:   ] EECI Course: -9 May 1 by R. Sanfelice Hybri Control Systems Eelco van Horssen E.P.v.Horssen@tue.nl Project: Scanning Electron Microscopy Introuction In Scanning Electron Microscopy (SEM) a (bunle) beam

More information

6 Gradient Descent. 6.1 Functions

6 Gradient Descent. 6.1 Functions 6 Graient Descent In this topic we will iscuss optimizing over general functions f. Typically the function is efine f : R! R; that is its omain is multi-imensional (in this case -imensional) an output

More information

Throughput Characterization of Node-based Scheduling in Multihop Wireless Networks: A Novel Application of the Gallai-Edmonds Structure Theorem

Throughput Characterization of Node-based Scheduling in Multihop Wireless Networks: A Novel Application of the Gallai-Edmonds Structure Theorem Throughput Characterization of Noe-base Scheuling in Multihop Wireless Networks: A Novel Application of the Gallai-Emons Structure Theorem Bo Ji an Yu Sang Dept. of Computer an Information Sciences Temple

More information

Using Vector and Raster-Based Techniques in Categorical Map Generalization

Using Vector and Raster-Based Techniques in Categorical Map Generalization Thir ICA Workshop on Progress in Automate Map Generalization, Ottawa, 12-14 August 1999 1 Using Vector an Raster-Base Techniques in Categorical Map Generalization Beat Peter an Robert Weibel Department

More information

UNIT 9 INTERFEROMETRY

UNIT 9 INTERFEROMETRY UNIT 9 INTERFEROMETRY Structure 9.1 Introuction Objectives 9. Interference of Light 9.3 Light Sources for 9.4 Applie to Flatness Testing 9.5 in Testing of Surface Contour an Measurement of Height 9.6 Interferometers

More information

On the Role of Multiply Sectioned Bayesian Networks to Cooperative Multiagent Systems

On the Role of Multiply Sectioned Bayesian Networks to Cooperative Multiagent Systems On the Role of Multiply Sectione Bayesian Networks to Cooperative Multiagent Systems Y. Xiang University of Guelph, Canaa, yxiang@cis.uoguelph.ca V. Lesser University of Massachusetts at Amherst, USA,

More information

Improving Spatial Reuse of IEEE Based Ad Hoc Networks

Improving Spatial Reuse of IEEE Based Ad Hoc Networks mproving Spatial Reuse of EEE 82.11 Base A Hoc Networks Fengji Ye, Su Yi an Biplab Sikar ECSE Department, Rensselaer Polytechnic nstitute Troy, NY 1218 Abstract n this paper, we evaluate an suggest methos

More information

AnyTraffic Labeled Routing

AnyTraffic Labeled Routing AnyTraffic Labele Routing Dimitri Papaimitriou 1, Pero Peroso 2, Davie Careglio 2 1 Alcatel-Lucent Bell, Antwerp, Belgium Email: imitri.papaimitriou@alcatel-lucent.com 2 Universitat Politècnica e Catalunya,

More information

Recitation Caches and Blocking. 4 March 2019

Recitation Caches and Blocking. 4 March 2019 15-213 Recitation Caches an Blocking 4 March 2019 Agena Reminers Revisiting Cache Lab Caching Review Blocking to reuce cache misses Cache alignment Reminers Due Dates Cache Lab (Thursay 3/7) Miterm Exam

More information

ACE: And/Or-parallel Copying-based Execution of Logic Programs

ACE: And/Or-parallel Copying-based Execution of Logic Programs ACE: An/Or-parallel Copying-base Execution of Logic Programs Gopal GuptaJ Manuel Hermenegilo* Enrico PontelliJ an Vítor Santos Costa' Abstract In this paper we present a novel execution moel for parallel

More information

CS 106 Winter 2016 Craig S. Kaplan. Module 01 Processing Recap. Topics

CS 106 Winter 2016 Craig S. Kaplan. Module 01 Processing Recap. Topics CS 106 Winter 2016 Craig S. Kaplan Moule 01 Processing Recap Topics The basic parts of speech in a Processing program Scope Review of syntax for classes an objects Reaings Your CS 105 notes Learning Processing,

More information

On the Placement of Internet Taps in Wireless Neighborhood Networks

On the Placement of Internet Taps in Wireless Neighborhood Networks 1 On the Placement of Internet Taps in Wireless Neighborhoo Networks Lili Qiu, Ranveer Chanra, Kamal Jain, Mohamma Mahian Abstract Recently there has emerge a novel application of wireless technology that

More information

Exercises of PIV. incomplete draft, version 0.0. October 2009

Exercises of PIV. incomplete draft, version 0.0. October 2009 Exercises of PIV incomplete raft, version 0.0 October 2009 1 Images Images are signals efine in 2D or 3D omains. They can be vector value (e.g., color images), real (monocromatic images), complex or binary

More information

Classifying Facial Expression with Radial Basis Function Networks, using Gradient Descent and K-means

Classifying Facial Expression with Radial Basis Function Networks, using Gradient Descent and K-means Classifying Facial Expression with Raial Basis Function Networks, using Graient Descent an K-means Neil Allrin Department of Computer Science University of California, San Diego La Jolla, CA 9237 nallrin@cs.ucs.eu

More information

BIJECTIONS FOR PLANAR MAPS WITH BOUNDARIES

BIJECTIONS FOR PLANAR MAPS WITH BOUNDARIES BIJECTIONS FOR PLANAR MAPS WITH BOUNDARIES OLIVIER BERNARDI AND ÉRIC FUSY Abstract. We present bijections for planar maps with bounaries. In particular, we obtain bijections for triangulations an quarangulations

More information

6.823 Computer System Architecture. Problem Set #3 Spring 2002

6.823 Computer System Architecture. Problem Set #3 Spring 2002 6.823 Computer System Architecture Problem Set #3 Spring 2002 Stuents are strongly encourage to collaborate in groups of up to three people. A group shoul han in only one copy of the solution to the problem

More information

P. Fua and Y. G. Leclerc. SRI International. 333 Ravenswood Avenue, Menlo Park, CA

P. Fua and Y. G. Leclerc. SRI International. 333 Ravenswood Avenue, Menlo Park, CA Moel Driven Ege Detection P. Fua an Y. G. Leclerc SI International 333 avenswoo Avenue, Menlo Park, CA 9425 (fua@ai.sri.com leclerc@ai.sri.com) Machine Vision an Applications, 3, 199 Abstract Stanar ege

More information

Study of Network Optimization Method Based on ACL

Study of Network Optimization Method Based on ACL Available online at www.scienceirect.com Proceia Engineering 5 (20) 3959 3963 Avance in Control Engineering an Information Science Stuy of Network Optimization Metho Base on ACL Liu Zhian * Department

More information

Waleed K. Al-Assadi. Anura P. Jayasumana. Yashwant K. Malaiya y. February Colorado State University

Waleed K. Al-Assadi. Anura P. Jayasumana. Yashwant K. Malaiya y. February Colorado State University Dierential I DDQ Testable Static RAM Architecture Walee K. Al-Assai Anura P. Jayasumana Yashwant K. Malaiya y Technical Report CS-96-102 February 1996 Department of Electrical Engineering/ y Department

More information

The Reconstruction of Graphs. Dhananjay P. Mehendale Sir Parashurambhau College, Tilak Road, Pune , India. Abstract

The Reconstruction of Graphs. Dhananjay P. Mehendale Sir Parashurambhau College, Tilak Road, Pune , India. Abstract The Reconstruction of Graphs Dhananay P. Mehenale Sir Parashurambhau College, Tila Roa, Pune-4030, Inia. Abstract In this paper we iscuss reconstruction problems for graphs. We evelop some new ieas lie

More information

Questions? Post on piazza, or Radhika (radhika at eecs.berkeley) or Sameer (sa at berkeley)!

Questions? Post on piazza, or  Radhika (radhika at eecs.berkeley) or Sameer (sa at berkeley)! EE122 Fall 2013 HW3 Instructions Recor your answers in a file calle hw3.pf. Make sure to write your name an SID at the top of your assignment. For each problem, clearly inicate your final answer, bol an

More information

Design of Policy-Aware Differentially Private Algorithms

Design of Policy-Aware Differentially Private Algorithms Design of Policy-Aware Differentially Private Algorithms Samuel Haney Due University Durham, NC, USA shaney@cs.ue.eu Ashwin Machanavajjhala Due University Durham, NC, USA ashwin@cs.ue.eu Bolin Ding Microsoft

More information

Design of Controller for Crawling to Sitting Behavior of Infants

Design of Controller for Crawling to Sitting Behavior of Infants Design of Controller for Crawling to Sitting Behavior of Infants A Report submitte for the Semester Project To be accepte on: 29 June 2007 by Neha Priyaarshini Garg Supervisors: Luovic Righetti Prof. Auke

More information

Yet Another Parallel Hypothesis Search for Inverse Entailment Hiroyuki Nishiyama and Hayato Ohwada Faculty of Sci. and Tech. Tokyo University of Scien

Yet Another Parallel Hypothesis Search for Inverse Entailment Hiroyuki Nishiyama and Hayato Ohwada Faculty of Sci. and Tech. Tokyo University of Scien Yet Another Parallel Hypothesis Search for Inverse Entailment Hiroyuki Nishiyama an Hayato Ohwaa Faculty of Sci. an Tech. Tokyo University of Science, 2641 Yamazaki, Noa-shi, CHIBA, 278-8510, Japan hiroyuki@rs.noa.tus.ac.jp,

More information

Software Reliability Modeling and Cost Estimation Incorporating Testing-Effort and Efficiency

Software Reliability Modeling and Cost Estimation Incorporating Testing-Effort and Efficiency Software Reliability Moeling an Cost Estimation Incorporating esting-effort an Efficiency Chin-Yu Huang, Jung-Hua Lo, Sy-Yen Kuo, an Michael R. Lyu -+ Department of Electrical Engineering Computer Science

More information

Performance Modelling of Necklace Hypercubes

Performance Modelling of Necklace Hypercubes erformance Moelling of ecklace ypercubes. Meraji,,. arbazi-aza,, A. atooghy, IM chool of Computer cience & harif University of Technology, Tehran, Iran {meraji, patooghy}@ce.sharif.eu, aza@ipm.ir a Abstract

More information

Table-based division by small integer constants

Table-based division by small integer constants Table-base ivision by small integer constants Florent e Dinechin, Laurent-Stéphane Diier LIP, Université e Lyon (ENS-Lyon/CNRS/INRIA/UCBL) 46, allée Italie, 69364 Lyon Ceex 07 Florent.e.Dinechin@ens-lyon.fr

More information

Bends, Jogs, And Wiggles for Railroad Tracks and Vehicle Guide Ways

Bends, Jogs, And Wiggles for Railroad Tracks and Vehicle Guide Ways Ben, Jogs, An Wiggles for Railroa Tracks an Vehicle Guie Ways Louis T. Klauer Jr., PhD, PE. Work Soft 833 Galer Dr. Newtown Square, PA 19073 lklauer@wsof.com Preprint, June 4, 00 Copyright 00 by Louis

More information

Shift-map Image Registration

Shift-map Image Registration Shift-map Image Registration Svärm, Linus; Stranmark, Petter Unpublishe: 2010-01-01 Link to publication Citation for publishe version (APA): Svärm, L., & Stranmark, P. (2010). Shift-map Image Registration.

More information

Multilevel Paging. Multilevel Paging Translation. Paging Hardware With TLB 11/13/2014. CS341: Operating System

Multilevel Paging. Multilevel Paging Translation. Paging Hardware With TLB 11/13/2014. CS341: Operating System CS341: Operating System Lect31: 21 st Oct 2014 Dr A Sahu Dept o Comp Sc & Engg Inian Institute o Technology Guwahati ain Contiguous Allocation, Segmentation, Paging Page Table an TLB Paging : Larger Page

More information

1 Surprises in high dimensions

1 Surprises in high dimensions 1 Surprises in high imensions Our intuition about space is base on two an three imensions an can often be misleaing in high imensions. It is instructive to analyze the shape an properties of some basic

More information

Verifying performance-based design objectives using assemblybased vulnerability

Verifying performance-based design objectives using assemblybased vulnerability Verying performance-base esign objectives using assemblybase vulnerability K.A. Porter Calornia Institute of Technology, Pasaena, Calornia, USA A.S. Kiremijian Stanfor University, Stanfor, Calornia, USA

More information

An Algorithm for Building an Enterprise Network Topology Using Widespread Data Sources

An Algorithm for Building an Enterprise Network Topology Using Widespread Data Sources An Algorithm for Builing an Enterprise Network Topology Using Wiesprea Data Sources Anton Anreev, Iurii Bogoiavlenskii Petrozavosk State University Petrozavosk, Russia {anreev, ybgv}@cs.petrsu.ru Abstract

More information

Multilevel Linear Dimensionality Reduction using Hypergraphs for Data Analysis

Multilevel Linear Dimensionality Reduction using Hypergraphs for Data Analysis Multilevel Linear Dimensionality Reuction using Hypergraphs for Data Analysis Haw-ren Fang Department of Computer Science an Engineering University of Minnesota; Minneapolis, MN 55455 hrfang@csumneu ABSTRACT

More information

SURVIVABLE IP OVER WDM: GUARANTEEEING MINIMUM NETWORK BANDWIDTH

SURVIVABLE IP OVER WDM: GUARANTEEEING MINIMUM NETWORK BANDWIDTH SURVIVABLE IP OVER WDM: GUARANTEEEING MINIMUM NETWORK BANDWIDTH Galen H Sasaki Dept Elec Engg, U Hawaii 2540 Dole Street Honolul HI 96822 USA Ching-Fong Su Fuitsu Laboratories of America 595 Lawrence Expressway

More information

Feature Extraction and Rule Classification Algorithm of Digital Mammography based on Rough Set Theory

Feature Extraction and Rule Classification Algorithm of Digital Mammography based on Rough Set Theory Feature Extraction an Rule Classification Algorithm of Digital Mammography base on Rough Set Theory Aboul Ella Hassanien Jafar M. H. Ali. Kuwait University, Faculty of Aministrative Science, Quantitative

More information

PART 2. Organization Of An Operating System

PART 2. Organization Of An Operating System PART 2 Organization Of An Operating System CS 503 - PART 2 1 2010 Services An OS Supplies Support for concurrent execution Facilities for process synchronization Inter-process communication mechanisms

More information

When Clusters Meet Partitions: Dennis J.-H. Huang and Andrew B. Kahng. UCLA Computer Science Department, Los Angeles, CA USA

When Clusters Meet Partitions: Dennis J.-H. Huang and Andrew B. Kahng. UCLA Computer Science Department, Los Angeles, CA USA When Clusters Meet Partitions: New Density-Base Methos for Circuit Decomposition Dennis J.-H. Huang an Anrew B. Kahng UCLA Computer Science Department, Los Angeles, CA 90024-596 USA jenhsin@cs.ucla.eu,

More information

Multimodal Stereo Image Registration for Pedestrian Detection

Multimodal Stereo Image Registration for Pedestrian Detection Multimoal Stereo Image Registration for Peestrian Detection Stephen Krotosky an Mohan Trivei Abstract This paper presents an approach for the registration of multimoal imagery for peestrian etection when

More information

Lecture 1 September 4, 2013

Lecture 1 September 4, 2013 CS 84r: Incentives an Information in Networks Fall 013 Prof. Yaron Singer Lecture 1 September 4, 013 Scribe: Bo Waggoner 1 Overview In this course we will try to evelop a mathematical unerstaning for the

More information

Finite Automata Implementations Considering CPU Cache J. Holub

Finite Automata Implementations Considering CPU Cache J. Holub Finite Automata Implementations Consiering CPU Cache J. Holub The finite automata are mathematical moels for finite state systems. More general finite automaton is the noneterministic finite automaton

More information

EFFICIENT ON-LINE TESTING METHOD FOR A FLOATING-POINT ADDER

EFFICIENT ON-LINE TESTING METHOD FOR A FLOATING-POINT ADDER FFICINT ON-LIN TSTING MTHOD FOR A FLOATING-POINT ADDR A. Droz, M. Lobachev Department of Computer Systems, Oessa State Polytechnic University, Oessa, Ukraine Droz@ukr.net, Lobachev@ukr.net Abstract In

More information

NAND flash memory is widely used as a storage

NAND flash memory is widely used as a storage 1 : Buffer-Aware Garbage Collection for Flash-Base Storage Systems Sungjin Lee, Dongkun Shin Member, IEEE, an Jihong Kim Member, IEEE Abstract NAND flash-base storage evice is becoming a viable storage

More information

Local Path Planning with Proximity Sensing for Robot Arm Manipulators. 1. Introduction

Local Path Planning with Proximity Sensing for Robot Arm Manipulators. 1. Introduction Local Path Planning with Proximity Sensing for Robot Arm Manipulators Ewar Cheung an Vlaimir Lumelsky Yale University, Center for Systems Science Department of Electrical Engineering New Haven, Connecticut

More information

Depth Sizing of Surface Breaking Flaw on Its Open Side by Short Path of Diffraction Technique

Depth Sizing of Surface Breaking Flaw on Its Open Side by Short Path of Diffraction Technique 17th Worl Conference on Nonestructive Testing, 5-8 Oct 008, Shanghai, China Depth Sizing of Surface Breaking Flaw on Its Open Sie by Short Path of Diffraction Technique Hiroyuki FUKUTOMI, Shan LIN an Takashi

More information

Selection Strategies for Initial Positions and Initial Velocities in Multi-optima Particle Swarms

Selection Strategies for Initial Positions and Initial Velocities in Multi-optima Particle Swarms ACM, 2011. This is the author s version of the work. It is poste here by permission of ACM for your personal use. Not for reistribution. The efinitive version was publishe in Proceeings of the 13th Annual

More information

Department of Computer Science, POSTECH, Pohang , Korea. (x 0 (t); y 0 (t)) 6= (0; 0) and N(t) is well dened on the

Department of Computer Science, POSTECH, Pohang , Korea. (x 0 (t); y 0 (t)) 6= (0; 0) and N(t) is well dened on the Comparing Oset Curve Approximation Methos Gershon er +, In-Kwon, an Myung-Soo Kim + Department of Computer Science, Technion, IIT, Haifa 32000, Israel Department of Computer Science, POSTECH, Pohang 790-784,

More information

Frequent Pattern Mining. Frequent Item Set Mining. Overview. Frequent Item Set Mining: Motivation. Frequent Pattern Mining comprises

Frequent Pattern Mining. Frequent Item Set Mining. Overview. Frequent Item Set Mining: Motivation. Frequent Pattern Mining comprises verview Frequent Pattern Mining comprises Frequent Pattern Mining hristian Borgelt School of omputer Science University of Konstanz Universitätsstraße, Konstanz, Germany christian.borgelt@uni-konstanz.e

More information

WLAN Indoor Positioning Based on Euclidean Distances and Fuzzy Logic

WLAN Indoor Positioning Based on Euclidean Distances and Fuzzy Logic WLAN Inoor Positioning Base on Eucliean Distances an Fuzzy Logic Anreas TEUBER, Bern EISSFELLER Institute of Geoesy an Navigation, University FAF, Munich, Germany, e-mail: (anreas.teuber, bern.eissfeller)@unibw.e

More information

X y. f(x,y,d) f(x,y,d) Peak. Motion stereo space. parameter space. (x,y,d) Motion stereo space. Parameter space. Motion stereo space.

X y. f(x,y,d) f(x,y,d) Peak. Motion stereo space. parameter space. (x,y,d) Motion stereo space. Parameter space. Motion stereo space. 3D Shape Measurement of Unerwater Objects Using Motion Stereo Hieo SAITO Hirofumi KAWAMURA Masato NAKAJIMA Department of Electrical Engineering, Keio Universit 3-14-1Hioshi Kouhoku-ku Yokohama 223, Japan

More information

Threshold Based Data Aggregation Algorithm To Detect Rainfall Induced Landslides

Threshold Based Data Aggregation Algorithm To Detect Rainfall Induced Landslides Threshol Base Data Aggregation Algorithm To Detect Rainfall Inuce Lanslies Maneesha V. Ramesh P. V. Ushakumari Department of Computer Science Department of Mathematics Amrita School of Engineering Amrita

More information

A fast embedded selection approach for color texture classification using degraded LBP

A fast embedded selection approach for color texture classification using degraded LBP A fast embee selection approach for color texture classification using egrae A. Porebski, N. Vanenbroucke an D. Hama Laboratoire LISIC - EA 4491 - Université u Littoral Côte Opale - 50, rue Ferinan Buisson

More information

MR DAMPER OPTIMAL PLACEMENT FOR SEMI-ACTIVE CONTROL OF BUILDINGS USING AN EFFICIENT MULTI- OBJECTIVE BINARY GENETIC ALGORITHM

MR DAMPER OPTIMAL PLACEMENT FOR SEMI-ACTIVE CONTROL OF BUILDINGS USING AN EFFICIENT MULTI- OBJECTIVE BINARY GENETIC ALGORITHM 24th International Symposium on on Automation & Robotics in in Construction (ISARC 2007) Construction Automation Group I.I.T. Maras MR DAMPER OPTIMAL PLACEMENT FOR SEMI-ACTIVE CONTROL OF BUILDINGS USING

More information

Architecture Design of Mobile Access Coordinated Wireless Sensor Networks

Architecture Design of Mobile Access Coordinated Wireless Sensor Networks Architecture Design of Mobile Access Coorinate Wireless Sensor Networks Mai Abelhakim 1 Leonar E. Lightfoot Jian Ren 1 Tongtong Li 1 1 Department of Electrical & Computer Engineering, Michigan State University,

More information

Politehnica University of Timisoara Mobile Computing, Sensors Network and Embedded Systems Laboratory. Testing Techniques

Politehnica University of Timisoara Mobile Computing, Sensors Network and Embedded Systems Laboratory. Testing Techniques Politehnica University of Timisoara Mobile Computing, Sensors Network an Embee Systems Laboratory ing Techniques What is testing? ing is the process of emonstrating that errors are not present. The purpose

More information

MORA: a Movement-Based Routing Algorithm for Vehicle Ad Hoc Networks

MORA: a Movement-Based Routing Algorithm for Vehicle Ad Hoc Networks : a Movement-Base Routing Algorithm for Vehicle A Hoc Networks Fabrizio Granelli, Senior Member, Giulia Boato, Member, an Dzmitry Kliazovich, Stuent Member Abstract Recent interest in car-to-car communications

More information

APPLYING GENETIC ALGORITHM IN QUERY IMPROVEMENT PROBLEM. Abdelmgeid A. Aly

APPLYING GENETIC ALGORITHM IN QUERY IMPROVEMENT PROBLEM. Abdelmgeid A. Aly International Journal "Information Technologies an Knowlege" Vol. / 2007 309 [Project MINERVAEUROPE] Project MINERVAEUROPE: Ministerial Network for Valorising Activities in igitalisation -

More information

Rough Set Approach for Classification of Breast Cancer Mammogram Images

Rough Set Approach for Classification of Breast Cancer Mammogram Images Rough Set Approach for Classification of Breast Cancer Mammogram Images Aboul Ella Hassanien Jafar M. H. Ali. Kuwait University, Faculty of Aministrative Science, Quantitative Methos an Information Systems

More information

A Cost Model For Nearest Neighbor Search. High-Dimensional Data Space

A Cost Model For Nearest Neighbor Search. High-Dimensional Data Space A Cost Moel For Nearest Neighbor Search in High-Dimensional Data Space Stefan Berchtol University of Munich Germany berchtol@informatikuni-muenchene Daniel A Keim University of Munich Germany keim@informatikuni-muenchene

More information

Comparison of Methods for Increasing the Performance of a DUA Computation

Comparison of Methods for Increasing the Performance of a DUA Computation Comparison of Methos for Increasing the Performance of a DUA Computation Michael Behrisch, Daniel Krajzewicz, Peter Wagner an Yun-Pang Wang Institute of Transportation Systems, German Aerospace Center,

More information

A Formal Model and Efficient Traversal Algorithm for Generating Testbenches for Verification of IEEE Standard Floating Point Division

A Formal Model and Efficient Traversal Algorithm for Generating Testbenches for Verification of IEEE Standard Floating Point Division A Formal Moel an Efficient Traversal Algorithm for Generating Testbenches for Verification of IEEE Stanar Floating Point Division Davi W. Matula, Lee D. McFearin Department of Computer Science an Engineering

More information

Classical Mechanics Examples (Lagrange Multipliers)

Classical Mechanics Examples (Lagrange Multipliers) Classical Mechanics Examples (Lagrange Multipliers) Dipan Kumar Ghosh Physics Department, Inian Institute of Technology Bombay Powai, Mumbai 400076 September 3, 015 1 Introuction We have seen that the

More information

All-to-all Broadcast for Vehicular Networks Based on Coded Slotted ALOHA

All-to-all Broadcast for Vehicular Networks Based on Coded Slotted ALOHA Preprint, August 5, 2018. 1 All-to-all Broacast for Vehicular Networks Base on Coe Slotte ALOHA Mikhail Ivanov, Frerik Brännström, Alexanre Graell i Amat, an Petar Popovski Department of Signals an Systems,

More information

Coordinating Distributed Algorithms for Feature Extraction Offloading in Multi-Camera Visual Sensor Networks

Coordinating Distributed Algorithms for Feature Extraction Offloading in Multi-Camera Visual Sensor Networks Coorinating Distribute Algorithms for Feature Extraction Offloaing in Multi-Camera Visual Sensor Networks Emil Eriksson, György Dán, Viktoria Foor School of Electrical Engineering, KTH Royal Institute

More information

THE BAYESIAN RECEIVER OPERATING CHARACTERISTIC CURVE AN EFFECTIVE APPROACH TO EVALUATE THE IDS PERFORMANCE

THE BAYESIAN RECEIVER OPERATING CHARACTERISTIC CURVE AN EFFECTIVE APPROACH TO EVALUATE THE IDS PERFORMANCE БСУ Международна конференция - 2 THE BAYESIAN RECEIVER OPERATING CHARACTERISTIC CURVE AN EFFECTIVE APPROACH TO EVALUATE THE IDS PERFORMANCE Evgeniya Nikolova, Veselina Jecheva Burgas Free University Abstract:

More information

Supporting Fully Adaptive Routing in InfiniBand Networks

Supporting Fully Adaptive Routing in InfiniBand Networks XIV JORNADAS DE PARALELISMO - LEGANES, SEPTIEMBRE 200 1 Supporting Fully Aaptive Routing in InfiniBan Networks J.C. Martínez, J. Flich, A. Robles, P. López an J. Duato Resumen InfiniBan is a new stanar

More information

A Comparative Evaluation of Iris and Ocular Recognition Methods on Challenging Ocular Images

A Comparative Evaluation of Iris and Ocular Recognition Methods on Challenging Ocular Images A Comparative Evaluation of Iris an Ocular Recognition Methos on Challenging Ocular Images Vishnu Naresh Boeti Carnegie Mellon University Pittsburgh, PA 523 naresh@cmu.eu Jonathon M Smereka Carnegie Mellon

More information

Divide-and-Conquer Algorithms

Divide-and-Conquer Algorithms Supplment to A Practical Guie to Data Structures an Algorithms Using Java Divie-an-Conquer Algorithms Sally A Golman an Kenneth J Golman Hanout Divie-an-conquer algorithms use the following three phases:

More information

On Effectively Determining the Downlink-to-uplink Sub-frame Width Ratio for Mobile WiMAX Networks Using Spline Extrapolation

On Effectively Determining the Downlink-to-uplink Sub-frame Width Ratio for Mobile WiMAX Networks Using Spline Extrapolation On Effectively Determining the Downlink-to-uplink Sub-frame With Ratio for Mobile WiMAX Networks Using Spline Extrapolation Panagiotis Sarigianniis, Member, IEEE, Member Malamati Louta, Member, IEEE, Member

More information

Investigation into a new incremental forming process using an adjustable punch set for the manufacture of a doubly curved sheet metal

Investigation into a new incremental forming process using an adjustable punch set for the manufacture of a doubly curved sheet metal 991 Investigation into a new incremental forming process using an ajustable punch set for the manufacture of a oubly curve sheet metal S J Yoon an D Y Yang* Department of Mechanical Engineering, Korea

More information

E2EM-X4X1 2M *2 E2EM-X4X2 2M Shielded E2EM-X8X1 2M *2 E2EM-X8X2 2M *1 M30 15 mm E2EM-X15X1 2M *2 E2EM-X15X2 2M

E2EM-X4X1 2M *2 E2EM-X4X2 2M Shielded E2EM-X8X1 2M *2 E2EM-X8X2 2M *1 M30 15 mm E2EM-X15X1 2M *2 E2EM-X15X2 2M Long-istance Proximity Sensor EEM CSM_EEM_DS_E_7_ Long-istance Proximity Sensor Long-istance etection at up to mm enables secure mounting with reuce problems ue to workpiece collisions. No polarity for

More information

Robust Camera Calibration for an Autonomous Underwater Vehicle

Robust Camera Calibration for an Autonomous Underwater Vehicle obust Camera Calibration for an Autonomous Unerwater Vehicle Matthew Bryant, Davi Wettergreen *, Samer Aballah, Alexaner Zelinsky obotic Systems Laboratory Department of Engineering, FEIT Department of

More information

FINDING OPTICAL DISPERSION OF A PRISM WITH APPLICATION OF MINIMUM DEVIATION ANGLE MEASUREMENT METHOD

FINDING OPTICAL DISPERSION OF A PRISM WITH APPLICATION OF MINIMUM DEVIATION ANGLE MEASUREMENT METHOD Warsaw University of Technology Faculty of Physics Physics Laboratory I P Joanna Konwerska-Hrabowska 6 FINDING OPTICAL DISPERSION OF A PRISM WITH APPLICATION OF MINIMUM DEVIATION ANGLE MEASUREMENT METHOD.

More information

Adjacency Matrix Based Full-Text Indexing Models

Adjacency Matrix Based Full-Text Indexing Models 1000-9825/2002/13(10)1933-10 2002 Journal of Software Vol.13, No.10 Ajacency Matrix Base Full-Text Inexing Moels ZHOU Shui-geng 1, HU Yun-fa 2, GUAN Ji-hong 3 1 (Department of Computer Science an Engineering,

More information