Loop Scheduling and Partitions for Hiding Memory Latencies

Fei Chen, Edwin Hsing-Mean Sha
Dept. of Computer Science and Engineering
University of Notre Dame, Notre Dame, IN

Abstract

Partition Scheduling with Prefetching (PSP) is a memory latency hiding technique that combines loop pipelining with data prefetching. In PSP, the iteration space is first divided into regular partitions. Then two parts of the schedule, the ALU part and the memory part, are produced and balanced to yield an overall schedule with high throughput. The two parts are executed simultaneously, so the remote memory latency is overlapped with computation. We study the optimal partition shape and size so that a well-balanced overall schedule can be obtained. Experiments on DSP benchmarks show that the proposed methodology consistently produces optimal or near-optimal solutions.

1 Introduction

Because CPU speed has increased dramatically compared with memory speed, slow memory hinders overall system performance. A well-planned data prefetching scheme can reduce the memory miss penalty by overlapping processor computations with memory access operations, achieving high-throughput computation. Multi-dimensional (MD) problems are of particular interest. These problems, which include a large number of DSP applications, are characterized by nested loops with uniform data dependencies. Loop pipelining techniques are widely used to expose instruction-level parallelism so that a good schedule with high throughput can be obtained.

In this paper, we develop a methodology called the Partition Scheduling with Prefetching (PSP) algorithm, which combines the loop pipelining technique with a data prefetching approach. This technique can be used in computation-intensive applications (especially multi-dimensional DSP applications) when a two-level memory hierarchy exists. The two levels are abstracted as the local memory and the remote memory.
We assume that accessing the remote memory takes longer than accessing the local memory. We also assume a processor contains multiple ALUs and multiple memory units. The ALUs perform the computations. The memory units are special hardware that we introduce to prefetch data from the remote memory into the local memory.

This work was partially supported by NSF MIP 95-6, NSF MIP

The partition (tiling) technique is incorporated into the PSP algorithm. We partition the whole iteration space and execute one partition at a time. The benefit is that data locality is improved within the partition, and therefore the number of prefetching operations is reduced. We study the legal partition shape and provide formulas for determining an optimized partition size that guarantees a balanced overall schedule. Furthermore, we estimate the local memory size required to execute such a partition. The estimate gives designers a good indication of the local memory requirement.

Traditional prefetching schemes [2, 5, 6] can be hardware or software based. They either make only dynamic prefetching decisions or do not give complete static schedules. Partitioning the iteration space was not considered in those approaches either. Several multi-dimensional loop pipelining techniques have been proposed. For example, Passos and Sha proved that in the multi-dimensional case (nested loops), full parallelism for MDFGs can always be achieved by using multi-dimensional retiming [4]. However, none of the above research efforts includes prefetching in its loop scheduling algorithm.

Experiments are done on many DSP benchmarks, and the results are compared with other scheduling algorithms, such as list scheduling and the PBS algorithm [1]. Experiments show that the average length obtained by PSP is 26.7% of the one using list scheduling and 6 9% of the PBS.
Since partitioning is not used in PBS, the results of our experiments also show that partitioning the iteration space is very important for optimizing the overall schedule.

2 Algorithm framework

Modeling the ALU computation

A nested loop can be represented by a multi-dimensional data flow graph (MDFG) [4]. An MDFG $G = (V, E, d, t)$ is a node- and edge-weighted directed graph, where $V$ is the set of computation nodes, $E \subseteq V \times V$ is the set of dependence edges, $d$ is a function from $E$ to $Z^n$ representing the multi-dimensional inter-iteration dependency (delay) between two nodes, where $n$ is the number of dimensions, and $t$ is a function from $V$ to the positive integers representing the computation time of each node.

Figure 1: (a) code of a nested loop; (b) the corresponding MDFG; (c) the retimed MDFG. The loop body of (a), with y as the outer loop variable (up to m) and x as the inner loop variable (up to n), is:

    v1[x][y] = v3[x-1][y-1] + 5;
    v2[x][y] = v3[x+1][y-1] * .2;
    v3[x][y] = v1[x][y] + v2[x][y] * 2;

Figure 2: (a) An illegal partition of the iteration space; (b) a legal partition.

One execution of all nodes in $V$ represents an iteration. An iteration is identified by a vector $\hat{j}$, which is equivalent to a multi-dimensional index starting from $(0, \ldots, 0)$. Inter-iteration dependencies are represented by vector-weighted edges. For any iteration $\hat{j}$, an edge $e$ from $u$ to $v$ with delay vector $d(e)$ means that the computation of node $v$ at iteration $\hat{j}$ depends on the execution of node $u$ at iteration $\hat{j} - d(e)$. An edge with delay vector $(0, \ldots, 0)$ represents a data dependency within the same iteration. Figure 1(a) is an example of a nested loop, with the corresponding MDFG in Figure 1(b).

We call a legal MDFG $G = (V, E, d, t)$ realizable, or implementable, if there exists a schedule vector $s$ for the MDFG such that $s \cdot d(e) \ge 0$ for any $e \in E$ [4]. This schedule vector $s$ is regarded as the normal vector of a set of parallel equitemporal hyperplanes; the iterations in the same hyperplane are executed in sequence. For example, in Figure 1(b), $s = (0, 1)$.

The multi-dimensional retiming technique [4] is used in our algorithm. In our study, the legal retiming vector $r$ is chosen as the base vector orthogonal to $s$. Using Figure 1(b) as an example, since $s = (0, 1)$, the base retiming vector $r$ can be $(1, 0)$.
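The schedule-vector condition and the effect of a retiming on delay vectors can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the edge list is read off the loop body of Figure 1 (so the delay values are an assumption), the retiming convention used is the common one where a retiming r(u) is added to the outgoing edges of u and subtracted from its incoming edges, and the choice of node v3 as the retimed node is hypothetical.

```python
from typing import Dict, List, Tuple

Vector = Tuple[int, int]
# An edge is (source node, destination node, delay vector d(e)).
Edge = Tuple[str, str, Vector]


def dot(a: Vector, b: Vector) -> int:
    """Inner product of two 2-D integer vectors."""
    return a[0] * b[0] + a[1] * b[1]


def is_legal_schedule_vector(s: Vector, edges: List[Edge]) -> bool:
    """A schedule vector s is accepted when s . d(e) >= 0 for every edge e."""
    return all(dot(s, d) >= 0 for _, _, d in edges)


def retime(edges: List[Edge], r: Dict[str, Vector]) -> List[Edge]:
    """Assumed convention: d_r(e) = d(e) + r(u) - r(v) for an edge u -> v."""
    zero = (0, 0)
    retimed = []
    for u, v, d in edges:
        ru, rv = r.get(u, zero), r.get(v, zero)
        retimed.append((u, v, (d[0] + ru[0] - rv[0], d[1] + ru[1] - rv[1])))
    return retimed


# Delay vectors assumed from the Figure 1 loop body:
# v1 reads v3[x-1][y-1], v2 reads v3[x+1][y-1], v3 reads v1 and v2 of the same iteration.
FIG1_EDGES: List[Edge] = [
    ("v3", "v1", (1, 1)),
    ("v3", "v2", (-1, 1)),
    ("v1", "v3", (0, 0)),
    ("v2", "v3", (0, 0)),
]

s = (0, 1)
# Hypothetical choice: retime node v3 by the base retiming vector (1, 0).
retimed_edges = retime(FIG1_EDGES, {"v3": (1, 0)})
```

Because the retiming vector is orthogonal to s, the products s . d(e) are the same before and after retiming, which is why s remains usable for the retimed graph.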
The graph after MD retiming is shown in Figure 1(c).

Partitioning the iteration space

Instead of executing the whole iteration space in order by rows or columns, we first partition it and then execute the partitions one by one. The two boundaries of a partition are called the partition vectors. We denote them by $P_x$ and $P_y$. Due to the dependencies in the MDFG, partition vectors cannot be chosen arbitrarily. For example, consider the iteration space in Figure 2, where dots represent iterations and vectors represent inter-iteration dependencies. If we partition the iteration space into rectangles, as shown in Figure 2(a), the partition is illegal because of the forward dependencies from one partition to its neighbor (the thin vectors) and the backward dependencies in the opposite direction (the bold vectors). Due to these two-way dependencies between partitions, we cannot execute either one first. This partition is therefore not implementable. In contrast, consider the alternative partition shown in Figure 2(b). Since there are no two-way dependencies, a feasible partition execution sequence exists: for example, execute Partition 1 first, then Partition 2, and so on. Therefore, it is a legal partition.

Architecture model

We assume a processor contains multiple ALUs and multiple special hardware units called memory units. Associated with the processor is a small local memory, which is fast to access. There is also a large remote memory, which is slow to access. The goal of our technique is to prefetch the operands of all computations into the local memory before the actual computations take place. These prefetching operations are performed by the memory units. Two types of prefetching instructions, prefetch and keep, are supported by the memory units. The prefetch instruction fetches data from the remote memory into the local memory; the keep instruction keeps data in the local memory for the execution of the next partition.
Both are issued to make sure that the data about to be referenced will be present in the local memory.

PSP algorithm framework

The schedule generated by PSP consists of two parts: the ALU part and the memory part. In the ALU part of the schedule, we first use the multi-dimensional rotation scheduling algorithm [3] to create the ALU schedule for one iteration. We then duplicate this one-iteration ALU schedule and append the copies consecutively to form the n-iteration ALU schedule (where n is the number of iterations in the partition). The memory part of the schedule is executed at the same time as the ALU part. It gives the global schedule of the memory operations that prefetch all the operands needed by the next partition into the local memory.

Figure 3: The overall schedule of a partition, assuming four ALU functional units and five memory units. The ALU part holds four iterations; the memory part consists of a prefetch part followed by a keep part.

We call the partition that is being executed the current partition, and the one that will be executed next the next partition. All other partitions that have not yet been executed are called other partitions (see Figure 4(c)). In the memory part scheduling, if a non-zero delay edge passes from the current partition into other partitions, a prefetch operation is needed. Each directed edge from the current partition to the next partition corresponds to a keep operation. The framework of our algorithm is given in Algorithm PSP.

Figure 3 gives an example of an overall schedule, showing the ALU part as well as the memory part. There are four iterations in one partition. In the ALU part, each iteration takes 4 control steps (CS) to finish, and hence all four iterations take len(ALU) = 16 control steps. In the memory part, all prefetch operations are scheduled from the top, followed by the keep operations. The length of the memory part of the schedule is len(mem) = 17. Since the two parts are executed simultaneously, the overall schedule length is the maximum of the two, len(overall) = max{len(ALU), len(mem)} = 17. Dividing the overall schedule length by the number of iterations in the partition gives the average schedule length, ave_len(overall) = 17/4 = 4.25.

3 PSP scheduling

We use the multi-dimensional rotation scheduling algorithm to schedule the ALU part for each iteration. Multi-dimensional rotation scheduling is a loop pipelining technique that implicitly uses the multi-dimensional retiming heuristic for scheduling cyclic graphs. Rotation scheduling is described in detail in [3].
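As a quick check of the Figure 3 example from Section 2, the schedule-length arithmetic can be written out as below. This is a minimal sketch: the function name is hypothetical, and the memory part is taken to be 17 control steps, as the control-step listing of Figure 3 suggests.

```python
def overall_schedule(iterations: int, alu_len_per_iteration: int, mem_len: int):
    """Return (len_alu, len_overall, average length per iteration).

    The ALU part repeats the one-iteration schedule once per iteration,
    and the ALU and memory parts run side by side, so the overall length
    is the longer of the two.
    """
    len_alu = iterations * alu_len_per_iteration
    len_overall = max(len_alu, mem_len)
    return len_alu, len_overall, len_overall / iterations


# Figure 3 example: 4 iterations of 4 control steps each, memory part of 17 steps.
len_alu, len_overall, ave_len = overall_schedule(4, 4, 17)
```

The average length is what the algorithm tries to reduce from one rotation to the next.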
Given an initial schedule, the rotation technique repeatedly transforms the ALU part of the schedule into a more compact one under the resource constraints.

Algorithm PSP: Partition Scheduling with Prefetching
Input: initial MDFG; ALU and memory constraints; execution times for ALU and memory operations.
Output: an optimal partition and the optimized ALU and memory schedules for executing the partition.

    /* use list scheduling to obtain the ALU schedule */
    S <- initial ALU part schedule for one iteration
    repeat
        /* reduce the length of the one-iteration ALU part of the schedule by MD rotation scheduling */
        S <- rotate the current schedule S
        G_r <- retimed MDFG
        /* decide the optimal partition shape and size */
        Obtain the legal partition directions P_x, P_y according to G_r
        Obtain the partition size so that the balancing property (Theorem 3) is satisfied
        /* produce the overall schedule */
        Number the iterations
        Entire ALU part scheduling
        Memory part scheduling
        /* evaluation */
        Calculate the average length of the overall schedule
        Calculate the local memory requirement
    until the average schedule length cannot be reduced

Figure 4: (a) The original MDFG of the Wave Digital Filter; (b) the retimed MDFG after rotating one node; (c) the current partition, the next partition and the other partitions. Solid edges represent prefetch operations; dashed edges represent keep operations; dotted edges are data dependencies inside the partition, for which no memory operation is needed.

Consider an example of MD rotation scheduling performed on the Wave Digital Filter shown in Table 1, with the corresponding MDFG in Figure 4. The notation n(i) in the table denotes computation node n in the original i-th iteration in the partition. According to the data dependencies in the original MDFG shown in Figure 4(a), we have an initial schedule of length three, shown in the left part of Table 1.
During the rotation, the computation node in the first control step (CS) is rotated, and the corresponding node in the MDFG is retimed by the base retiming vector $r = (1, 0)$. The schedule length is reduced to 2 after the rotation. In the PSP scheduling algorithm, the ALU part then applies the same schedule pattern to each iteration in the partition. Iterations are executed one after the other in the ALU part of the schedule.

Table 1: The ALU part of the schedule: the initial schedule and the schedule after rotation, each on two ALUs (ALU1, ALU2), listed by control step.

Scheduling of the memory part consists of several steps. First, given the retimed MDFG that results from the MD rotation, we need to decide the directions of the legal partition vectors. Second, the iterations in the partition are numbered so that they can be scheduled in that order. Third, and most important, we calculate the optimal partition size to ensure a balanced schedule. Fourth and last, we actually create the memory part of the schedule. We explain these steps in detail below.

Figure 5: (a) The CW and CCW regions relative to vector p; (b) the extreme CW and CCW vectors of the vectors $d_1, d_2, \ldots, d_k$ and the partition vectors $P_x$ and $P_y$.

Figure 6: (a) Iterations are executed from left to right in the $P_x$ direction and then proceed to the next hyperplane along the direction perpendicular to $P_x$; (b) iteration order in the partition.

Among all the delay vectors in an MDFG, two extreme vectors, clockwise (CW) and counterclockwise (CCW), are the most important for deciding the directions of the legal partition vectors. They are given by the following definition.

Definition 1. The extreme (outermost) clockwise vector CW of a vector set $D = \{d_1, d_2, \ldots, d_k\}$ satisfies two conditions: (1) $CW \in D$; (2) all the vectors in $D - \{CW\}$ are in the counterclockwise region of CW. The definition of the CCW vector is similar.

Figure 5(a) illustrates the clockwise and counterclockwise regions relative to a vector p. The magnitude of the cross product of two vectors $p_1$ and $p_2$, denoted by $p_1 \otimes p_2$, is used to determine the relative position of $p_1$ and $p_2$. If $p_1 = (p_1.x, p_1.y)$ and $p_2 = (p_2.x, p_2.y)$, then $p_1 \otimes p_2 = p_1.x \, p_2.y - p_2.x \, p_1.y$. If $p_1 \otimes p_2$ is positive, then $p_1$ is clockwise from $p_2$ with respect to the origin $(0, 0)$; if this cross product is negative, then $p_1$ is counterclockwise from $p_2$.

Legal partition vectors can only be outside of CW and CCW or aligned with them. For example, in Figure 5(b), we choose $P_y$ to be aligned with CCW, and $P_x$ to be aligned with the x-axis, which is outside of CW. This is a legal choice of partition vectors.
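The cross-product test and the choice of extreme vectors can be sketched as follows. This is an illustrative sketch, not the authors' code: the function names are hypothetical, the delay vectors in the example are made up, and the sketch assumes that all non-zero delay vectors lie within a half-plane so that "most clockwise" is well defined.

```python
from typing import List, Tuple

Vector = Tuple[int, int]


def cross(p1: Vector, p2: Vector) -> int:
    """p1.x * p2.y - p2.x * p1.y; positive means p1 is clockwise from p2."""
    return p1[0] * p2[1] - p2[0] * p1[1]


def _nonzero(vectors: List[Vector]) -> List[Vector]:
    """Zero delays stay inside an iteration and do not constrain the partition."""
    return [v for v in vectors if v != (0, 0)]


def extreme_cw(vectors: List[Vector]) -> Vector:
    """The vector of the set that is clockwise from (or aligned with) all others."""
    candidates = _nonzero(vectors)
    for c in candidates:
        if all(cross(c, v) >= 0 for v in candidates):
            return c
    raise ValueError("no extreme clockwise vector (vectors span more than a half-plane)")


def extreme_ccw(vectors: List[Vector]) -> Vector:
    """The vector of the set that is counterclockwise from (or aligned with) all others."""
    candidates = _nonzero(vectors)
    for c in candidates:
        if all(cross(c, v) <= 0 for v in candidates):
            return c
    raise ValueError("no extreme counterclockwise vector")


def is_legal_partition(px: Vector, py: Vector, delays: List[Vector]) -> bool:
    """px must be outside of or aligned with CW, py outside of or aligned with CCW."""
    return cross(px, extreme_cw(delays)) >= 0 and cross(py, extreme_ccw(delays)) <= 0


# Hypothetical delay set: CW is (1, 1) and CCW is (-1, 1).
delays = [(1, 1), (-1, 1), (0, 1), (0, 0)]
```

With this delay set, the pair px = (1, 0), py = (-1, 1) passes the test, while the rectangular choice px = (1, 0), py = (0, 1) fails because (0, 1) lies clockwise of the CCW vector.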
In the PSP algorithm, we assume the y elements of the delay vectors of the input MDFG are always non-negative, which is often the case in real applications with nested loops. Therefore, the vector $s = (0, 1)$ is always a legal scheduling vector. After choosing the base retiming vector $r$ as $(1, 0)$, the positive x-axis is always a legal direction for the partition vector. In our algorithm, the direction of the counterclockwise partition vector, $P_y$, is chosen to be aligned with the vector CCW, while the direction of the clockwise partition vector, $P_x$, is aligned with the positive x-axis. For convenience, we use $\overline{P_x}$ and $\overline{P_y}$ to denote the base partition vectors showing these two directions (the elements of a base partition vector have no common divisors). The actual partition vectors are then $P_x = f_x \overline{P_x}$ and $P_y = f_y \overline{P_y}$, where $f_x$ and $f_y$ are called partition factors and determine the size of the partition.

The next step is to number the iterations within the partition so that they can be scheduled in that order. The iterations are numbered from left to right in the $P_x$ direction, as illustrated in Figure 6(a), and then proceed to the next hyperplane along the direction of the vector perpendicular to $P_x$. Figure 6(b) shows an example of the iteration order. The black dots represent the iterations in the partition, and the numbers give the order. The numbering is easily done by sorting the iteration indices: the iteration with the smaller y element, or with the same y element but the smaller x element, gets the smaller number. We discuss how to obtain the optimal partition size in Section 4.

Table 2: The overall schedule with respect to the MDFG of the Wave Digital Filter in Figure 4, listed by control step: the ALU part (ALU1, ALU2) and the memory part (MEM1, MEM2).

After obtaining the partition directions and size, we can start to schedule the memory part.
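The iteration numbering described above (smaller y first, then smaller x) amounts to a two-key sort. A minimal sketch, assuming iterations are given as (x, y) index pairs; the function name and the sample points are hypothetical:

```python
from typing import Dict, Iterable, Tuple

Index = Tuple[int, int]  # (x, y) iteration index


def number_iterations(iterations: Iterable[Index]) -> Dict[Index, int]:
    """Number iterations row by row: sort by y, break ties by x, count from 1."""
    ordered = sorted(iterations, key=lambda p: (p[1], p[0]))
    return {p: k for k, p in enumerate(ordered, start=1)}


# Hypothetical partition with two rows of two iterations each.
order = number_iterations([(0, 1), (1, 0), (-1, 1), (0, 0)])
```

The resulting order is the sequence in which the one-iteration ALU schedule is replicated within the partition.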
Prefetch operations are scheduled as early as possible, because they do not have any data dependencies. Keep operations have data dependencies on the ALU part. Therefore, a keep operation must be scheduled after the corresponding computation, the one that produces that data instance, has finished. For each keep, we define its earliest starting time as the control step at which the corresponding value has been computed. Then, starting from that control step, we schedule the keep operation at the earliest available place in the memory part.

Figure 7: (a) $f_y$ restriction; (b) $f_x$ restriction; (c) the number of iterations (#iter) inside one partition, $\#iter = f_x f_y (\overline{P_x} \otimes \overline{P_y}) = f_x f_y \cdot$ base_area.

Table 2 is the overall schedule of the Wave Digital Filter shown in Figure 4. Here we assume two ALUs and two memory units. The ALU part of the schedule duplicates the one-iteration schedule shown in Table 1 for the four iterations. In the memory part, the notation Pn(i)(x, y) denotes a prefetch of the data instance that corresponds to the delay vector (x, y) from node n in the i-th iteration. For example, an entry of the form P2(3)(x, y) means: prefetch the data corresponding to the delay vector (x, y) from node 2(3). We assume in Table 2 that each prefetch operation takes two time units, that is, $T_{pre} = 2$. The down arrows in the table represent the continuation of a prefetch operation. Similarly, Kn(i)(x, y) denotes a keep operation. We assume each keep operation takes one time unit, i.e., $T_{keep} = 1$. In this example, the length of the overall schedule is $L = 8$. Since there are 4 iterations in the partition, the average length of the overall schedule, denoted by $L_{ave}$, is $L/4 = 2$, which is equal to the lower bound.

4 Partition size and memory size

In the previous section, we decided the partition directions, given by the base partition vectors $\overline{P_x}$ and $\overline{P_y}$. Here we determine the two partition factors $f_x$ and $f_y$ so that a balanced schedule can be achieved. First, we impose the restriction on $f_y$ that it should be large enough that no delay can pass through the entire partition along the direction of $P_y$. For example, the partition vector $P_y$ in Figure 7(a) is not large enough, because the delay vector crosses both the bottom and the top boundaries of the partition.
Denoting the set of all the non-zero delay vectors in the MDFG as $D$, and writing $\overline{P_x}$, $\overline{P_y}$ for the base partition vectors, the above restriction can be represented by the inequality

$$f_y \, \overline{P_y}.y \ge d.y, \quad \forall d = (d.x, d.y) \in D.$$

The partition vector $P_x$ is restricted so that no delay starting from the current partition can reach two partitions later. In other words, in Figure 7(b), delay edges starting from Partition I cannot reach Partition III. Therefore, with $N$, $M$, $\alpha$ and $\beta$ as marked in Figure 7(b), we have

$$|NM| = |d| \, \frac{\sin(\alpha + \beta)}{\sin\alpha} \le f_x |\overline{P_x}|.$$

This gives us

$$f_x \ge \frac{|d| \sin(\alpha + \beta)}{|\overline{P_x}| \sin\alpha}, \quad \forall d = (d.x, d.y) \in D.$$

Since $f_x$ is an integer, this inequality is equivalent to

$$f_x \ge \left\lceil \frac{|d| \sin(\alpha + \beta)}{|\overline{P_x}| \sin\alpha} \right\rceil, \quad \forall d = (d.x, d.y) \in D.$$

Below we derive the conditions for a balanced schedule. Lemma 1 shows how to calculate the length of the ALU part of the schedule, referring to Figure 7(c).

Lemma 1. The length of the ALU part of the schedule is $\#iter \cdot L_{ALU} = f_x f_y (\overline{P_x} \otimes \overline{P_y}) L_{ALU}$, where $L_{ALU}$ denotes the length of the one-iteration ALU part of the schedule and #iter is the number of iterations in the partition.

Figure 8: (a)(b) Calculating the number of delay edges crossing the boundary of the current partition and entering the next partition, $area(UVW) = l \cdot h = l \cdot (P_y.y - d.y)$; (c)(d) calculating the number of delay edges crossing the boundary of the current partition and entering other partitions, $area(PQRS) = |P_x| \cdot d.y = f_x |\overline{P_x}| \cdot d.y$.

Then we estimate how many memory operations are needed by calculating the areas of the two shaded regions in Figure 8. Given a delay vector $d = (d.x, d.y)$, region UVW in the current partition, shown in Figure 8(b), is the region from which $d$ enters the next partition. Similarly, region PQRS is the region from which $d$ enters other partitions. We denote the areas of these two regions by $A_{goto\_next}(d)$ and $A_{goto\_others}(d)$ respectively, with respect to a given delay vector $d = (d.x, d.y)$.

Lemma 2. Given a delay vector $d = (d.x, d.y)$,

$$A_{goto\_next}(d) = |d| \, \frac{\sin(\alpha + \beta)}{\sin\alpha} \, (f_y \overline{P_y}.y - d.y) \quad \text{and} \quad A_{goto\_others}(d) = f_x \, d.y \, |\overline{P_x}|.$$

Note that the number of delay edges entering the next partition, i.e. the number of keep operations, is very close to the area of UVW. Summing these areas over every distinct $d$, we get the total number of keep operations,

$$\#keep = \sum_{d} A_{goto\_next}(d) = \sum_{d} |d| \, \frac{\sin(\alpha + \beta)}{\sin\alpha} \, (f_y \overline{P_y}.y - d.y), \quad \text{for all } d = (d.x, d.y).$$

Similarly, the total number of prefetch operations is

$$\#prefetch = \sum_{d} A_{goto\_others}(d) = \sum_{d} area(PQRS) = \sum_{d} |P_x| \, d.y = \sum_{d} f_x \, d.y \, |\overline{P_x}|.$$

Theorem 3 gives the conditions for what we call a balanced schedule. The idea is to schedule the prefetch operations from the top of the memory part of the schedule, and to schedule the keep operations from the bottom. The left-hand side of Inequality 1 is the estimated length of the memory part of the schedule, and we allow it to be at most $T_{keep}$ control steps longer than the ALU part, as shown on the right-hand side. The extra $T_{keep}$ steps make room for the keep operations that correspond to the computation nodes in the last control step of the ALU part. Corollary 4 concerns the average overall schedule length.

Theorem 3. Assume that the required relations between $N_{ALU}$ and $N_{mem}$ and between $T_{ALU}$ and $T_{keep}$ hold, and that Inequality 1 is satisfied:

$$\frac{\#pre \cdot T_{pre}}{N_{mem}} + \frac{\#keep \cdot T_{keep}}{N_{mem}} \le L_{ALU} \cdot \#iter + T_{keep} \qquad (1)$$

Then the length of the memory part of the schedule is at most $T_{keep}$ control steps longer than that of the ALU part.

Corollary 4. If the partition satisfies the conditions presented in Theorem 3, the average length of the overall schedule is at most $T_{keep} / \#iter$ plus the average length of the ALU part of the schedule.

Experiments show that rotation scheduling can in most cases generate an ALU part of the schedule that achieves the lower bound, i.e., $L_{ALU}$ equals the ALU lower bound. Therefore, the overall schedule either reaches its lower bound or is very close to it; the difference is at most $T_{keep} / (P_x \otimes P_y)$.

Figure 9: (a) One memory location is needed for delay $d = (1, 0)$; (b) $|P_x| \cdot d.y$ memory locations are needed for delay $d = (d.x, d.y)$ when $d.y > 0$.

Now we estimate the local memory size needed for executing the partition. We classify the memory usage into two categories: basic memory for the working set, and reserved memory for prefetch and keep operations. The former corresponds to all the internal delay edges in the partition. The delay edge $(1, 0)$ in Figure 9(a) indicates a data instance produced in one iteration and consumed in the next iteration. Only one memory location is needed to hold this data, because we can reuse the same location for later iterations. In general, we need $d.x$ memory locations for each $d = (d.x, 0)$. However, when $d = (0, 1)$, as shown in Figure 9(b), a whole row of intermediate values needs to be kept.
Thus a total of $|P_x|$ memory locations are needed. In general, for each $d = (d.x, d.y)$ where $d.y > 0$, $|P_x| \cdot d.y$ memory locations are needed. Summarizing the above, the size of the basic memory for the working set is

$$Size_{ws} = \sum_{d = (d.x, d.y)} \begin{cases} d.x & \text{when } d.y = 0 \\ |P_x| \cdot d.y & \text{when } d.y > 0 \end{cases}$$

Now let us consider the second category: reserved memory for prefetch and keep operations. These operations represent the data instances pre-loaded or pre-occupied in the local memory before we execute the partition. Each of them needs a reserved memory location. The total number of these pre-occupied data is two times the total number of memory operations (one for the data pre-loaded for the current partition, the other for the newly generated data for the next partition). Therefore, the size of this part of the memory is

$$Size_{reserved} = 2 \, (\#pre + \#keep)$$

Finally, the local memory needed to execute the partition is

$$Local_{size} = Size_{ws} + Size_{reserved}$$

5 Experimental Results

In this section, the effectiveness of the PSP algorithm is evaluated by running a set of simulations on DSP benchmarks. Table 3 and Table 4 show our scheduling results. The first column gives the benchmark names. The second to fourth columns are the parameters of the input MDFG: the second column shows the number of nodes, and the third and fourth columns show the ALU and memory unit resource constraints. The partition generated by the algorithm is shown in the fifth to seventh columns. The final schedule is shown in the next three columns. Column $L$ gives the length of the overall schedule and Column $L_{ave}$ is the average ($L / \#iter$). In order to compare our results with the lower bound, as well as with the results from other algorithms, we calculate the lower bounds of the schedule length, $N / N_{alu}$, and put them in Column LB. We also ran the same set of benchmarks using list scheduling and Prefetch Balanced rotation Scheduling (PBS).
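The per-iteration figures reported in the $L_{ave}$ and LB columns can be recomputed with a couple of lines. This is a minimal sketch with hypothetical function names; the lower bound is taken here as the plain quotient of node count and ALU count (unit-time ALU operations assumed), and the node count of 4 in the example is an assumption made for illustration.

```python
def average_length(overall_length: int, iterations: int) -> float:
    """L_ave: overall schedule length of one partition divided by its iterations."""
    return overall_length / iterations


def alu_lower_bound(num_nodes: int, num_alus: int) -> float:
    """LB: with unit-time nodes, one iteration needs at least N / N_alu control steps."""
    return num_nodes / num_alus


# Wave Digital Filter example of Section 3: L = 8 over 4 iterations, two ALUs.
l_ave = average_length(8, 4)
# Assumed node count of 4, for illustration only.
lb = alu_lower_bound(4, 2)
```

When `l_ave` equals `lb`, as in this example, the partition schedule has reached the ALU-imposed lower bound and the memory latency is fully hidden.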
The results are shown in Columns List and PBS, respectively, where the sub-column len is the schedule length and the sub-column ratio compares the PSP schedule length with that of list scheduling and PBS scheduling, i.e. ratio $= L_{ave} / len$. The abbreviations WDF, IIR, DPCM, 2D and Floyd stand for Wave Digital Filter, Infinite Impulse Response filter, Differential Pulse-Code Modulation device, Two-Dimensional filter, and the Floyd-Steinberg algorithm, respectively. In Table 3, we assume that each ALU operation takes 1 time unit, each keep operation also takes 1 time unit, and each prefetch takes 2 time units, while Table 4 assumes a larger prefetch time. In the PBS experiments in Table 4, the graphs are first unfolded by a factor of 2 × 2 before performing PBS scheduling.

Table 3: Experimental results on DSP filter benchmarks assuming $T_{prefetch} = 2$. Columns: Benchmark; Parameters ($N$, $N_{alu}$, $N_{mem}$); Partition ($P_x$, $P_y$, #iter); PSP Schedule ($L$, $L_{ave}$, LB); List (len, ratio); PBS (len, ratio). Rows: WDF(1), WDF(2), IIR, DPCM, 2D(1), 2D(2), two MDFG benchmarks, and Floyd.

Table 4: Experimental results on DSP filter benchmarks assuming a larger $T_{prefetch}$. Columns and rows as in Table 3, with PBS applied after unfolding by 2 × 2.

As we can see, list scheduling rarely achieves the optimal schedule length; the schedules are often dominated by a long memory part. In other words, the list schedules are not well balanced. Although PBS is better than list scheduling, it too becomes less effective at generating a balanced schedule, especially when $T_{prefetch}$ is large. Moreover, PBS needs to unfold explicitly by large factors in order to generate good schedules. This may cause a lot of computation (for example, after unfolding by a factor of 2 × 2, the total number of nodes is 4 times that of the original). The PSP algorithm consistently produces optimal or near-optimal schedules, as shown by the bold figures in the tables. Even in the case of long memory latency, when $T_{prefetch}$ is large, the algorithm still gives good overall schedules without any unfolding. Almost all of the resulting schedules are very close to the optimal. In Table 3, the average ratios of the schedule length from the PSP algorithm to that from list scheduling and PBS are 63.2% and 84.9%, respectively; in Table 4, they are 26.7% and 6 9%, respectively.
Moreover, since we do not unfold the graph, the computation time of this algorithm is very small. Almost all the experiments finished within two to three seconds. Comparing Tables 3 and 4, we also see that when the memory latency is increased, the PSP algorithm tends to create a larger partition in order to compensate for the longer latency. The results show that the larger the partition, the closer the average schedule length is to the lower bound, because the overhead of $T_{keep}$ control steps is amortized over more iterations.

References

[1] F. Chen, S. Tongsima, and E. H.-M. Sha. Loop scheduling optimization with data prefetching based on multi-dimensional retiming. In Proc. ISCA International Conference on Parallel and Distributed Computing Systems, pages 29 34, 1998.

[2] F. Dahlgren and M. Dubois. Sequential hardware prefetching in shared-memory multiprocessors. IEEE Transactions on Parallel and Distributed Systems, Vol. 6, No. 7, Jul.

[3] N. L. Passos and Edwin H.-M. Sha. Scheduling of uniform multi-dimensional systems under resource constraints. To appear in the IEEE Transactions on VLSI Systems.

[4] N. L. Passos and Edwin H.-M. Sha. Achieving full parallelism using multi-dimensional retiming. IEEE Transactions on Parallel and Distributed Systems, Vol. 7, pages 5 63, Nov.

[5] J. Skeppstedt and M. Dubois. Hybrid compiler/hardware prefetching for multiprocessors using low-overhead cache miss traps. In Proceedings of the International Conference on Parallel Processing, 1997.

[6] M. K. Tcheun, H. Yoon, and S. R. Maeng. An adaptive sequential prefetching scheme in shared-memory multiprocessors. In Proceedings of the International Conference on Parallel Processing, pages 36 33.


More information

Optimal Partitioning and Balanced Scheduling with the Maximal Overlap of Data Footprints Λ

Optimal Partitioning and Balanced Scheduling with the Maximal Overlap of Data Footprints Λ Optimal Partitioning and Balanced Scheduling with the Maximal Overlap of Data Footprints Λ Zhong Wang Dept. of Comp. Sci. & Engr University of Notre Dame Notre Dame, IN 46556 zwang1@cse.nd.edu Edwin H.-M.

More information

Generalized Edge Coloring for Channel Assignment in Wireless Networks

Generalized Edge Coloring for Channel Assignment in Wireless Networks Generalize Ege Coloring for Channel Assignment in Wireless Networks Chun-Chen Hsu Institute of Information Science Acaemia Sinica Taipei, Taiwan Da-wei Wang Jan-Jan Wu Institute of Information Science

More information

Skyline Community Search in Multi-valued Networks

Skyline Community Search in Multi-valued Networks Syline Community Search in Multi-value Networs Rong-Hua Li Beijing Institute of Technology Beijing, China lironghuascut@gmail.com Jeffrey Xu Yu Chinese University of Hong Kong Hong Kong, China yu@se.cuh.eu.h

More information

Table-based division by small integer constants

Table-based division by small integer constants Table-base ivision by small integer constants Florent e Dinechin, Laurent-Stéphane Diier LIP, Université e Lyon (ENS-Lyon/CNRS/INRIA/UCBL) 46, allée Italie, 69364 Lyon Ceex 07 Florent.e.Dinechin@ens-lyon.fr

More information

Intensive Hypercube Communication: Prearranged Communication in Link-Bound Machines 1 2

Intensive Hypercube Communication: Prearranged Communication in Link-Bound Machines 1 2 This paper appears in J. of Parallel an Distribute Computing 10 (1990), pp. 167 181. Intensive Hypercube Communication: Prearrange Communication in Link-Boun Machines 1 2 Quentin F. Stout an Bruce Wagar

More information

Almost Disjunct Codes in Large Scale Multihop Wireless Network Media Access Control

Almost Disjunct Codes in Large Scale Multihop Wireless Network Media Access Control Almost Disjunct Coes in Large Scale Multihop Wireless Network Meia Access Control D. Charles Engelhart Anan Sivasubramaniam Penn. State University University Park PA 682 engelhar,anan @cse.psu.eu Abstract

More information

Non-homogeneous Generalization in Privacy Preserving Data Publishing

Non-homogeneous Generalization in Privacy Preserving Data Publishing Non-homogeneous Generalization in Privacy Preserving Data Publishing W. K. Wong, Nios Mamoulis an Davi W. Cheung Department of Computer Science, The University of Hong Kong Pofulam Roa, Hong Kong {wwong2,nios,cheung}@cs.hu.h

More information

Queueing Model and Optimization of Packet Dropping in Real-Time Wireless Sensor Networks

Queueing Model and Optimization of Packet Dropping in Real-Time Wireless Sensor Networks Queueing Moel an Optimization of Packet Dropping in Real-Time Wireless Sensor Networks Marc Aoun, Antonios Argyriou, Philips Research, Einhoven, 66AE, The Netherlans Department of Computer an Communication

More information

Generalized Edge Coloring for Channel Assignment in Wireless Networks

Generalized Edge Coloring for Channel Assignment in Wireless Networks TR-IIS-05-021 Generalize Ege Coloring for Channel Assignment in Wireless Networks Chun-Chen Hsu, Pangfeng Liu, Da-Wei Wang, Jan-Jan Wu December 2005 Technical Report No. TR-IIS-05-021 http://www.iis.sinica.eu.tw/lib/techreport/tr2005/tr05.html

More information

Random Clustering for Multiple Sampling Units to Speed Up Run-time Sample Generation

Random Clustering for Multiple Sampling Units to Speed Up Run-time Sample Generation DEIM Forum 2018 I4-4 Abstract Ranom Clustering for Multiple Sampling Units to Spee Up Run-time Sample Generation uzuru OKAJIMA an Koichi MARUAMA NEC Solution Innovators, Lt. 1-18-7 Shinkiba, Koto-ku, Tokyo,

More information

Optimal Oblivious Path Selection on the Mesh

Optimal Oblivious Path Selection on the Mesh Optimal Oblivious Path Selection on the Mesh Costas Busch Malik Magon-Ismail Jing Xi Department of Computer Science Rensselaer Polytechnic Institute Troy, NY 280, USA {buschc,magon,xij2}@cs.rpi.eu Abstract

More information

Transient analysis of wave propagation in 3D soil by using the scaled boundary finite element method

Transient analysis of wave propagation in 3D soil by using the scaled boundary finite element method Southern Cross University epublications@scu 23r Australasian Conference on the Mechanics of Structures an Materials 214 Transient analysis of wave propagation in 3D soil by using the scale bounary finite

More information

SURVIVABLE IP OVER WDM: GUARANTEEEING MINIMUM NETWORK BANDWIDTH

SURVIVABLE IP OVER WDM: GUARANTEEEING MINIMUM NETWORK BANDWIDTH SURVIVABLE IP OVER WDM: GUARANTEEEING MINIMUM NETWORK BANDWIDTH Galen H Sasaki Dept Elec Engg, U Hawaii 2540 Dole Street Honolul HI 96822 USA Ching-Fong Su Fuitsu Laboratories of America 595 Lawrence Expressway

More information

Algorithm for Intermodal Optimal Multidestination Tour with Dynamic Travel Times

Algorithm for Intermodal Optimal Multidestination Tour with Dynamic Travel Times Algorithm for Intermoal Optimal Multiestination Tour with Dynamic Travel Times Neema Nassir, Alireza Khani, Mark Hickman, an Hyunsoo Noh This paper presents an efficient algorithm that fins the intermoal

More information

Performance Modelling of Necklace Hypercubes

Performance Modelling of Necklace Hypercubes erformance Moelling of ecklace ypercubes. Meraji,,. arbazi-aza,, A. atooghy, IM chool of Computer cience & harif University of Technology, Tehran, Iran {meraji, patooghy}@ce.sharif.eu, aza@ipm.ir a Abstract

More information

THE APPLICATION OF ARTICLE k-th SHORTEST TIME PATH ALGORITHM

THE APPLICATION OF ARTICLE k-th SHORTEST TIME PATH ALGORITHM International Journal of Physics an Mathematical Sciences ISSN: 2277-2111 (Online) 2016 Vol. 6 (1) January-March, pp. 24-6/Mao an Shi. THE APPLICATION OF ARTICLE k-th SHORTEST TIME PATH ALGORITHM Hua Mao

More information

Solution Representation for Job Shop Scheduling Problems in Ant Colony Optimisation

Solution Representation for Job Shop Scheduling Problems in Ant Colony Optimisation Solution Representation for Job Shop Scheuling Problems in Ant Colony Optimisation James Montgomery, Carole Faya 2, an Sana Petrovic 2 Faculty of Information & Communication Technologies, Swinburne University

More information

Characterizing Decoding Robustness under Parametric Channel Uncertainty

Characterizing Decoding Robustness under Parametric Channel Uncertainty Characterizing Decoing Robustness uner Parametric Channel Uncertainty Jay D. Wierer, Wahee U. Bajwa, Nigel Boston, an Robert D. Nowak Abstract This paper characterizes the robustness of ecoing uner parametric

More information

PART 2. Organization Of An Operating System

PART 2. Organization Of An Operating System PART 2 Organization Of An Operating System CS 503 - PART 2 1 2010 Services An OS Supplies Support for concurrent execution Facilities for process synchronization Inter-process communication mechanisms

More information

Cluster Center Initialization Method for K-means Algorithm Over Data Sets with Two Clusters

Cluster Center Initialization Method for K-means Algorithm Over Data Sets with Two Clusters Available online at www.scienceirect.com Proceia Engineering 4 (011 ) 34 38 011 International Conference on Avances in Engineering Cluster Center Initialization Metho for K-means Algorithm Over Data Sets

More information

A Classification of 3R Orthogonal Manipulators by the Topology of their Workspace

A Classification of 3R Orthogonal Manipulators by the Topology of their Workspace A Classification of R Orthogonal Manipulators by the Topology of their Workspace Maher aili, Philippe Wenger an Damien Chablat Institut e Recherche en Communications et Cybernétique e Nantes, UMR C.N.R.S.

More information

On Effectively Determining the Downlink-to-uplink Sub-frame Width Ratio for Mobile WiMAX Networks Using Spline Extrapolation

On Effectively Determining the Downlink-to-uplink Sub-frame Width Ratio for Mobile WiMAX Networks Using Spline Extrapolation On Effectively Determining the Downlink-to-uplink Sub-frame With Ratio for Mobile WiMAX Networks Using Spline Extrapolation Panagiotis Sarigianniis, Member, IEEE, Member Malamati Louta, Member, IEEE, Member

More information

CS 106 Winter 2016 Craig S. Kaplan. Module 01 Processing Recap. Topics

CS 106 Winter 2016 Craig S. Kaplan. Module 01 Processing Recap. Topics CS 106 Winter 2016 Craig S. Kaplan Moule 01 Processing Recap Topics The basic parts of speech in a Processing program Scope Review of syntax for classes an objects Reaings Your CS 105 notes Learning Processing,

More information

Coordinating Distributed Algorithms for Feature Extraction Offloading in Multi-Camera Visual Sensor Networks

Coordinating Distributed Algorithms for Feature Extraction Offloading in Multi-Camera Visual Sensor Networks Coorinating Distribute Algorithms for Feature Extraction Offloaing in Multi-Camera Visual Sensor Networks Emil Eriksson, György Dán, Viktoria Foor School of Electrical Engineering, KTH Royal Institute

More information

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 31, NO. 4, APRIL

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 31, NO. 4, APRIL IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 1, NO. 4, APRIL 01 74 Towar Efficient Distribute Algorithms for In-Network Binary Operator Tree Placement in Wireless Sensor Networks Zongqing Lu,

More information

MORA: a Movement-Based Routing Algorithm for Vehicle Ad Hoc Networks

MORA: a Movement-Based Routing Algorithm for Vehicle Ad Hoc Networks : a Movement-Base Routing Algorithm for Vehicle A Hoc Networks Fabrizio Granelli, Senior Member, Giulia Boato, Member, an Dzmitry Kliazovich, Stuent Member Abstract Recent interest in car-to-car communications

More information

Classifying Facial Expression with Radial Basis Function Networks, using Gradient Descent and K-means

Classifying Facial Expression with Radial Basis Function Networks, using Gradient Descent and K-means Classifying Facial Expression with Raial Basis Function Networks, using Graient Descent an K-means Neil Allrin Department of Computer Science University of California, San Diego La Jolla, CA 9237 nallrin@cs.ucs.eu

More information

PERFECT ONE-ERROR-CORRECTING CODES ON ITERATED COMPLETE GRAPHS: ENCODING AND DECODING FOR THE SF LABELING

PERFECT ONE-ERROR-CORRECTING CODES ON ITERATED COMPLETE GRAPHS: ENCODING AND DECODING FOR THE SF LABELING PERFECT ONE-ERROR-CORRECTING CODES ON ITERATED COMPLETE GRAPHS: ENCODING AND DECODING FOR THE SF LABELING PAMELA RUSSELL ADVISOR: PAUL CULL OREGON STATE UNIVERSITY ABSTRACT. Birchall an Teor prove that

More information

Computer Graphics Chapter 7 Three-Dimensional Viewing Viewing

Computer Graphics Chapter 7 Three-Dimensional Viewing Viewing Computer Graphics Chapter 7 Three-Dimensional Viewing Outline Overview of Three-Dimensional Viewing Concepts The Three-Dimensional Viewing Pipeline Three-Dimensional Viewing-Coorinate Parameters Transformation

More information

Classical Mechanics Examples (Lagrange Multipliers)

Classical Mechanics Examples (Lagrange Multipliers) Classical Mechanics Examples (Lagrange Multipliers) Dipan Kumar Ghosh Physics Department, Inian Institute of Technology Bombay Powai, Mumbai 400076 September 3, 015 1 Introuction We have seen that the

More information

EFFICIENT ON-LINE TESTING METHOD FOR A FLOATING-POINT ADDER

EFFICIENT ON-LINE TESTING METHOD FOR A FLOATING-POINT ADDER FFICINT ON-LIN TSTING MTHOD FOR A FLOATING-POINT ADDR A. Droz, M. Lobachev Department of Computer Systems, Oessa State Polytechnic University, Oessa, Ukraine Droz@ukr.net, Lobachev@ukr.net Abstract In

More information

arxiv: v1 [math.co] 15 Dec 2017

arxiv: v1 [math.co] 15 Dec 2017 Rectilinear Crossings in Complete Balance -Partite -Uniform Hypergraphs Rahul Gangopahyay Saswata Shannigrahi arxiv:171.05539v1 [math.co] 15 Dec 017 December 18, 017 Abstract In this paper, we stuy the

More information

Backpressure-based Packet-by-Packet Adaptive Routing in Communication Networks

Backpressure-based Packet-by-Packet Adaptive Routing in Communication Networks 1 Backpressure-base Packet-by-Packet Aaptive Routing in Communication Networks Eleftheria Athanasopoulou, Loc Bui, Tianxiong Ji, R. Srikant, an Alexaner Stolyar Abstract Backpressure-base aaptive routing

More information

Polygon Simplification by Minimizing Convex Corners

Polygon Simplification by Minimizing Convex Corners Polygon Simplification by Minimizing Convex Corners Yeganeh Bahoo 1, Stephane Durocher 1, J. Mark Keil 2, Saee Mehrabi 3, Sahar Mehrpour 1, an Debajyoti Monal 1 1 Department of Computer Science, University

More information

Linear First-Order PDEs

Linear First-Order PDEs MODULE 2: FIRST-ORDER PARTIAL DIFFERENTIAL EQUATIONS 9 Lecture 2 Linear First-Orer PDEs The most general first-orer linear PDE has the form a(x, y)z x + b(x, y)z y + c(x, y)z = (x, y), (1) where a, b,

More information

6 Gradient Descent. 6.1 Functions

6 Gradient Descent. 6.1 Functions 6 Graient Descent In this topic we will iscuss optimizing over general functions f. Typically the function is efine f : R! R; that is its omain is multi-imensional (in this case -imensional) an output

More information

A Cost Model for Query Processing in High-Dimensional Data Spaces

A Cost Model for Query Processing in High-Dimensional Data Spaces A Cost Moel for Query Processing in High-Dimensional Data Spaces Christian Böhm Luwig Maximilians Universität München This is a preliminary release of an article accepte by ACM Transactions on Database

More information

On the Role of Multiply Sectioned Bayesian Networks to Cooperative Multiagent Systems

On the Role of Multiply Sectioned Bayesian Networks to Cooperative Multiagent Systems On the Role of Multiply Sectione Bayesian Networks to Cooperative Multiagent Systems Y. Xiang University of Guelph, Canaa, yxiang@cis.uoguelph.ca V. Lesser University of Massachusetts at Amherst, USA,

More information

The Reconstruction of Graphs. Dhananjay P. Mehendale Sir Parashurambhau College, Tilak Road, Pune , India. Abstract

The Reconstruction of Graphs. Dhananjay P. Mehendale Sir Parashurambhau College, Tilak Road, Pune , India. Abstract The Reconstruction of Graphs Dhananay P. Mehenale Sir Parashurambhau College, Tila Roa, Pune-4030, Inia. Abstract In this paper we iscuss reconstruction problems for graphs. We evelop some new ieas lie

More information

Indexing the Edges A simple and yet efficient approach to high-dimensional indexing

Indexing the Edges A simple and yet efficient approach to high-dimensional indexing Inexing the Eges A simple an yet efficient approach to high-imensional inexing Beng Chin Ooi Kian-Lee Tan Cui Yu Stephane Bressan Department of Computer Science National University of Singapore 3 Science

More information

Backpressure-based Packet-by-Packet Adaptive Routing in Communication Networks

Backpressure-based Packet-by-Packet Adaptive Routing in Communication Networks 1 Backpressure-base Packet-by-Packet Aaptive Routing in Communication Networks Eleftheria Athanasopoulou, Loc Bui, Tianxiong Ji, R. Srikant, an Alexaner Stoylar arxiv:15.4984v1 [cs.ni] 27 May 21 Abstract

More information

Throughput Characterization of Node-based Scheduling in Multihop Wireless Networks: A Novel Application of the Gallai-Edmonds Structure Theorem

Throughput Characterization of Node-based Scheduling in Multihop Wireless Networks: A Novel Application of the Gallai-Edmonds Structure Theorem Throughput Characterization of Noe-base Scheuling in Multihop Wireless Networks: A Novel Application of the Gallai-Emons Structure Theorem Bo Ji an Yu Sang Dept. of Computer an Information Sciences Temple

More information

Preamble. Singly linked lists. Collaboration policy and academic integrity. Getting help

Preamble. Singly linked lists. Collaboration policy and academic integrity. Getting help CS2110 Spring 2016 Assignment A. Linke Lists Due on the CMS by: See the CMS 1 Preamble Linke Lists This assignment begins our iscussions of structures. In this assignment, you will implement a structure

More information

A Neural Network Model Based on Graph Matching and Annealing :Application to Hand-Written Digits Recognition

A Neural Network Model Based on Graph Matching and Annealing :Application to Hand-Written Digits Recognition ITERATIOAL JOURAL OF MATHEMATICS AD COMPUTERS I SIMULATIO A eural etwork Moel Base on Graph Matching an Annealing :Application to Han-Written Digits Recognition Kyunghee Lee Abstract We present a neural

More information

Implementation and Evaluation of NAS Parallel CG Benchmark on GPU Cluster with Proprietary Interconnect TCA

Implementation and Evaluation of NAS Parallel CG Benchmark on GPU Cluster with Proprietary Interconnect TCA Implementation an Evaluation of AS Parallel CG Benchmark on GPU Cluster with Proprietary Interconnect TCA Kazuya Matsumoto 1, orihisa Fujita 2, Toshihiro Hanawa 3, an Taisuke Boku 1,2 1 Center for Computational

More information

6.823 Computer System Architecture. Problem Set #3 Spring 2002

6.823 Computer System Architecture. Problem Set #3 Spring 2002 6.823 Computer System Architecture Problem Set #3 Spring 2002 Stuents are strongly encourage to collaborate in groups of up to three people. A group shoul han in only one copy of the solution to the problem

More information

UNIT 9 INTERFEROMETRY

UNIT 9 INTERFEROMETRY UNIT 9 INTERFEROMETRY Structure 9.1 Introuction Objectives 9. Interference of Light 9.3 Light Sources for 9.4 Applie to Flatness Testing 9.5 in Testing of Surface Contour an Measurement of Height 9.6 Interferometers

More information

Kinematic Analysis of a Family of 3R Manipulators

Kinematic Analysis of a Family of 3R Manipulators Kinematic Analysis of a Family of R Manipulators Maher Baili, Philippe Wenger an Damien Chablat Institut e Recherche en Communications et Cybernétique e Nantes, UMR C.N.R.S. 6597 1, rue e la Noë, BP 92101,

More information

Non-Uniform Sensor Deployment in Mobile Wireless Sensor Networks

Non-Uniform Sensor Deployment in Mobile Wireless Sensor Networks 01 01 01 01 01 00 01 01 Non-Uniform Sensor Deployment in Mobile Wireless Sensor Networks Mihaela Carei, Yinying Yang, an Jie Wu Department of Computer Science an Engineering Floria Atlantic University

More information

Study of Network Optimization Method Based on ACL

Study of Network Optimization Method Based on ACL Available online at www.scienceirect.com Proceia Engineering 5 (20) 3959 3963 Avance in Control Engineering an Information Science Stuy of Network Optimization Metho Base on ACL Liu Zhian * Department

More information

Just-In-Time Software Pipelining

Just-In-Time Software Pipelining Just-In-Time Software Pipelining Hongbo Rong Hyunchul Park Youfeng Wu Cheng Wang Programming Systems Lab Intel Labs, Santa Clara What is software pipelining? A loop optimization exposing instruction-level

More information

Section 19. Thin Prisms

Section 19. Thin Prisms Section 9 Thin Prisms 9- OPTI-50 Optical Design an Instrumentation I opyright 08 John E. Greivenkamp Thin Prism Deviation Thin prisms introuce small angular beam eviations an are useful as alignment evices.

More information

Modifying ROC Curves to Incorporate Predicted Probabilities

Modifying ROC Curves to Incorporate Predicted Probabilities Moifying ROC Curves to Incorporate Preicte Probabilities Cèsar Ferri DSIC, Universitat Politècnica e València Peter Flach Department of Computer Science, University of Bristol José Hernánez-Orallo DSIC,

More information

Section 20. Thin Prisms

Section 20. Thin Prisms OPTI-0/0 Geometrical an Instrumental Optics opyright 08 John E. Greivenkamp 0- Section 0 Thin Prisms Thin Prism Deviation Thin prisms introuce small angular beam eviations an are useful as alignment evices.

More information

Bends, Jogs, And Wiggles for Railroad Tracks and Vehicle Guide Ways

Bends, Jogs, And Wiggles for Railroad Tracks and Vehicle Guide Ways Ben, Jogs, An Wiggles for Railroa Tracks an Vehicle Guie Ways Louis T. Klauer Jr., PhD, PE. Work Soft 833 Galer Dr. Newtown Square, PA 19073 lklauer@wsof.com Preprint, June 4, 00 Copyright 00 by Louis

More information

NAND flash memory is widely used as a storage

NAND flash memory is widely used as a storage 1 : Buffer-Aware Garbage Collection for Flash-Base Storage Systems Sungjin Lee, Dongkun Shin Member, IEEE, an Jihong Kim Member, IEEE Abstract NAND flash-base storage evice is becoming a viable storage

More information

Analysis of half-space range search using the k-d search skip list. Here we analyse the expected time for half-space

Analysis of half-space range search using the k-d search skip list. Here we analyse the expected time for half-space Analysis of half-space range search using the k- search skip list Mario A. Lopez Brafor G. Nickerson y 1 Abstract We analyse the average cost of half-space range reporting for the k- search skip list.

More information

Computer Organization

Computer Organization Computer Organization Douglas Comer Computer Science Department Purue University 250 N. University Street West Lafayette, IN 47907-2066 http://www.cs.purue.eu/people/comer Copyright 2006. All rights reserve.

More information

Learning Polynomial Functions. by Feature Construction

Learning Polynomial Functions. by Feature Construction I Proceeings of the Eighth International Workshop on Machine Learning Chicago, Illinois, June 27-29 1991 Learning Polynomial Functions by Feature Construction Richar S. Sutton GTE Laboratories Incorporate

More information

Improving Spatial Reuse of IEEE Based Ad Hoc Networks

Improving Spatial Reuse of IEEE Based Ad Hoc Networks mproving Spatial Reuse of EEE 82.11 Base A Hoc Networks Fengji Ye, Su Yi an Biplab Sikar ECSE Department, Rensselaer Polytechnic nstitute Troy, NY 1218 Abstract n this paper, we evaluate an suggest methos

More information

Comparison of Methods for Increasing the Performance of a DUA Computation

Comparison of Methods for Increasing the Performance of a DUA Computation Comparison of Methos for Increasing the Performance of a DUA Computation Michael Behrisch, Daniel Krajzewicz, Peter Wagner an Yun-Pang Wang Institute of Transportation Systems, German Aerospace Center,

More information

Towards a Low-Power Accelerator of Many FPGAs for Stencil Computations

Towards a Low-Power Accelerator of Many FPGAs for Stencil Computations 2012 Thir International Conference on Networking an Computing Towars a Low-Power Accelerator of Many FPGAs for Stencil Computations Ryohei Kobayashi Tokyo Institute of Technology, Japan E-mail: kobayashi@arch.cs.titech.ac.jp

More information

Variable Independence and Resolution Paths for Quantified Boolean Formulas

Variable Independence and Resolution Paths for Quantified Boolean Formulas Variable Inepenence an Resolution Paths for Quantifie Boolean Formulas Allen Van Geler http://www.cse.ucsc.eu/ avg University of California, Santa Cruz Abstract. Variable inepenence in quantifie boolean

More information

filtering LETTER An Improved Neighbor Selection Algorithm in Collaborative Taek-Hun KIM a), Student Member and Sung-Bong YANG b), Nonmember

filtering LETTER An Improved Neighbor Selection Algorithm in Collaborative Taek-Hun KIM a), Student Member and Sung-Bong YANG b), Nonmember 107 IEICE TRANS INF & SYST, VOLE88 D, NO5 MAY 005 LETTER An Improve Neighbor Selection Algorithm in Collaborative Filtering Taek-Hun KIM a), Stuent Member an Sung-Bong YANG b), Nonmember SUMMARY Nowaays,

More information

Lab work #8. Congestion control

Lab work #8. Congestion control TEORÍA DE REDES DE TELECOMUNICACIONES Grao en Ingeniería Telemática Grao en Ingeniería en Sistemas e Telecomunicación Curso 2015-2016 Lab work #8. Congestion control (1 session) Author: Pablo Pavón Mariño

More information

Authenticated indexing for outsourced spatial databases

Authenticated indexing for outsourced spatial databases The VLDB Journal (2009) 8:63 648 DOI 0.007/s00778-008-03-2 REGULAR PAPER Authenticate inexing for outsource spatial atabases Yin Yang Stavros Papaopoulos Dimitris Papaias George Kollios Receive: February

More information

d 3 d 4 d d d d d d d d d d d 1 d d d d d d

d 3 d 4 d d d d d d d d d d d 1 d d d d d d Proceeings of the IASTED International Conference Software Engineering an Applications (SEA') October 6-, 1, Scottsale, Arizona, USA AN OBJECT-ORIENTED APPROACH FOR MANAGING A NETWORK OF DATABASES Shu-Ching

More information

Image compression predicated on recurrent iterated function systems

Image compression predicated on recurrent iterated function systems 2n International Conference on Mathematics & Statistics 16-19 June, 2008, Athens, Greece Image compression preicate on recurrent iterate function systems Chol-Hui Yun *, Metzler W. a an Barski M. a * Faculty

More information

Socially-optimal ISP-aware P2P Content Distribution via a Primal-Dual Approach

Socially-optimal ISP-aware P2P Content Distribution via a Primal-Dual Approach Socially-optimal ISP-aware P2P Content Distribution via a Primal-Dual Approach Jian Zhao, Chuan Wu The University of Hong Kong {jzhao,cwu}@cs.hku.hk Abstract Peer-to-peer (P2P) technology is popularly

More information

Fault Simulation with Parallel Critical Path Tracing for Combinational Circuits Using Structurally Synthesized BDDs

Fault Simulation with Parallel Critical Path Tracing for Combinational Circuits Using Structurally Synthesized BDDs Fault Simulation with Parallel Critical Path Tracing for Combinational Circuits Using Structurall Snthesize BDDs Sergei Devaze, Jaan Rai, Artur Jutman, Raimun Ubar Tallinn Universit of Technolog, Estonia

More information

Improving Performance of Sparse Matrix-Vector Multiplication

Improving Performance of Sparse Matrix-Vector Multiplication Improving Performance of Sparse Matrix-Vector Multiplication Ali Pınar Michael T. Heath Department of Computer Science an Center of Simulation of Avance Rockets University of Illinois at Urbana-Champaign

More information

Inuence of Cross-Interferences on Blocked Loops: to know the precise gain brought by blocking. It is even dicult to determine for which problem

Inuence of Cross-Interferences on Blocked Loops: to know the precise gain brought by blocking. It is even dicult to determine for which problem Inuence of Cross-Interferences on Blocke Loops A Case Stuy with Matrix-Vector Multiply CHRISTINE FRICKER INRIA, France an OLIVIER TEMAM an WILLIAM JALBY University of Versailles, France State-of-the art

More information

A fast embedded selection approach for color texture classification using degraded LBP

A fast embedded selection approach for color texture classification using degraded LBP A fast embee selection approach for color texture classification using egrae A. Porebski, N. Vanenbroucke an D. Hama Laboratoire LISIC - EA 4491 - Université u Littoral Côte Opale - 50, rue Ferinan Buisson

More information

AnyTraffic Labeled Routing

AnyTraffic Labeled Routing AnyTraffic Labele Routing Dimitri Papaimitriou 1, Pero Peroso 2, Davie Careglio 2 1 Alcatel-Lucent Bell, Antwerp, Belgium Email: imitri.papaimitriou@alcatel-lucent.com 2 Universitat Politècnica e Catalunya,

More information

Yet Another Parallel Hypothesis Search for Inverse Entailment Hiroyuki Nishiyama and Hayato Ohwada Faculty of Sci. and Tech. Tokyo University of Scien

Yet Another Parallel Hypothesis Search for Inverse Entailment Hiroyuki Nishiyama and Hayato Ohwada Faculty of Sci. and Tech. Tokyo University of Scien Yet Another Parallel Hypothesis Search for Inverse Entailment Hiroyuki Nishiyama an Hayato Ohwaa Faculty of Sci. an Tech. Tokyo University of Science, 2641 Yamazaki, Noa-shi, CHIBA, 278-8510, Japan hiroyuki@rs.noa.tus.ac.jp,

More information

Top-down Connectivity Policy Framework for Mobile Peer-to-Peer Applications
