Efficient Parallel Hierarchical Clustering

Manoranjan Dash (1), Simona Petrutiu (2), and Peter Scheuermann (2)

(1) Department of Information Systems, School of Computer Engineering, Nanyang Technological University, Singapore
(2) Department of Electrical & Computer Engineering, Northwestern University, Evanston, IL 60208

Abstract. Hierarchical agglomerative clustering (HAC) is a common clustering method that outputs a dendrogram showing all N levels of agglomerations, where N is the number of objects in the data set. High time and memory complexities are some of the major bottlenecks in its application to real-world problems. In the literature, parallel algorithms are proposed to overcome these limitations. But, as this paper shows, existing parallel HAC algorithms are inefficient due to ineffective partitioning of the data. We first show how HAC follows a rule where most agglomerations have very small dissimilarity and only a small portion towards the end have large dissimilarity. Partially overlapping partitioning (POP) exploits this principle and obtains efficient yet accurate HAC algorithms. The total number of dissimilarities is reduced by a factor close to the number of cells in the partition. We present pPOP, the parallel version of POP, which is implemented on a shared memory multiprocessor architecture. Extensive theoretical analysis and experimental results are presented and show that pPOP gives close to linear speedup and outperforms the existing parallel algorithms significantly in both CPU time and memory requirements.

Keywords: hierarchical agglomerative clustering, partitioning, parallel algorithm, shared memory architecture.

Research of the third author on this project was supported by an NSF IIS grant.
M. Danelutto, D. Laforenza, M. Vanneschi (Eds.): Euro-Par 2004, LNCS 3149, 2004. © Springer-Verlag Berlin Heidelberg 2004

1 Introduction

Hierarchical agglomerative clustering (HAC) is often used in various applications due to its capability to output a dendrogram showing all agglomerations. Unlike K-means and other types of clustering where objects are clustered into a given number of clusters, a dendrogram can be used to get any number of clusters. HAC algorithms are non-parametric, natural and simple in grouping objects, and capable of finding clusters of different shapes by using different similarity measures. However, they are limited in their application to real-world problems mainly due to high CPU time and memory complexities. Existing algorithms take $O(N^2 \log N)$ CPU time and require $O(N^2)$ memory. Parallel algorithms are proposed to alleviate this limitation. Existing parallel algorithms either parallelize other clustering methods such as K-means (Dhillon and Modha [1]) and subspace clustering (Nagesh et al. [2]), or are not very efficient due to lack of performance enhancing partitioning [3].

In [4] we have shown that the complexities of the existing sequential HAC algorithms can be reduced significantly by an efficient partitioning scheme without losing accuracy. The proposed methods are based on the observation that in HAC most iterations agglomerate very small clusters separated by very small dissimilarity. Only a small number of iterations towards the end agglomerate the large clusters. Using this observation, a structure called partially overlapping partitioning (POP) divides the data into a number of overlapping cells. Analysis and experiments showed that POP-based sequential HAC algorithms reduce existing time and memory complexities by a factor close to the number of cells c.

In this paper we present parallel versions of POP, called pPOP. Due to the independent nature of each partitioned cell, parallelization is able to achieve a similar reduction in time and memory complexities as POP, i.e., by a factor close to the number of cells c. We implement pPOP over a shared memory architecture. Experimental evaluations show that for large data sets pPOP obtains near linear speedup. In addition, for stored matrix implementations, pPOP results in a two order of magnitude improvement in computation time over the existing parallel HAC algorithms.

2 Background

Let us assume that there are N objects, each with M attributes. We use real-valued data and the Euclidean ($L_2$) distance to measure dissimilarity. Other distance measures, e.g. Manhattan, can be used (see [4]).

The Rule: In an experiment we ran the centroid type HAC method over a 2-D data set with 100 clusters and some noise. In the centroid type, each cluster is represented by a centroid and the pair with the closest centroids is merged in each iteration. In Figure 1, we plot the closest pair distance for each iteration. Notice that most agglomerations, except for a small portion towards the end, have a very small closest pair distance compared to the maximum closest pair distance. This maximum distance is taken over all agglomerations. A plot of the size of the clusters merged in each iteration shows a similar shape. We experimented with many data sets having varying characteristics. For varying M, N (typically large, at least a few thousand objects), and K (number of clusters), the general trend is as follows: if a majority of the objects are inside clusters, then the distance plot has the shape shown in the figure. We call this a rule to convey the idea that in a dendrogram, most levels from the bottom merge pairs of very small clusters separated by a very small portion of the maximum closest pair distance. The rule extends to other HAC algorithms beyond the centroid method, for both the geometric and the graph metrics. Because of space constraints, we restrict all discussions in this paper to the centroid method.
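The observation above can be reproduced on a toy example. The following is a minimal sketch, not the authors' code: it runs a naive centroid HAC on a small synthetic 2-D data set and counts how many merges have a closest pair distance below a threshold. The function name `centroid_hac_distances`, the three-cluster data set, and the 6% threshold (borrowed from the annotation of Figure 1) are all assumptions made for illustration.

```python
import random
from math import dist, inf

def centroid_hac_distances(points):
    """Naive centroid HAC; returns the closest pair distance of every merge."""
    clusters = [(p, 1) for p in points]          # (centroid, size)
    merge_distances = []
    while len(clusters) > 1:
        # Brute-force search for the pair with the closest centroids.
        best = (inf, 0, 1)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(clusters[i][0], clusters[j][0])
                if d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        (ca, na), (cb, nb) = clusters[i], clusters[j]
        # The centroid of the merged cluster is the size-weighted mean.
        merged = tuple((na * a + nb * b) / (na + nb) for a, b in zip(ca, cb))
        clusters.pop(j)                          # j > i, so remove j first
        clusters.pop(i)
        clusters.append((merged, na + nb))
        merge_distances.append(d)
    return merge_distances

# Hypothetical toy data: three tight clusters of 10 points, far apart.
rng = random.Random(0)
centers = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
points = [(cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
          for cx, cy in centers for _ in range(10)]

distances = centroid_hac_distances(points)
threshold = 0.06 * max(distances)
small = sum(d < threshold for d in distances)
print(f"{small} of {len(distances)} merges fall below 6% of the maximum distance")
```

On this data the 27 within-cluster merges all have distances below 3, while the last two merges join clusters roughly 100 apart, so only the final two agglomerations exceed the threshold.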

Fig. 1. An important property of HAC: the distance plot (closest pair distance against iteration number, in %) shows that the closest pair distance stays very small until the last stage of agglomeration. The plot is annotated at 6% of the maximum merging distance.

See [4] for a detailed discussion of the rule and other metrics. Next we show how to exploit this inherent characteristic of HAC.

2.1 Partially Overlapping Partitioning (POP)

An axis-parallel POP divides the data space uniformly into c overlapping cells. The overlapping region is called the δ-region, where δ is the overlapping distance between two cells. Figure 2 depicts the axis-parallel POP. For the centroid metric (and other geometric metrics), if the representative point of a cluster falls in a δ-region, then each affected cell that contains this δ-region holds it; otherwise only one cell holds it.

Fig. 2. Partially Overlapping Partitioning (POP). The rule is exploited by POP for efficient HAC.
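The cell assignment just described can be sketched for a partition along a single axis. This is an illustration under stated assumptions rather than the layout used in the paper: here each cell is extended to the right by δ, so that the δ-region sits at the start of the next cell, and δ is assumed to be smaller than the cell width. The function name `cells_for` and its parameters are hypothetical.

```python
def cells_for(x, lo, hi, c, delta):
    """Indices of the overlapping cells that hold a point with coordinate x.

    Assumed layout: cell i nominally covers [lo + i*w, lo + (i+1)*w) with
    w = (hi - lo) / c, and is extended to the right by delta, so the
    delta-region shared by cells i and i+1 is
    [lo + (i+1)*w, lo + (i+1)*w + delta).
    Requires 0 < delta < w so that only neighbouring cells overlap.
    """
    w = (hi - lo) / c
    home = min(int((x - lo) // w), c - 1)      # clamp x == hi into the last cell
    held_by = [home]
    if home > 0 and x - (lo + home * w) < delta:
        held_by.insert(0, home - 1)            # x lies in a delta-region
    return held_by

print(cells_for(1.0, 0.0, 10.0, 5, 0.5))   # [0]
print(cells_for(2.3, 0.0, 10.0, 5, 0.5))   # [0, 1]
print(cells_for(2.5, 0.0, 10.0, 5, 0.5))   # [1]
```

With this layout, any two coordinates closer than δ share at least one cell, which is the property the accuracy argument of the 2-phase algorithm relies on.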

Before discussing POP any further, we very briefly describe some existing HAC algorithms. HAC algorithms are mainly of two types: stored matrix (e.g., dissimilarity matrix and priority queues) and stored data (e.g., nearest neighbor). The dissimilarity matrix method stores dissimilarities between each pair of clusters. When a pair is merged, dissimilarities are computed for the new cluster and the matrix is updated. The memory complexity of this method is $O(N^2)$ and the time complexity is $O(N^3)$. In the priority queue method a heap-based priority queue is maintained for each cluster. Because a priority queue requires $O(\log n)$ time for each insert and delete operation for n elements, the time complexity reduces to $O(N^2 \log N)$, although the memory complexity stays at $O(N^2)$. The nearest neighbor array method maintains nearest neighbors for each cluster in an array. If after each iteration the average number of clusters whose nearest neighbors need to be changed is α, then the time complexity reduces to $O(\alpha N^2)$ and the memory complexity reduces to $O(N)$. An upper bound for α is $3^M$. When memory is enough to store $O(N^2)$ dissimilarities, stored matrix algorithms are preferred as they do fewer computations. Otherwise, the stored data type is preferred.

2-Phase Algorithm: In [4] we proposed a new 2-phase algorithm for HAC based on the axis-parallel POP. In phase 1, clusters are partitioned into c overlapping cells. The basic idea is that in each iteration the closest pair is found for each cell, and from those the overall closest pair is found. If the overall closest pair distance is less than δ, then the pair is merged and the priority queues (or the dissimilarity matrix or the nearest neighbor array) of only the container cell are updated. If the closest pair or the merged cluster is in a δ-region, then the priority queues of the affected cells are also updated. Phase 1 terminates when the closest pair distance exceeds δ.
Phase 2 merges the remaining clusters of phase 1 using the existing algorithm, thus completing the dendrogram.

Accuracy: POP in phase 1 ensures that any pair with distance less than δ must reside together in at least one cell. Hence, as phase 2 is the existing algorithm itself, the 2-phase algorithm guarantees the correct dendrogram.

Complexity Analysis: By setting δ to the closest pair distance at the turning point of the distance plot (see Figure 1), a large number of small clusters are merged in phase 1 while only a small number of larger clusters are merged in phase 2. Recall that phase 1 uses POP, which is very efficient, whereas phase 2 uses the existing algorithm, which is not so efficient. In Figure 1, if δ is set to the turning point of the distance plot, 96% of the agglomerations from the beginning are merged in phase 1 and the remaining 4% in phase 2. Therefore, the overall computational time is reduced drastically. So, when δ is set to the turning point, the number of clusters remaining for phase 2 is very small and the total number of clusters in the δ-regions is also very small. To simplify the complexity analysis, we consider both quantities to be negligible. This is reasonable because the rule holds for all data sets that contain clusters. We assume equal cell size and equal δ-region size for each cell.

In [4] we give the detailed complexity analysis comparison between the existing and the 2-phase algorithms. Following is a brief overview of that. The stored matrix type, which requires $O(N^2)$ memory, now requires $O(N^2/c)$ in the 2-phase algorithm. Hence memory is reduced by a factor close to c. Because of this reduction, the 2-phase dissimilarity matrix algorithm, whose time complexity is dominated by the time required to create the matrix, enjoys a reduction by a factor close to c. The time complexity of the priority queue algorithm is dominated by the update effort required to maintain the priority queues. After each agglomeration of the closest pair, the priority queues of all other clusters are updated. But in the 2-phase algorithm this effort is restricted to the cell that holds the closest pair, and if the pair happens to be in a δ-region then it is restricted to the affected cells. So after simplification the reduction factor is $c \, \frac{\log N}{\log (N/c)}$, i.e., the time complexity reduces from $O(N^2 \log N)$ to $O\left(\frac{N^2}{c} \log \frac{N}{c}\right)$.

In the stored data type there is no reduction in the memory complexity of $O(N)$. The time complexity is dominated by the time required to update the nearest neighbors of the affected clusters. For the existing algorithm the time required to find the nearest neighbor of one affected cluster is $O(N)$, but for the 2-phase algorithm it is $O(N/c)$. So, the overall reduction factor is close to c.

Setting δ and c (Nested Algorithm): The performance of the 2-phase algorithm depends on c and δ. As shown in the distance plot of Figure 1, there exists an ideal δ at the turning point at which the total time taken by the 2-phase algorithm is minimum. But it is not straightforward to compute. So, we adopted a nested approach in which POP partitioning starts with a very small δ and gradually increases it until a few or just one cluster remain. As δ increases, c, which is set initially to a high value, is gradually reduced. The accuracy of this nested algorithm follows from the accuracy of the 2-phase algorithm. Experiments show that the nested algorithm is more efficient than the 2-phase algorithm even when δ is set ideally for the 2-phase algorithm.
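To make the reduction factors of the complexity analysis concrete, the calculation below works them out under the same simplifications as the text (equal cell sizes, negligible δ-regions, negligible phase 2). The values N = 30000 and c = 100 are illustrative assumptions, not settings reported in the paper.

```latex
% Assumptions: equal cell sizes, negligible \delta-regions and phase 2 work.
% N = 30000 and c = 100 are hypothetical values chosen for illustration.
\begin{align*}
\text{dissimilarities stored:}\quad
  & \frac{N^2}{2} \;\longrightarrow\; c \cdot \frac{(N/c)^2}{2} = \frac{N^2}{2c}
  && \text{(reduction factor } c\text{)} \\
\text{priority queue time:}\quad
  & \frac{N^2 \log N}{\dfrac{N^2}{c} \log \dfrac{N}{c}} = c \, \frac{\log N}{\log (N/c)} \\
\text{with } N = 30000,\ c = 100:\quad
  & 100 \cdot \frac{\ln 30000}{\ln 300} \approx 100 \cdot \frac{10.31}{5.70} \approx 181
\end{align*}
```

The logarithmic term therefore pushes the priority queue reduction somewhat above c, while the memory and dissimilarity matrix reductions stay close to c itself.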
For example, for the data set described in Section 2, the minimum time for the 2-phase algorithm is 15.4 CPU sec while the nested algorithm takes only 57.8 CPU sec.

Higher Dimensional Data: The above discussion focuses on 2-D data. For higher dimensions we proposed a very efficient data structure as a replacement for the axis-parallel partitioning. Due to space constraints we limit the scope of this paper to 2-D and refer the interested reader to [4].

3 pPOP Algorithms

Parallel HAC algorithms have been studied by Li [5], Li and Fang [6], Olson [3], and Wu et al. [7]. The common feature of these algorithms is that, for the stored matrix type, the task of computing and maintaining the $O(N^2)$ dissimilarities is divided among the processors, whereas for the stored data type the task of computing and maintaining the $O(N)$ nearest neighbors is divided among the processors. For example, Olson used p processors to reduce the time complexity of the dissimilarity matrix method to $O(N^3/p)$ and that of the priority queue method to $O(N^2 \log N / p)$ [3]. The time complexity for the nearest neighbor array method reduces to $O(\alpha N^2 / p)$. These algorithms are not very efficient because they still require $O(N^2)$ total memory for the stored matrix type, and in each iteration they have to update all the priority queues or the whole dissimilarity matrix. For the stored data type the existing methods need to check all the clusters after each agglomeration to determine whether the newly merged cluster is nearer than the previous nearest neighbor. So, the reduction in these parallel algorithms comes mostly from parallelization, not from efficient partitioning.

The advantage of POP is that each cell is sufficient by itself, and hence parallelization benefits by dividing the task of creating and maintaining the dissimilarities, priority queues, or nearest neighbors of each cell among the processors. This drastically reduces the total computation of searching for the closest pair and maintaining the data structure. Below we give the complexities of the sequential, existing parallel, and pPOP algorithms. For the complexity analysis we select the 2-phase algorithm of the stored matrix type since, as we shall show later, this algorithm achieves larger speedups compared to the existing algorithms. As before, we assume equal cell sizes, negligible δ-region size, and negligible phase 2 time. Among existing algorithms, those described by Olson [3] are selected. The number of processors is denoted by p.

Table 1. Comparison of time complexities of sequential, existing parallel, and pPOP algorithms. RF: Reduction Factor (= Existing Parallel / pPOP).

Priority Queues                  Sequential      Existing Parallel     pPOP                      RF
1. Create priority queues        O(N^2)          O(N^2 / p)            O(N^2 / (c p))
2. for n = N to 2                O(N)            O(N)                  O(N)
3.   find smallest distance      O(n)            O(n / p)              O(n / p)
4.   merge and update P          O(n log n)      O((n log n) / p)      O((n log(n/c)) / p)
Overall                          O(N^2 log N)    O((N^2 log N) / p)    O((N^2 log(N/c)) / p)     log N / log(N/c)
Overall (Dissimilarity Matrix)   O(N^3)          O(N^3 / p)            O(N^3 / (c p))            c

In Table 1 (priority queues), step 1 of pPOP computes the priority queues in $O(N^2/(cp))$ time.
Recall that POP reduces the memory by a factor of c, i.e., to $O(N^2/c)$. pPOP divides the total computation for the c cells among p processors, and hence, assuming no synchronization delays, the complexity becomes $O(N^2/(cp))$. Step 4 updates the priority queues of the affected clusters. In pPOP a priority queue holds N/c elements in the beginning. Hence, due to parallelization, the total time complexity of this step is $O\left(\frac{n}{p} \log \frac{n}{c}\right)$. So, the overall reduction factor is $\frac{\log N}{\log (N/c)}$.

Table 1 shows the overall complexities for the dissimilarity matrix type as well. It has a reduction factor close to c. The memory requirement for the priority queue and dissimilarity matrix types is reduced by a factor close to c. For the nearest neighbor type, the gain of pPOP over the existing parallel algorithms cannot be obtained directly from the complexity analysis. For the step where each cluster is checked to find whether it is affected by the agglomeration, pPOP needs to do it for one cell (or a few, if in a δ-region) whereas the existing algorithm needs to do it for all clusters. Similarly, the existing algorithm needs to check all the clusters to find the new nearest neighbor of each affected cluster, but pPOP requires only the container cell to be checked. Experimental results in the next section show that pPOP outperforms the existing algorithms substantially for all three of the above types of HAC.

4 Experimental Results

We performed a number of experiments to study the performance and scalability of our proposed pPOP algorithms. Both the stored matrix (priority queues) and stored data (nearest neighbors) types of pPOP were implemented using the 2-phase algorithm. For comparison purposes we implemented the corresponding existing parallel algorithms, hereafter denoted as existing algorithms. These are described in [3]. The performance was measured in terms of CPU time, memory space, and speedup. We experimented using several real, benchmark, and artificial data sets. Due to space constraints we show the results over an artificial data set that is used in [8]. Other results are available from manoranj/research.html.

The experiments were run on the SGI Origin2000 multiprocessor system, which is a shared memory machine consisting of 8 R10000 CPUs running at a clock rate of 195 MHz. The secondary cache size is 4 MB. We used OpenMP, which is an API for directive-based parallel programming of applications in a shared memory environment [9]. We decided to use it because it is designed for fine-grained parallelism, which was predominant in our algorithm. The pPOP implementation in OpenMP uses the guided self-scheduling clause in the assignment of iterations to threads, i.e., processors. During each iteration of HAC each processor is assigned in turn a chunk of cells to work on, with the chunk size being reduced as we proceed with the iteration.
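The division of cells among workers within one HAC iteration can be sketched as follows. This is only a structural illustration, not the OpenMP implementation described above: it swaps in Python's `ThreadPoolExecutor`, which hands out cells one at a time rather than in guided chunks, and CPython threads do not give true CPU parallelism for this kind of work. The function names and the toy cells are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations
from math import dist, inf

def closest_pair_in_cell(cell):
    """Brute-force closest pair among the centroids held by one cell."""
    best = (inf, None, None)
    for a, b in combinations(cell, 2):
        d = dist(a, b)
        if d < best[0]:
            best = (d, a, b)
    return best

def overall_closest_pair(cells, workers=4):
    """Search the cells concurrently, then reduce to the overall closest pair."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_cell = list(pool.map(closest_pair_in_cell, cells))
    # Sequential reduction over the per-cell results.
    return min(per_cell, key=lambda t: t[0])

# Hypothetical cells, each holding a few cluster centroids.
cells = [
    [(0, 0), (3, 4), (10, 10)],
    [(20, 20), (20, 21)],
    [(50, 50)],                  # a single centroid has no pair
]
print(overall_closest_pair(cells))   # (1.0, (20, 20), (20, 21))
```

Each cell is searched independently, which is what makes the per-cell work easy to distribute; only the final reduction over the per-cell results needs to be serialized.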
After an iteration is finished, a critical region is established in order to find the overall closest pair of clusters and merge them. The priority queues of the cells affected by the agglomeration can be updated in parallel.

In Figure 3 we show the results over the synthetic data set, whose size varies from 3K to 60K. The existing stored matrix algorithms require $O(N^2)$ memory, hence we could experiment only with data sizes up to 5K; for pPOP, on the other hand, we report results for data sets up to 30K. The number of processors varies from 1 to 8. In Figure 3 (a-b) we report the speedups of pPOP. Although the speedup of pPOP is small for smaller data sets, we observe that for larger data sets (30K or higher) the speedup of pPOP improves substantially and approaches linear speedup for data sets of 60K. Figure 3 (c-d) gives the relative speedup of pPOP over the existing algorithm. pPOP is always superior to the existing algorithm because of its efficient partitioning and the independent nature of each cell. The relative speedup increases with data size. Among the stored matrix and stored data types, pPOP's performance is much better for stored matrix. It achieves a two order of magnitude improvement in computation time over the existing algorithm.

[Figure 3 panels: (a) Stored Matrix pPOP Algorithm and (b) Stored Data pPOP Algorithm plot Speedup against Number of Processors, with an ideal speedup line and one curve per data size (3K, 5K, 10K, 15K, and 30K points in (a); 10K, 15K, 30K, and 60K points in (b)). (c) Stored Matrix and (d) Stored Data plot Relative Speedup = CPU Time Existing / CPU Time pPOP against Number of Processors.]

Fig. 3. Synthetic data results: for the stored matrix and stored data types, and for a varying number of processors (1 to 8), (a-b) show the performance of pPOP, and (c-d) show Relative Speedup = Existing / pPOP.

As shown in Figure 3 (c-d), the relative speedup, Existing CPU time / pPOP CPU time, decreases as the number of processors increases. This is because, for a small number of cells, when the number of processors is increased, some processors end up working on cells containing a very small number of clusters, and therefore spend a lot of time idle once they are done with the computation in a given iteration. However, as the data set size and/or the number of clusters increases, load balancing among the processors becomes better. This phenomenon can be observed in our figures. Although for both the 3K and 5K sizes of the stored matrix type the relative speedup drops by approximately the same amount (85) when the number of processors is increased from 1 to 8, the noticeable fact is that the relative speedup for 1 processor is 375 for the 3K size but 460 for the 5K size. That is to say, with increasing data size, the rate of drop in relative speedup as processors are added decreases. Although the high memory requirement of the existing parallel algorithms prevented us from testing larger data sizes, we postulate that for larger data sets this reduction in relative speedup with more processors will continue to slow down.

We compared the memory for the stored matrix type. For 3K and 5K, pPOP reduced the memory requirement by a factor of 97 and 189 respectively. For the stored data type both algorithms require a similar amount of memory.

5 Conclusion and Future Directions

In this paper we proposed pPOP for efficient parallel HAC. Analysis and experiments showed that, for both stored matrix and stored data types, pPOP outperforms the existing algorithms significantly in both CPU time and memory requirements. This is achieved by exploiting a rule of HAC which states that in a dendrogram, most levels from the bottom merge pairs of very small clusters separated by a very small portion of the maximum closest pair distance. The data space was partitioned into partially overlapping cells, each of which could be processed independently of the other cells without affecting accuracy. Future work includes parallelizing the high-dimensional data structure.

References

1. Dhillon, I.S., Modha, D.M.: Large-scale parallel data mining. Lecture Notes in Artificial Intelligence 1759 (2000)
2. Nagesh, H., Goil, S., Choudhary, A.: PMAFIA: A scalable parallel subspace clustering algorithm for massive datasets. In: Proc. International Conference on Parallel Processing (2000)
3. Olson, C.F.: Parallel algorithms for hierarchical clustering. Parallel Computing 21 (1995)
4. Dash, M., Liu, H., Scheuermann, P., Tan, K.L.: Fast hierarchical clustering and its validation. Data and Knowledge Engineering 44(1) (2003)
5. Li, X.: Parallel algorithms for hierarchical clustering and cluster validity. IEEE Transactions on Pattern Analysis and Machine Intelligence 12 (1990)
6. Li, X., Fang, Z.: Parallel clustering algorithms. Parallel Computing 11 (1989)
7. Wu, C.H., Horng, S.J., Tsai, H.R.: Efficient parallel algorithms for hierarchical clustering on arrays with reconfigurable optical buses. Journal of Parallel and Distributed Computing 60 (2000)
8. Zhang, T., Ramakrishnan, R., Livny, M.: BIRCH: An efficient data clustering method for very large databases. In: Proceedings of ACM SIGMOD Conference on Management of Data, Montreal, Canada (1996)
9. Chandra, R., Dagum, L., Kohr, D., Maydan, D., McDonald, J., Menon, R., eds.: Parallel Programming in OpenMP. Morgan Kaufmann Publishers (2000)


More information

level 0 level 1 level 2 level 3

level 0 level 1 level 2 level 3 Communication-Ecient Deterministic Parallel Algorithms for Planar Point Location and 2d Voronoi Diagram? Mohamadou Diallo 1, Afonso Ferreira 2 and Andrew Rau-Chalin 3 1 LIMOS, IFMA, Camus des C zeaux,

More information

Privacy Preserving Moving KNN Queries

Privacy Preserving Moving KNN Queries Privacy Preserving Moving KNN Queries arxiv:4.76v [cs.db] 4 Ar Tanzima Hashem Lars Kulik Rui Zhang National ICT Australia, Deartment of Comuter Science and Software Engineering University of Melbourne,

More information

Ad Hoc Networks. Latency-minimizing data aggregation in wireless sensor networks under physical interference model

Ad Hoc Networks. Latency-minimizing data aggregation in wireless sensor networks under physical interference model Ad Hoc Networks (4) 5 68 Contents lists available at SciVerse ScienceDirect Ad Hoc Networks journal homeage: www.elsevier.com/locate/adhoc Latency-minimizing data aggregation in wireless sensor networks

More information

Lecture 3: Geometric Algorithms(Convex sets, Divide & Conquer Algo.)

Lecture 3: Geometric Algorithms(Convex sets, Divide & Conquer Algo.) Advanced Algorithms Fall 2015 Lecture 3: Geometric Algorithms(Convex sets, Divide & Conuer Algo.) Faculty: K.R. Chowdhary : Professor of CS Disclaimer: These notes have not been subjected to the usual

More information

Earthenware Reconstruction Based on the Shape Similarity among Potsherds

Earthenware Reconstruction Based on the Shape Similarity among Potsherds Original Paer Forma, 16, 77 90, 2001 Earthenware Reconstruction Based on the Shae Similarity among Potsherds Masayoshi KANOH 1, Shohei KATO 2 and Hidenori ITOH 1 1 Nagoya Institute of Technology, Gokiso-cho,

More information

Lecture 8: Orthogonal Range Searching

Lecture 8: Orthogonal Range Searching CPS234 Comutational Geometry Setember 22nd, 2005 Lecture 8: Orthogonal Range Searching Lecturer: Pankaj K. Agarwal Scribe: Mason F. Matthews 8.1 Range Searching The general roblem of range searching is

More information

Leak Detection Modeling and Simulation for Oil Pipeline with Artificial Intelligence Method

Leak Detection Modeling and Simulation for Oil Pipeline with Artificial Intelligence Method ITB J. Eng. Sci. Vol. 39 B, No. 1, 007, 1-19 1 Leak Detection Modeling and Simulation for Oil Pieline with Artificial Intelligence Method Pudjo Sukarno 1, Kuntjoro Adji Sidarto, Amoranto Trisnobudi 3,

More information
