An Efficient Clustering Algorithm for Moving Object Trajectories
3rd International Conference on Computational Techniques and Artificial Intelligence (ICCTAI'2014), Feb. 2014, Singapore

Hnin Su Khaing and Thandar Thein

Abstract: With the increasing and continuous diffusion of low-cost GPS devices, analyzing moving object trajectory data has become a challenge. Analyzing moving object trajectories requires a mechanism for clustering moving objects effectively. Trajectory clustering has long been an important research direction in movement mining, but it remains unclear which of the existing algorithms is most effective. In this paper, we propose a clustering algorithm based on Density-Based Spatial Clustering of Applications with Noise (DBSCAN). DBSCAN cannot cluster data sets well when densities differ widely. We address this problem with a proposed clustering algorithm that enhances DBSCAN by reducing its time consumption. Finally, we evaluate the proposed trajectory clustering algorithm on a real trajectory dataset by comparing it with DBSCAN. Evaluation results show that the proposed clustering algorithm provides better performance and lower error than DBSCAN.

Keywords: DBSCAN, MoveMine, Moving Object Trajectory, Trajectory Clustering.

I. INTRODUCTION

With the widespread use of location-aware devices such as mobile phones and GPS-enabled devices, a huge amount of moving object data has been collected. This has led to a growing research area concerned with the automatic analysis of animal behavior and traffic management using computer vision techniques. Many researchers pay close attention to trajectory data modeling, indexing, and query processing for trajectories, and propose new models specifically dedicated to moving objects and their trajectories.
Based on this motivation, the MoveMine system was designed to discover various kinds of movement patterns and knowledge in numerous applications such as traffic control, climatological forecasting, and animal movement analysis. For instance, animal migration demonstrates that the movements of creatures are temporally and spatially correlated. In biological domains, many researchers have discovered that some wild animals form large social groups when migration occurs. The study of animals' social behavior and wildlife migration is more concerned with the movement patterns of a group of animals than with those of each individual. The MoveMine system integrates two functions: moving object pattern mining and trajectory mining.

Hnin Su Khaing is with the University of Computer Studies, Mandalay, Myanmar (hninsukhaing@gmail.com). Thandar Thein is with the University of Computer Studies, Yangon, Myanmar (thandartheinn@gmail.com).

Trajectory data associated with moving objects has increased considerably in volume. This growth makes it challenging to find moving animals that belong to the same group. Trajectory clustering is part of trajectory mining, and many algorithms based on data mining techniques exist. In general, many data mining methods have been developed for analyzing moving animals, and they differ in the nature of the method. In particular, the task of clustering is to find objects that have moved in a similar way. DBSCAN is one of the algorithms for clustering trajectory data. It can find a number of clusters starting from the estimated density distribution of the corresponding nodes, but it cannot cluster well when densities are very large. The goal of this work is to propose an efficient clustering algorithm that solves this problem of DBSCAN for moving object trajectories. The algorithm is composed of three phases: partitioning, clustering, and grouping. In the partitioning phase, we divide the trajectory data into k partitions.
Then, we develop the clustering phase by exploiting DBSCAN, and finally we group the separated clusters. The rest of this paper is organized as follows. Section II presents related work, and Section III describes the background theory for trajectory clustering. Section IV discusses the proposed clustering algorithm, and Section V presents the evaluation. Finally, Section VI concludes the paper.

II. RELATED WORK

Trajectory clustering plays a major role in moving object trajectory mining. There are many studies of trajectory data in areas such as transportation management and behavioral analysis. The author of [3] studied the similarity between the trajectory sets of moving objects. He designed a similarity metric between trajectory sets, where each set is generated by one moving object, and based on this measure he proposed an algorithm to cluster trajectory sets. To demonstrate the effectiveness and efficiency of his algorithm, he conducted intensive experiments using mobile phone data. To reduce the need to estimate complex parameters, as well as the complexity and computational cost for the human analyst, a vector field k-means clustering technique was proposed in [8]. It brought together ideas from visualization [2], data clustering, and scalar field design to find a locally optimal clustering, and the authors demonstrated how it can find global patterns and handle partial trajectories. An extended k-means technique for clustering moving objects was proposed in [9]. The authors use direction as a heuristic to determine the number of clusters and the silhouette coefficient as a measure of the quality of their approach, and they showed its performance and accuracy on both real and synthetic datasets. The authors of [1] presented a density-based k-nearest neighbors clustering algorithm for trajectory data that resolves the problem of sensitive user-defined parameters in DBSCAN. This clustering method has three main features: it discovers clusters of arbitrary shape; it has a strong ability to handle noise; and its input parameter is easy to set, with a recommended value that is more accurate than others. They used two real datasets of moving vehicles in Milan (Italy) and Athens (Greece) and conducted extensive experiments. To predict the locations of moving objects, the work in [10] clustered periodic trajectories using a compact representation of spatio-temporal trajectories. The authors suggested an algorithm that uses cluster centroids to predict future locations, tested it on real-world data, and evaluated the precision and recall of the results. A new partition-and-group framework for trajectory clustering (TRACLUS) was proposed in [4]. In this algorithm, a trajectory is partitioned into a set of line segments, and similar line segments are then grouped together into a cluster. For the partitioning algorithm, the authors used the minimum description length (MDL) principle. They demonstrated that TRACLUS correctly discovers common sub-trajectories from real trajectory data. In this paper, a new clustering algorithm is proposed, and we show that it is more efficient and effective than others through a comparison on a real-world trajectory dataset.

III. PRELIMINARY CONCEPTS

Despite the growing demand from diverse applications, few scalable tools are available for mining massive and sophisticated moving object data. The MoveMine system has two categories based on the nature of the methods: pattern mining and trajectory mining [1].

A. Pattern Mining

The first category is moving object pattern mining, which emphasizes the analysis of discrete locations with temporal information [11]. It includes the swarm pattern, the periodic pattern, and the follower pattern, as shown in Fig. 1.

Fig. 1 Pattern Mining

B. Trajectory Mining

Trajectory mining, shown in Fig. 2, focuses more on mining trajectories associated with geometric shapes, such as clustering hurricane paths across years and finding outliers among them [11]. Trajectory clustering is the process of grouping a set of physical or abstract objects into classes of similar objects by applying clustering algorithms such as k-means or k-nearest neighbors, depending on the trajectory dataset. A trajectory outlier is an object that is different from or inconsistent with the remaining data. Outliers can be detected by distribution-based, distance-based, density-based, and deviation-based algorithms [6]. Trajectory classification is the construction of a model for predicting the class labels of moving objects based on their trajectories and other features.

Fig. 2 Trajectory Mining

C. Clustering Techniques

Clustering is a dynamic field of research in data mining and an unsupervised learning process, because there are no class labels to help. A cluster is a collection of data objects that are similar to one another within the same cluster and dissimilar to the objects in other clusters. A cluster of data objects can be treated collectively as one group and so may be considered a form of data compression. In general, the major clustering methods can be classified into the following categories.
A partitioning method first creates an initial set of k partitions, where the parameter k is the number of partitions to construct. It then uses an iterative relocation technique that attempts to improve the partitioning by moving objects from one group to another. Typical partitioning methods include k-means, k-medoids, and CLARANS. A hierarchical method creates a hierarchical decomposition of the given set of data objects. The method can be classified as either agglomerative (bottom-up) or divisive (top-down), based on how the hierarchical decomposition is formed. A density-based method clusters objects based on the notion of density. It grows clusters either according to the density of neighborhood objects (as in DBSCAN) or according to some density function (as in DENCLUE). A grid-based method first quantizes the object space into a finite number of cells that form a grid structure, and then performs clustering on the grid structure. A model-based method hypothesizes a model for each of the clusters and finds the best fit of the data to that model [5].

D. Distance Measure

In our analysis scenario, we evaluate the distance between the latitude and longitude of points using the Euclidean distance [5] shown in equation (1), where $X_1 = (x_{11}, x_{12}, \ldots, x_{1n})$ and $X_2 = (x_{21}, x_{22}, \ldots, x_{2n})$.

\[ \mathrm{dist}(X_1, X_2) = \sqrt{\sum_{i=1}^{n} (x_{1i} - x_{2i})^2} \tag{1} \]

IV. PROPOSED EFFICIENT CLUSTERING ALGORITHM

The proposed trajectory clustering algorithm consists of three phases: partitioning, clustering, and grouping. First, we perform the partitioning phase by decomposing the trajectory data into k partitions. Second, we apply the clustering phase to each partition. In the grouping phase, we regroup the separated clusters. The architecture of the proposed clustering algorithm is shown in Fig. 3.

Fig. 3 System Flow for Proposed Clustering Algorithm

A. Partitioning

First, we perform the partitioning phase on the trajectory dataset in order to improve the efficiency of our algorithm. DBSCAN [7] spends considerable time on the similarity measure; to reduce this computation time, we enhance it by dividing the data into k partitions. This phase mainly targets huge amounts of data, and it requires a parameter k for the number of partitions.

B. Clustering

After partitioning the trajectory data, we apply the clustering algorithm. Given the k partitions from the previous step, we apply DBSCAN to cluster each partition. It needs two parameters: epsilon (eps), the distance within which we form a cluster, and the minimum number of points (MinPts) in each cluster. This phase starts with an arbitrary point that has not been visited and computes the similarity using the Euclidean distance in (1) to find the neighbor points (NeighborPts) within eps. If the number of neighbors is less than MinPts, we mark the point as noise. To expand the cluster, we find the points within a similar distance of NeighborPts. All points found in the eps-neighborhood are added to the cluster (C). This process continues until the connected cluster is completely found. The complete algorithm is shown in Fig. 4.

    Algorithm: Efficient Clustering Algorithm
    Input:  number of partitions K, epsilon eps, minimum points MinPts,
            threshold t, trajectory dataset D
    Output: set of trajectory clusters

    Set C to be empty;
    Partition(D, K);
    Grouping(t);

    /* PARTITIONING PHASE */
    Partition(D, K)
      for each (k in K)                      // partition the data into k parts
        Clustering(D, eps, MinPts, t);       // cluster the current partition

    /* CLUSTERING PHASE */
    Clustering(D, eps, MinPts, t)
      for each unvisited point P in D do     // randomly selected
        mark P as visited;
        NeighborPts = regionquery(P, eps);   // find the neighbor points using the distance function
        if (sizeof(NeighborPts) < MinPts) then
          mark P as Noise;
        else
          C = next cluster;
          expandcluster(P, NeighborPts, C, eps, MinPts);

    function expandcluster(P, NeighborPts, C, eps, MinPts)
      add P to C;
      for each (P' in NeighborPts) do
        if (P' is not visited) then
          mark P' as visited;
          NeighborPts' = regionquery(P', eps);   // find the neighbor points using the distance function
          if (sizeof(NeighborPts') >= MinPts) then
            NeighborPts = NeighborPts joined with NeighborPts';
        if (P' is not yet a member of any cluster) then
          add P' to C;
      return;

    function regionquery(P, eps)
      calculate the Euclidean distance from P to every point;
      return all points within P's eps-neighborhood;

    /* GROUPING PHASE */
    Grouping(t)
      for each (c in C)
        mean(c);                             // calculate the mean value of each cluster
        diff = difference between the mean value of c and that of the previous c;
        if (diff < t) then
          c = join the two clusters;
      return joined clusters;

Fig. 4 Proposed Clustering Algorithm
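The three phases listed in Fig. 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (which the paper says was written in Java). The paper does not say how the k partitions are formed or how the "difference of mean values" is measured, so the sketch assumes points are sorted by coordinate and cut into k nearly equal slices, and that two clusters are joined when the Euclidean distance between their centroids is below t. All function names are hypothetical.

```python
import math


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], using Euclidean distance (1)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Standard DBSCAN on one partition; returns clusters as lists of points (noise dropped)."""
    NOISE = -1
    labels = [None] * len(points)           # None means "not visited yet"
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE
            continue
        labels[i] = cluster_id
        queue = list(neighbors)
        while queue:                         # the expandcluster step of Fig. 4
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id       # former noise becomes a border point
            elif labels[j] is None:
                labels[j] = cluster_id
                j_neighbors = region_query(points, j, eps)
                if len(j_neighbors) >= min_pts:
                    queue.extend(j_neighbors)
        cluster_id += 1
    clusters = [[] for _ in range(cluster_id)]
    for p, label in zip(points, labels):
        if label != NOISE:
            clusters[label].append(p)
    return clusters


def partition(points, k):
    """ASSUMPTION: sort by coordinate and cut into k nearly equal slices."""
    ordered = sorted(points)
    size = max(1, math.ceil(len(ordered) / k))
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def centroid(cluster):
    """Mean value of a cluster, computed per coordinate."""
    return tuple(sum(axis) / len(cluster) for axis in zip(*cluster))


def group(clusters, t):
    """ASSUMPTION: greedily join clusters whose centroids are closer than t."""
    merged = []
    for c in clusters:
        for g in merged:
            if math.dist(centroid(g), centroid(c)) < t:
                g.extend(c)
                break
        else:
            merged.append(list(c))
    return merged


def efficient_clustering(points, k, eps, min_pts, t):
    """Partition, run DBSCAN on each partition, then group the resulting clusters."""
    clusters = []
    for part in partition(points, k):
        clusters.extend(dbscan(part, eps, min_pts))
    return group(clusters, t)
```

For example, six collinear points `(0, 0)` through `(5, 0)` with `k=2`, `eps=1.1`, `min_pts=2` yield one cluster per partition, with centroids 3 units apart; a threshold `t=3.5` joins them into a single cluster, while `t=3.0` keeps them separate. Because each region query scans only its own partition, the quadratic distance work is done on k smaller sets rather than on the whole dataset, which is the saving the partitioning phase aims for.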
C. Grouping

Now we present the grouping of the resulting clusters. To improve the effectiveness of the clustering algorithm, we group the clusters produced in each partition. This phase is necessary to prevent clusters that were spread across partitions from being left out of a dense region. Because the partitioning phase can spread clusters, uncertain clusters may be produced. In this phase, we calculate the mean value of each cluster and compare it with those of the other clusters. We need to define a threshold (t) for grouping two or more clusters. If the difference is less than the threshold, we group the clusters. We verify the effectiveness of our algorithm by measuring the Sum Squared Error (SSE).

V. EXPERIMENTAL EVALUATION

In this section, we evaluate the proposed clustering algorithm on a trajectory data set and compare it with DBSCAN. We also describe the data set used in the experiments and discuss the experimental results.

A. Experimental Study

An animal trajectory dataset is used to assess the effectiveness of the proposed clustering algorithm. It was generated by the Starkey project. This data set contains the radio-telemetry locations (with other information) of elk, deer, and cattle from the years 1993 through 1996. We use the elk movements in 1993, the deer movements in 1995, and the cattle movements in [...]. Elk has 33 trajectories and [...] points; deer has 32 trajectories and 265 points; cattle has 41 trajectories and [...] points. The points have coordinates defined in Universal Transverse Mercator (UTM), and the data has a number of fields such as UTMGrid, UTMGridEast, and UTMGridNorth [13]. We extract the x (UTMGridEast) and y (UTMGridNorth) coordinates from the telemetry data for our experiments. We evaluate the proposed clustering algorithm by comparing it with DBSCAN on the trajectory data.

B. Performance Metrics

We show the computation time for varying data sizes of the animal trajectory data by comparing DBSCAN and the proposed clustering algorithm. In our study, we find that changing the data size affects the number of clusters. We also measure the clustering quality using the Sum Squared Error (SSE). To measure the clustering quality independently of the features used for clustering and the number of clusters produced, our analysis uses the SSE in (2).

\[ \mathrm{SSE} = \sum_{i=1}^{num_{clus}} \left( \frac{1}{2|C_i|} \sum_{x \in C_i} \sum_{y \in C_i} \mathrm{dist}(x, y)^2 \right) \tag{2} \]

We conduct the experiments on a Core i7 machine with 8 GB of main memory running Windows 7. We implement our algorithm in JDK 1.7 with Eclipse Juno.

C. Result Discussion

The experiment studies the effect of changing the trajectory data size on the clustering computation time of both DBSCAN and the proposed algorithm. In this experiment, we find that our clustering algorithm performs well on large datasets. As the data size increases, the proposed algorithm needs less computation time, whereas the DBSCAN algorithm takes more time to cluster all objects. Fig. 5 shows that the performance gain is more significant on large datasets. Even as the data size changes, the running time of our algorithm changes only slightly.

Fig. 5 Performance Comparison of DBSCAN and Proposed Algorithm (x-axis: animal trajectory data size; y-axis: time in milliseconds)

Fig. 6 shows the SSE values of the proposed algorithm. We observe that the SSE of our algorithm differs drastically from that of DBSCAN: the error of the proposed algorithm is lower. A small SSE indicates that our algorithm clusters the data correctly.

Fig. 6 Sum Squared Error of DBSCAN vs. the proposed algorithm (x-axis: animal trajectory data size; y-axis: SSE)

We also study how changing the data size affects the number of clusters. A small number of clusters means an increase in cluster size. Our algorithm clusters well regardless of changes in data size. We find that DBSCAN depends on the data size because of its cluster expansion, so it has a longer computation time and a larger number of clusters. We address these problems with an efficient clustering algorithm for large trajectory datasets.
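The SSE measure in (2), on which the quality comparison above relies, can be sketched in a few lines of Python. This is an illustration only: the function name `sse` and the representation of a cluster as a list of coordinate tuples are assumptions, and the pairwise form is evaluated directly rather than optimized.

```python
import math


def sse(clusters):
    """Sum Squared Error of a clustering, following the pairwise form of (2).

    `clusters` is a list of clusters, each a list of coordinate tuples.
    For every cluster, all ordered pairs (x, y) contribute dist(x, y)^2,
    and the total is divided by 2 * |C_i|.
    """
    total = 0.0
    for cluster in clusters:
        if not cluster:                 # skip empty clusters to avoid division by zero
            continue
        pair_sum = sum(math.dist(x, y) ** 2 for x in cluster for y in cluster)
        total += pair_sum / (2 * len(cluster))
    return total
```

For a cluster with the two points `(0, 0)` and `(2, 0)`, the ordered pairs contribute 0 + 4 + 4 + 0 = 8, and dividing by 2 * 2 gives 2.0, which equals the sum of squared distances of both points to their centroid `(1, 0)`. Lower values therefore indicate tighter clusters.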
Fig. 7 Accuracy of DBSCAN vs. the proposed algorithm (x-axis: data size; y-axis: number of clusters)

D. Effect of Parameter Values

We study the effect of changing the value of the parameter eps on the clustering result. If we use a smaller eps, we discover a larger number of clusters. However, if the value of eps is less than 3, we find that only [...] cluster is discovered by the DBSCAN algorithm. We have tested the effects of varying parameter values for both algorithms. To study the effect of the epsilon value on the number of clusters, we conduct the experiment with various epsilon values. According to the experimental results shown in Fig. 8, when the epsilon value is less than 45, the number of clusters is smaller. When the epsilon value is between 45 and 55, the optimal number of clusters is achieved. When the epsilon value is greater than 55, the number of clusters decreases. Fig. 8 shows the clustering results for epsilon values between 35 and 125.

Fig. 8 Effect of eps values on number of clusters (x-axis: epsilon (eps); y-axis: number of clusters)

VI. CONCLUSION

In this paper, we propose an efficient clustering algorithm for trajectory data. It is composed of three phases: partitioning, clustering, and grouping. The DBSCAN clustering algorithm cannot cluster well at very large densities, and its distance calculation is time consuming. To overcome the time consumption issue, we first partition the dataset and then cluster the trajectories by applying the DBSCAN algorithm to each partition. Finally, we perform the grouping phase to integrate the spread clusters. To evaluate the effectiveness of the proposed algorithm, we conducted a performance evaluation and analyzed the results by comparing the proposed algorithm with DBSCAN.

REFERENCES

[1] A. K. Akasapu, P. S. Rao, L. K. Sharma, and S. K. Satpathy, "Density Based k-Nearest Neighbors Clustering Algorithm for Trajectory Data", International Journal of Advanced Science and Technology, Vol. 31, June 2011.
[2] G. McArdle, A. Tahir, and M. Bertolotto, "Spatio-Temporal Clustering of Movement Data: An Application to Trajectories Generated by Human-Computer Interaction", ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume I-2, 2012, XXII ISPRS Congress, 25 August - 1 September 2012, Melbourne, Australia.
[3] J. Dai, "A Novel Moving Object Trajectories Clustering Approach for Very Large Datasets", in: Proceedings of the 2nd International Conference on Computer Science and Electronic Engineering (ICCSEE 2013).
[4] J.-G. Lee, J. Han, and K.-Y. Whang, "Trajectory Clustering: A Partition-and-Group Framework", in SIGMOD '07: Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data. New York, NY, USA: ACM, 2007.
[5] J. Han and M. Kamber, "Data Mining: Concepts and Techniques", 2nd edition, Morgan Kaufmann, p. 348 and 398, 2006.
[6] J.-G. Lee, J. Han, and X. Li, "Trajectory Outlier Detection: A Partition-and-Detect Framework", IEEE International Conference on Data Engineering (ICDE 2008), April 7-12, 2008.
[7] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu, "A density-based algorithm for discovering clusters in large spatial databases", in: Proceedings of the 1996 International Conference on Knowledge Discovery and Data Mining (KDD '96), Portland, OR, Aug. 1996.
[8] N. Ferreira, J. Klosowski, C. E. Scheidegger, and C. T. Silva, "Vector Field k-Means: Clustering Trajectories by Fitting Multiple Vector Fields", Eurographics Conference on Visualization (EuroVis) 2013, Volume 32 (2013), Number 3.
[9] O. Omnia, H. M. O. Mokhtar, and M. E. El-Sharkawi, "An extended k-means technique for clustering moving objects", Egyptian Informatics Journal, Cairo University, March 2011, Volume 12, Issue 1.
[10] S. Elnekave, M. Last, and O. Maimon, "Predicting Future Locations Using Clusters' Centroids", in: Proceedings of the 15th Annual ACM International Symposium on Advances in Geographic Information Systems (ACM GIS '07), November 7-9, 2007, Seattle, WA, USA.
[11] Z. Li, M. Ji, J.-G. Lee, L. A. Tang, Y. Yu, J. Han, and R. Kays, "MoveMine: Mining Moving Object Databases", in: Proceedings of SIGMOD '10, ACM SIGMOD International Conference on Management of Data, June 6-11, 2010, Indianapolis, Indiana, USA.
[12] Z. Li, J. Han, M. Ji, L. Tang, Y. Yu, and B. Ding, "MoveMine: Mining Moving Object Data for Discovery of Animal Movement Patterns", ACM Transactions on Intelligent Systems and Technology (TIST), Volume 2, Issue 4, July 2011, Article 37, ACM New York, NY, USA.
[13] [...]
More informationNormalization based K means Clustering Algorithm
Normalization based K means Clustering Algorithm Deepali Virmani 1,Shweta Taneja 2,Geetika Malhotra 3 1 Department of Computer Science,Bhagwan Parshuram Institute of Technology,New Delhi Email:deepalivirmani@gmail.com
More informationCS570: Introduction to Data Mining
CS570: Introduction to Data Mining Cluster Analysis Reading: Chapter 10.4, 10.6, 11.1.3 Han, Chapter 8.4,8.5,9.2.2, 9.3 Tan Anca Doloc-Mihu, Ph.D. Slides courtesy of Li Xiong, Ph.D., 2011 Han, Kamber &
More informationMachine Learning (BSMC-GA 4439) Wenke Liu
Machine Learning (BSMC-GA 4439) Wenke Liu 01-25-2018 Outline Background Defining proximity Clustering methods Determining number of clusters Other approaches Cluster analysis as unsupervised Learning Unsupervised
More informationDetect tracking behavior among trajectory data
Detect tracking behavior among trajectory data Jianqiu Xu, Jiangang Zhou Nanjing University of Aeronautics and Astronautics, China, jianqiu@nuaa.edu.cn, jiangangzhou@nuaa.edu.cn Abstract. Due to the continuing
More informationCSE 5243 INTRO. TO DATA MINING
CSE 5243 INTRO. TO DATA MINING Cluster Analysis: Basic Concepts and Methods Huan Sun, CSE@The Ohio State University Slides adapted from UIUC CS412, Fall 2017, by Prof. Jiawei Han 2 Chapter 10. Cluster
More informationIntroduction to Trajectory Clustering. By YONGLI ZHANG
Introduction to Trajectory Clustering By YONGLI ZHANG Outline 1. Problem Definition 2. Clustering Methods for Trajectory data 3. Model-based Trajectory Clustering 4. Applications 5. Conclusions 1 Problem
More informationSpatial Outlier Detection
Spatial Outlier Detection Chang-Tien Lu Department of Computer Science Northern Virginia Center Virginia Tech Joint work with Dechang Chen, Yufeng Kou, Jiang Zhao 1 Spatial Outlier A spatial data point