Flock by Leader: A Novel Machine Learning Biologically Inspired Clustering Algorithm

Abdelghani Bellaachia 1, Anasse Bari 1

1 The George Washington University, School of Engineering and Applied Sciences, Computer Science Department, nd Street NW, Washington DC 20052, USA
{bell, bari}@gwu.edu

Abstract. A Nature research report published in April 2010 announced that biological physicists had only very recently discovered a leadership pattern in flocks of pigeons. The most authoritative birds of a pigeon flock take the lead, and the followers follow the leaders' directions. The pigeon leaders' roles vary over time. Following this unprecedented discovery, made by zoologists at the University of Oxford and Eötvös University, we extend in this paper the flocking model widely used in computer science. We define a new biologically inspired clustering algorithm, entitled FlockbyLeader, that detects hierarchical leaders, discovers their followers, and enables them to flock based on local proximity in an artificial virtual space to create clusters. We offer empirical evidence that the algorithm outperforms both the existing flocking algorithm and the K-means algorithm. We analyze the performance of the algorithm on datasets widely used in the literature.

Keywords: Swarm Intelligence, Information Retrieval, Machine Learning, Data Mining, Social Networks Analysis, Bioinformatics.

1 Introduction

The long-standing mystery behind the phenomenon of flocking birds, and the suspected existence of leaders orchestrating the flock's logistics, has finally been resolved. Biological physicists from Oxford University and Eötvös University's Department of Zoology found that flying pigeons flock following an organized chain of instructions. The recently published Nature research report [1] revealed that GPS loggers fitted into backpacks carried by flocks of pigeons allowed bird scientists to find hierarchies within flocks.
It is now confirmed that certain flock members are authoritative over other birds. Dr. Biro of Oxford University's Department of Zoology states [1]: "We found that, whilst most birds have a say in decision-making, a flexible system of 'rank' ensures that some birds are more likely to lead and others to follow."

In computer science, flocking behavior is also known as Swarm Intelligence. Swarm Intelligence is the property of a system in which the collective behaviors of unsophisticated agents interacting locally with their environment cause coherent functional global patterns to emerge [3]. Flocking modeling was initially introduced by Craig Reynolds [5]. Reynolds termed the generic simulated flocking birds "boids". The behavior of each bird (boid) is described by three simple rules: separation, cohesion, and alignment. Separation allows a boid to keep a certain distance from its nearest flock-mates, cohesion permits a boid to join a local flock, and alignment enables a boid to move towards the average heading of its local flock-mates. Applications in which flocking modeling has been used successfully include, but are not limited to, robotics and computer animation. Cui et al. [2] were among the first researchers in the Swarm Intelligence literature to apply flocking behavior to information retrieval. More recently, Bellaachia and Bari were the first in the Swarm Intelligence literature to introduce a flocking-based framework for community detection in dynamic social networks, where a social network is modeled as an artificial life [7].

1 Corresponding author.

2 Motivation

The flocking model, also known as the Craig Reynolds model [5] and introduced in 1987, lacks an important component described in the introduction: leadership in flock dynamics. The flocking clustering algorithm used in machine learning [2], [7], [13] relies on pair-wise proximity to find similar data points. The recent discovery mentioned in the introduction suggests mining leaders within the dataset. Instead of a one-to-one proximity test to discover similar data points, the algorithm then performs a leader-to-many proximity test by detecting local leaders and followers that form subflocks. The existing algorithm in the literature relies on a set of predefined heuristics that can significantly affect the clustering results [13].
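The three boid rules and the pair-wise proximity test of the existing flocking model can be illustrated with a short sketch. This is a minimal illustration, not the implementation used in [2] or [13]: the function name `boid_step`, the equal weighting of the three rules, and the default values of `d_max` and `d_min` are all assumptions made for the example.

```python
import math

def boid_step(i, positions, velocities, d_max=5.0, d_min=1.0):
    """Return a new velocity for boid i from separation, cohesion, alignment.

    Hypothetical helper: the rule weights (all 1) and the thresholds are
    assumptions chosen for illustration only.
    """
    px, py = positions[i]
    vx, vy = velocities[i]
    # Pair-wise proximity: every other boid within d_max is a neighbor.
    neighbors = [j for j in range(len(positions))
                 if j != i and math.dist(positions[i], positions[j]) <= d_max]
    if not neighbors:
        return (vx, vy)
    n = len(neighbors)
    # Cohesion: steer toward the center of the local flock.
    cx = sum(positions[j][0] for j in neighbors) / n - px
    cy = sum(positions[j][1] for j in neighbors) / n - py
    # Alignment: steer toward the average heading of the local flock.
    ax = sum(velocities[j][0] for j in neighbors) / n - vx
    ay = sum(velocities[j][1] for j in neighbors) / n - vy
    # Separation: move away from neighbors that are closer than d_min.
    too_close = [j for j in neighbors
                 if math.dist(positions[i], positions[j]) < d_min]
    sx = sum(px - positions[j][0] for j in too_close)
    sy = sum(py - positions[j][1] for j in too_close)
    return (vx + cx + ax + sx, vy + cy + ay + sy)

positions = [(0.0, 0.0), (2.0, 0.0), (100.0, 100.0)]
velocities = [(0.0, 0.0), (1.0, 0.0), (0.0, 0.0)]
new_velocity = boid_step(0, positions, velocities)
```

Because every boid scans every other boid against the same fixed `d_max`, the cost per iteration grows with the square of the number of agents, which is the behavior the leader-based model discussed in this paper aims to reduce.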
Our proposed algorithm is motivated by the following open questions about the existing flocking algorithm: (1) Is it possible to minimize the number of moves of agents (birds) and still maintain relatively good clustering results? (2) Is it possible to make the algorithm parameter-free and turn the maximum distance ($d_{max}$) into a dynamic, adaptive threshold? In this paper, we incorporate the recently discovered leadership dynamics in pigeon flocks into the existing flocking model, and we introduce a new biologically inspired algorithm based on the extended model.

The rest of this paper is structured as follows. The third section presents a formal definition of a Swarm Clustering Framework, which serves as a clustering platform for several data mining applications that we have recently tackled in our research, including but not limited to microarray bioinformatics [8], social network analysis [7], and information retrieval [2]. The fourth section introduces the Flock by Leader algorithm; the fifth section presents the experimental results; and the last section provides the conclusion.

3 Swarm Clustering Framework

We define a multidisciplinary data mining framework that can be used in different clustering applications such as [8], [7] and [2]. We present the fundamental components that constitute a Swarm Clustering Framework, under which the FlockbyLeader algorithm will be defined in the next section.

A swarm network can be modeled using algebraic graph theory. Formally, a graph $G = (V, E)$ consists of a set of vertices $V$ and a set of edges $E$ containing unordered pairs of distinct vertices. The graph has no self-loops and is undirected if $(v_i, v_j) \in E \Leftrightarrow (v_j, v_i) \in E$. The scalar $|V|$ is referred to as the order of the graph, and $|E|$ is referred to as the size of $G$. Let $X$ be a set of heterogeneous data points to be clustered. We define a swarm clustering framework that consists of four main components: (0) Swarm Metric Space, (1) Swarm Virtual Space, (2) Agents Position Graph, and (3) Feature Similarity Graph. Consider the following definitions.

Definition 1 (Swarm Metric Space $M$). A Swarm Metric Space is a metric space $M = (X, \rho)$ that consists of a set $X$ and a distance function $\rho$ that satisfies three properties of a metric: reflexivity, symmetry, and the triangle inequality. An instance of a Swarm Metric Space is as follows: $\rho$ defined as the Euclidean distance in a $d$-dimensional space:

$$\rho(x, y) = \sqrt{\sum_{k=1}^{d} (x_k - y_k)^2} \qquad (1)$$

The swarm metric space is instantiated and defined by the user depending on the application, as Figure 1 shows.

Definition 2 (Swarm Virtual Space $V_S$). The Swarm Virtual Space of a set $X$ is the Euclidean 2-dimensional space where the $n$ data points are initially deployed at random. We refer to those points as agents. Every data point in $X$ is uniquely indexed by an agent. Agents in the virtual space move according to the flocking clustering algorithm that will be defined in the next sections. Let $d_{min}$ be the minimum distance that an agent must keep to avoid collision with other agents in the virtual space. The swarm virtual space serves as a simplified visualization of the clusters in a 2-dimensional space.

Definition 3 (Agents Position Graph $G_P$). The agents position graph, denoted as $G_P$, is a weighted graph that consists of the set of vertices $V$ and the set of edges $E_P$. Let $A_P$ be the adjacency matrix of $G_P$. $A_P$ is a matrix of size $n \times n$ such that:

$$A_P(i, j) = \lVert p_i - p_j \rVert \qquad (2)$$

where $p_i$ and $p_j$ are the position vectors of agents $i$ and $j$ in the swarm virtual space. The scalar $A_P(i, j)$ represents the distance between agent $i$ and agent $j$ in the swarm virtual space. The agents position graph maintains the positions of the agents in the virtual space at every step of the algorithm and will be used to extract the topology of the clusters generated by the algorithm.

Definition 4 (Feature Similarity Graph $G_F$). The feature similarity graph maintains the similarity between the entities involved in the clustering process. We define the feature similarity graph, denoted as $G_F$, to be the weighted graph that consists of the set of vertices $V$ and the set of edges $E_F$. Let $A_F$ be the adjacency matrix of $G_F$. $A_F$ is a matrix of size $n \times n$ such that:

$$A_F(i, j) = \rho(x_i, x_j)$$

where $x_i$ and $x_j$ are the feature vectors of nodes $i$ and $j$. The scalar $\rho$ represents the metric that defines the similarity between nodes $i$ and $j$. The feature similarity graph drives the movements of the agents in the virtual space.

We define a Swarm Clustering Framework, denoted as $\mathcal{F}$, to be the quadruple $\mathcal{F} = (M, G_P, G_F, V_S)$ that consists of a metric space, a position graph, a feature similarity graph, and a swarm virtual space. A flocking algorithm under the framework $\mathcal{F}$ is a graph transformation process that transforms an ambiguous structure of heterogeneous entities into a partitioned structure.

4 Flock by Leader Clustering Algorithm

We present in this section the Flock by Leader clustering algorithm as an extension of the flocking algorithm described in [2] and [13]. For a better understanding of the work presented in this paper, we refer the reader to the summary of the flocking model, known as the Reynolds model, presented in [2] and [13].

4.1 Enhanced Flocking Model

The enhanced Reynolds model we introduce in this section aims to (1) minimize the moves of the agents in the virtual space; and (2) make the process parameter-free in terms of both the number of iterations and the predefined maximum distance. The enhanced model uses the same flocking rules as Reynolds. However, instead of processing every boid and finding its neighbors, the enhanced model analyzes the data and discovers potential leaders.
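Before the leader-based model is developed further, the two graphs of the framework in Section 3 can be made concrete. The sketch below builds both adjacency matrices as plain nested lists, assuming the Euclidean instance of the swarm metric for the feature graph. The helper name `distance_matrix` and the sample vectors are hypothetical and chosen only for illustration.

```python
import math

def distance_matrix(vectors):
    """Symmetric n x n matrix of Euclidean distances between the vectors."""
    n = len(vectors)
    return [[math.dist(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]

# Feature vectors of three data points (assumed sample values).
features = [(5.1, 3.5, 1.4, 0.2), (4.9, 3.0, 1.4, 0.2), (6.3, 3.3, 6.0, 2.5)]
# Positions of the three corresponding agents in the 2-D virtual space.
positions = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]

feature_graph = distance_matrix(features)    # drives the agents' movements
position_graph = distance_matrix(positions)  # records where the agents are
```

In this sketch the feature matrix stays fixed while the position matrix would be recomputed after every move of the agents, which mirrors the roles the two graphs play in the framework.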
For every leader, the enhanced model finds the corresponding followers, which flock under their leader's direction. In the Reynolds model, the maximum distance is predefined and assigned to all boids. Boids within the maximum distance from a boid are considered its neighbors. In the enhanced model, the maximum distance is relative to the leader. In Figure 1 (right), leader (a) and leader (b) have different distances that define their local neighbors. This observation is inspired by pigeon leadership dynamics, where the leaders' distances differ from one leader to another. In the next sections we explain how to find leaders and associate a maximum distance with them. In Figure 1 (right), a leader and its followers flock following the flocking rules (cohesion, alignment, and separation). The moves are minimized compared with the original model shown in Figure 1 (left): instead of moving every boid to every other neighbor, we migrate the neighbors to their corresponding leaders.

Fig. 1. Reynolds Model (left) and Enhanced Model (right)

Flock by Leader Algorithm

In every iteration, the algorithm starts by finding flock leaders. Then, for every flock leader $L_i$, associated with a distance denoted as $d_{max}(L_i)$, the algorithm finds the leader's corresponding followers. The method that finds leaders and calculates their corresponding $d_{max}(L_i)$ is shown in the next section. Once a leader is identified, its follower agents in the virtual space perform a flocking behavior and follow their leader. The followers are then marked as visited in the feature graph and are excluded from the flocking process in subsequent iterations. The leaders of every subflock are sent back to the virtual space as subflock representatives.

    input:   F, the swarm clustering framework
    returns: G_P, the new position graph

    While there are still nodes in G_F that have not been visited Do
        1.1 LeadersList <- FindFlockLeaders( [...] )
        1.2 For each leader L_i in LeadersList
                For each Agent_j neighbor of L_i within d_max(L_i)
                    AgentFlock(Agent_j, L_i)
                    Agent_j.visited = true
                    Agent_j.leader = L_i
                End for
            End for
            Update( [...] )
    End of do while

Remark 1. An illustrative example of the aerobatics of agents in the virtual space following the FlockbyLeader algorithm. Every input data point in X (the set of data points to be clustered) is uniquely indexed by an agent in the virtual space.
(a) Unvisited (blue) agents are randomly deployed in the virtual space.
(b) Six flock leaders (green) are detected.
(c) Their corresponding followers start flocking under their leaders' direction in accordance with the flocking rules (alignment, separation, cohesion).
(d) Another iteration of the algorithm begins: agent#1 and agent#5, leaders in the previous iteration (c), change roles into followers (yellow), while agent#3 and agent#4, also leaders in (c), become outliers (gray).
(e) The flocking process continues: agent#1 and its followers join the subflock of agent#2 (leader), and agent#5 and its follower join the subflock of agent#6.

Fig. 2. Aerobatics of Agents

It is important to note the following points, as shown in Figure 2. In every iteration a node can be an unvisited node, a leader, a follower, or an outlier. A follower node is set as visited, and its leader serves as a representative in the next iteration. The visited node is excluded from the flocking process in the next iteration. A node that was a leader at iteration $t$ might stay a leader at iteration $t+1$, might become a follower of a more highly ranked leader, or might become an outlier, as explained in the next section. The question that arises is how to distinguish between a leader, a follower, and an outlier. The following section illustrates our approach.

4.2 Mining Flock Leaders as Initial Clusters Centroids

We rely on neighborhood and reverse neighborhood analysis to find potential flock leaders. The analysis is similar to the neighborhood and reverse-neighborhood approach described in [6], [10] and [11]. The main difference is that the notion of neighborhood in the swarm framework is dynamic. During each iteration of the flocking process, every agent's neighborhood changes depending on the flocking behavior of previous iterations. Let X be a dataset to be clustered. Let $d(x_i, x_j)$ be a given distance function between objects $x_i$ and $x_j$. Let the set of $k$ nearest neighbors of $x_i$ at iteration $t$ be denoted by $kNN_t(x_i)$; $x_i$ is a node in the feature graph, and $A_i$ is its corresponding agent deployed in the virtual space. We adopt the definitions from [10] and apply them to the swarm framework as follows:

Definition 5 (Dynamic k-neighborhood, DkNB). The k-neighborhood of $x_i$ at iteration $t$, denoted as $DkNB_t(x_i)$, is the set of data points that lie within a circle with $x_i$ as its center and, as its radius, the distance associated with the leader at iteration $t$, such that [...]

Definition 6 (Dynamic Reverse k-neighborhood, DR-kNB). The reverse k-neighborhood of $x_i$ at iteration $t$, denoted as $DR\text{-}kNB_t(x_i)$, is the set of data points whose $DkNB_t$ sets contain $x_i$.

The ratio $|R\text{-}kNB| / |kNB|$ has been widely used in the neighborhood-based clustering literature [10] to determine which points are dense, even, or sparse. Several factors have been introduced, such as the neighborhood density factor (NDF) and the structural role index (SRI), which was recently introduced in [10] and [11].
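The neighborhood and reverse-neighborhood sets, and the ratio between their sizes, can be sketched for a single iteration as follows. This is a simplified, static illustration: it takes the k-neighborhood to be exactly the k nearest points (ties broken by index) rather than the dynamic, leader-relative definition above, and the function names are hypothetical.

```python
import math

def k_neighborhood(points, i, k):
    """Indices of the k nearest neighbors of point i (ties broken by index)."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: (math.dist(points[i], points[j]), j))
    return set(others[:k])

def reverse_k_neighborhood(points, i, k):
    """Indices of the points whose k-neighborhood contains point i."""
    return {j for j in range(len(points))
            if j != i and i in k_neighborhood(points, j, k)}

def neighborhood_ratio(points, i, k):
    """|R-kNB| / |kNB| for point i; assumes k >= 1 and at least two points."""
    return (len(reverse_k_neighborhood(points, i, k))
            / len(k_neighborhood(points, i, k)))

# Four points on a line: three close together and one far away.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
ratios = [neighborhood_ratio(points, i, k=1) for i in range(len(points))]
```

In this toy example the middle point of the dense group receives the largest ratio and the isolated point receives a ratio of zero, which matches the intuition that points chosen by many others are candidate centroids while points chosen by none are candidate outliers.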
We define the Dynamic Agent Role Factor of an agent $A_i$ at iteration $t$ to be:

[...] (3)

[...] (4)

Intuitively, the centroid of a cluster occupies the center position of a mass of associated data points. The larger the reverse k-neighborhood of a point is, the more objects that point is close to. The initial centroid candidates should therefore have the most reverse k nearest neighbors. Specifically, if [...] then $A_i$ is a flock leader at iteration $t$; otherwise $A_i$ is a follower. If the factor is close to zero, then $A_i$ is an outlier. We extend the agent role factor to introduce a local rank of an agent at iteration $t$:

[...] (5)

where [...] is the number of neighbors at iteration $t$ and [...] is the number of unvisited nodes at iteration $t$. The rank is used to sort the list of leaders. A leader of higher rank is given priority and processed first (finding its followers).

5 Experiments and Results

5.1 Datasets

Two large datasets were used in our experiments. The first dataset consists of real news articles; details about the dataset can be found in [7]. It consists of 100 news articles collected from the Internet, which have been categorized by human experts into 12 clusters. We used the KNIME tool [9] to preprocess the news articles and convert the dataset into the keyword-document matrix that the Flock by Leader algorithm takes as input. The second dataset is the Iris plant dataset. It contains 150 instances from three classes: Iris-virginica-class-1, Iris-versicolor-class-2, and Iris-setosa-class-3. There are fifty instances in each class, and each instance is described by four attributes. Details about the dataset can be found in [12].

5.2 Evaluation Methodology

We use the F-measure as the quality measure. The F-measure combines the information retrieval precision and recall into a single score. Each cluster is considered as if it were the result of a query, and each class as if it were the desired set of documents for a query. We then calculate the recall and precision of that cluster for each given class. The F-measure of cluster $j$ (retrieved) and class $i$ (known) is defined as follows:

$$F(i, j) = \frac{2 \cdot Precision(i, j) \cdot Recall(i, j)}{Precision(i, j) + Recall(i, j)} \qquad (6)$$

5.3 Results

Using the evaluation method described in the previous section, we compare the performance of the FlockbyLeader algorithm against the flocking-based clustering algorithm described in [3] and against K-means. Table 1 reports the results of running the FlockbyLeader algorithm on both the real news articles dataset and the Iris dataset. We compare our results with the results reported in [12] and [2] on the same datasets.
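The per-cluster, per-class F-measure of Section 5.2 can be sketched as below. The sketch assumes the common harmonic-mean form of the F-measure, with precision taken as the fraction of the cluster that belongs to the class and recall as the fraction of the class captured by the cluster. The function name `f_measure` and the sample sets are hypothetical.

```python
def f_measure(cluster, known_class):
    """F-measure of one cluster (retrieved items) against one class (relevant items).

    Both arguments are sets of item identifiers. Assumes the harmonic-mean
    form 2PR / (P + R); returns 0.0 when the two sets do not overlap.
    """
    overlap = len(cluster & known_class)
    if overlap == 0:
        return 0.0
    precision = overlap / len(cluster)      # share of the cluster that is correct
    recall = overlap / len(known_class)     # share of the class that was found
    return 2 * precision * recall / (precision + recall)

cluster = {1, 2, 3, 4}
known_class = {3, 4, 5, 6, 7, 8}
score = f_measure(cluster, known_class)  # precision 1/2, recall 1/3
```

A cluster that coincides exactly with a class scores 1.0, and the score drops as the cluster either absorbs items from other classes (lower precision) or misses items of its own class (lower recall).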
Table 1 shows that FlockbyLeader has the largest F-measure values compared with both the flocking algorithm and K-means. The algorithm needed 4 iterations to converge. This is a significant improvement over the existing flocking algorithm, where the total number of iterations was 300 [2]. The FlockbyLeader algorithm achieved a 98.66% reduction in the number of iterations of the flocking process, a 7.5% increase in precision and recall (F-measure) over the existing flocking algorithm, and an average 16.5% increase in precision and recall over K-means on both datasets. Figure 3 shows snapshots of the virtual space on both datasets at initialization and after running the algorithm.

Table 1. F-measure Evaluation Results.

Dataset         Algorithm       Number of Clusters   F-measure
News Articles   Flocking        [...]                [...]
News Articles   K-means         12 (k=12)            [...]
News Articles   FlockbyLeader   [...]                [...]
Iris Dataset    K-means         3 (k=3)              [...]
Iris Dataset    FlockbyLeader   [...]                [...]

Fig. 3. The process of Running Flock by Leader Algorithm on the IRIS Dataset

7 Conclusion

In this paper we presented a simple, biologically inspired clustering algorithm. FlockbyLeader incorporates a new discovery about pigeons: leadership dynamics. Our algorithm is an enhancement of the existing flocking algorithm, and it outperforms K-means on two large datasets. Our future work will include running the algorithm on different datasets.

8 References

1. M. Nagy, Z. Akos, D. Biro, and T. Vicsek, Hierarchical group dynamics in pigeon flocks, Nature 464 (2010).
2. X. Cui, J. Gao, and T. E. Potok, A Flocking Based Algorithm for Document Clustering Analysis, Journal of System Architecture, June 2006.
3. Vladimir G. Red'ko, Artificial Life Evolutionary Models.
4. E. Bonabeau, M. Dorigo, and G. Theraulaz, Swarm Intelligence: From Natural to Artificial Systems, Oxford University Press.
5. Craig W. Reynolds, Flocks, Herds, and Schools: A Distributed Behavioral Model, Computer Graphics, 21(4), July 1987.
6. S. Zhou, Y. Zhao, J. Guan, and J. Z. Huang, A Neighborhood-Based Clustering Algorithm, in Proc. PAKDD, 2005.
7. A. Bellaachia and A. Bari, SFLOSCAN: A biologically-inspired data mining framework for community identification in dynamic social networks, 2011 IEEE Symposium on Swarm Intelligence (SIS), pp. 1-8, April 2011.
8. A. Bellaachia and A. Bari, A Flocking Based Data Mining Algorithm for Detecting Outliers in Cancer Gene Expression Microarray Data, in Proc. IEEE International Conference on Information Retrieval and Knowledge Management, CAMP12.
9. M. R. Berthold, N. Cebron, F. Dill, T. R. Gabriel, T. Kotter, T. Meinl, P. Ohl, K. Thiel, and B. Wiswedel, KNIME - the Konstanz Information Miner: version 2.0 and beyond, SIGKDD Explor. Newsl., 11(1):26-31.
10. Y. Ye, J. Z. Huang, X. Chen, S. Zhou, G. J. Williams, and X. Xu, Neighborhood Density Method for Selecting Initial Cluster Centers in K-Means Clustering, in Proc. PAKDD, 2006.
11. J. Ding, R. Ma, J. Yang, and S. Chen, A tree-structured framework for purifying "complex" clusters with structural roles of individual data, Pattern Recognition, 2010.
12. F. Guillet, G. Ritschard, D. A. Zighed, and H. Briand (eds), Advances in Knowledge Discovery and Management, Series: Studies in Computational Intelligence, Vol. 292, Berlin: Springer, 2010.
13. A. Bellaachia and X. He, An Artificial Life Based Data Mining Algorithm, Swarm Intelligence IEEE, 2006.


Research and Improvement on K-means Algorithm Based on Large Data Set www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 6 Issue 7 July 2017, Page No. 22145-22150 Index Copernicus value (2015): 58.10 DOI: 10.18535/ijecs/v6i7.40 Research

More information
