Swarm Based Fuzzy Clustering with Partition Validity


Lawrence O. Hall and Parag M. Kanade
Computer Science & Engineering Dept., University of South Florida, Tampa, FL

Abstract - Swarm based approaches to clustering have been shown to be able to skip local extrema by doing a form of global search. We previously reported on the use of a swarm based approach using artificial ants to do fuzzy clustering by optimizing the fuzzy c-means (FCM) criterion. FCM requires that one choose the number of cluster centers (c). If the user of the algorithm is unsure of the number of cluster centers, they can try several different choices and evaluate them with a cluster validity metric. In this work, we use a fuzzy cluster validity metric proposed by Xie and Beni as the criterion for evaluating a partition produced by swarm based clustering. Interestingly, when provided with more clusters than exist in the data, our ant based approach produces a partition with empty clusters and/or very lightly populated clusters. We used two data sets, Iris and an artificially generated data set, to show that optimizing a cluster validity metric with a swarm based approach can effectively indicate how many clusters there are in the data.

I. INTRODUCTION

Clustering unlabeled data requires that an algorithm determine the number of clusters, or that the user of the clustering algorithm be able to approximately guess the number of clusters in the data. In the absence of knowledge about the number of clusters in the data, partition validity metrics [1], [2] are typically applied to determine how good a partition is for a particular number of clusters. Even when the algorithm automatically determines the number of clusters, partition validity metrics remain useful for comparing partitions with different numbers of clusters produced by different algorithms. Swarm based approaches have been used to produce partitions of clusters [3], [4], [5], [6], [7], [8].
In particular, ant based clustering has been used to produce a fuzzy partition [9]. We thought it would potentially be valuable to use an ant based approach to optimize a fuzzy partition validity metric. In [2] the Xie-Beni [1] validity metric was shown to be quite useful in picking out the right number of clusters for a partition, so we chose to optimize this metric. Our hypothesis was that, given a guess of the number of clusters, it would produce a fuzzy partition that was perhaps a little more valid than FCM's for the same number of clusters. It turns out that this approach will determine the number of clusters that exist in the data, if provided with an initial guess of cluster centers that is larger than or equal to the actual number of clusters in the data. We support this observation with results on two data sets, the well-known Iris data set [10] and an artificially generated data set. The Iris data set was where we first observed the phenomenon: we claimed there were three clusters, and the algorithm stubbornly produced two by leaving one cluster empty or nearly empty. In Section 2, we briefly describe the fuzzy c-means clustering algorithm and the Xie-Beni partition validity metric. In Section 3, we discuss ant based clustering using partition validity to evaluate partitions. Section 4 contains experimental results, and Section 5 has the conclusions.

II. FUZZY CLUSTERING AND PARTITION VALIDITY

In [11] the authors proposed a reformulation of the optimization criteria used in a couple of common clustering objective functions. The original approach minimizes the objective function (1) used in fuzzy c-means clustering to find good clusters for a data partition.
J_m(U, \beta, X) = \sum_{i=1}^{c} \sum_{k=1}^{n} U_{ik}^{m} D_{ik}(x_k, \beta_i)    (1)

where
  U_{ik}: membership of the k-th object in the i-th cluster
  \beta_i: the i-th cluster prototype
  m > 1: the degree of fuzzification
  c \geq 2: number of clusters
  n: number of data points
  D_{ik}(x_k, \beta_i): distance of x_k from the i-th cluster center \beta_i

The reformulation replaces the membership matrix U with the necessary conditions which are satisfied by U. In this work, the ants move only cluster centers, and hence we do not want the U matrix in the equation. The reformulated version of J_m is denoted R_m and is given in (2). The function R_m depends only on the cluster prototypes and not on the U matrix, whereas J_m depends on both. The U matrix for the reformulated criterion can be easily computed using (3).
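For reference, Eq. (1) is straightforward to evaluate when U is available. A small NumPy sketch (the function name and array layout are ours, and squared Euclidean distance is assumed for D_ik, as is standard for FCM):

```python
import numpy as np

def fcm_objective(U, centers, X, m=2.0):
    """J_m of Eq. (1): sum over clusters i and points k of
    U_ik^m * D_ik, with D_ik the squared Euclidean distance
    ||x_k - beta_i||^2. U is c x n, centers is c x d, X is n x d."""
    d = ((X[None, :, :] - centers[:, None, :]) ** 2).sum(axis=2)  # c x n
    return ((U ** m) * d).sum()
```

With a crisp U and centers placed exactly on the points, the objective is zero; spreading membership evenly raises it, which is why minimizing J_m pulls centers toward dense regions.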

R_m(\beta, X) = \sum_{k=1}^{n} \left( \sum_{i=1}^{c} D_{ik}(x_k, \beta_i)^{\frac{1}{1-m}} \right)^{1-m}    (2)

U_{ik} = \frac{D_{ik}(x_k, \beta_i)^{\frac{1}{1-m}}}{\sum_{j=1}^{c} D_{jk}(x_k, \beta_j)^{\frac{1}{1-m}}}    (3)

The Xie-Beni partition validity metric can be described as [1]:

XB(\beta, X) = \frac{R_m(\beta, X)}{n \left( \min_{i \neq j} \{ \|\beta_i - \beta_j\|^2 \} \right)}    (4)

It is clearly tied to the FCM functional, with a strong preference for keeping the smallest distance between any two cluster centroids as large as possible. The smallest XB(\beta, X) is considered the best.

III. FUZZY ANTS CLUSTERING ALGORITHM

The ants coordinate to move cluster centers in feature space to search for optimal cluster centers. Initially the feature values are normalized between 0 and 1. Each ant is assigned to a particular feature of a cluster in a partition. The ants never change the feature, cluster or partition assigned to them, as in [6]. After randomly moving the cluster centers for a fixed number of iterations, called an epoch, the quality of the partition is evaluated using the Xie-Beni criterion (4). If the current partition is better than any of the previous partitions in the ant's memory, the ant remembers its location for this partition; otherwise the ant, with a given probability, goes back to a better partition or continues from the current partition. This ensures that the ants do not remember a bad partition and erase a previously known good partition. Even if the ants change good cluster centers to unreasonable ones, they can go back to the good cluster centers, as each ant has a finite memory in which it keeps the currently best known cluster centers. There are two directions for the random movement of an ant: the positive direction, moving in the feature space from 0 to 1, and the negative direction, moving from 1 to 0. If during its random movement an ant reaches the end of the feature space, it reverses direction. After a fixed number of epochs the ants stop.
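As a concrete illustration, here is a minimal NumPy sketch of the reformulated criterion R_m (Eq. 2) and the Xie-Beni index (Eq. 4). The function names are ours, and squared Euclidean distance is assumed for D_ik:

```python
import numpy as np

def reformulated_criterion(X, centers, m=2.0, eps=1e-12):
    """R_m of Eq. (2): sum_k ( sum_i D_ik^(1/(1-m)) )^(1-m),
    with D_ik = ||x_k - beta_i||^2 (squared Euclidean)."""
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)  # n x c
    d = np.maximum(d, eps)  # guard against a point sitting exactly on a center
    inner = (d ** (1.0 / (1.0 - m))).sum(axis=1)  # per-point sum over clusters
    return (inner ** (1.0 - m)).sum()

def xie_beni(X, centers, m=2.0):
    """XB of Eq. (4): R_m divided by n times the minimum squared
    separation between any two cluster centers (smaller is better)."""
    n, c = X.shape[0], centers.shape[0]
    sep = min(((centers[i] - centers[j]) ** 2).sum()
              for i in range(c) for j in range(c) if i != j)
    return reformulated_criterion(X, centers, m) / (n * sep)
```

Note how the denominator penalizes partitions whose centers crowd together: two nearly coincident centroids drive `sep` toward zero and XB toward infinity, which is exactly what lets the ants "empty out" superfluous clusters.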
Each ant has a memory of the mem (5 here) best locations for the feature of the particular cluster of the particular partition that it is moving. An ant has a chance to move I times before an evaluation is made (an epoch). It can move a random distance between D_min and D_max. It has a probability of resting, P_rest (not moving for an epoch), and a probability of continuing in the same direction as it was moving at the start of the epoch, P_continue. At the end of an epoch in which it did not find a position better than any in memory, it continues with probability P_ContinueCurrent. Otherwise there is a fixed set of probabilities determining which of the best locations in memory search should be resumed from for the next epoch [6]: with probability 0.6 the ant goes back to the best known partition, with probability 0.2 to the second best known partition, with probability 0.1 to the third best, and with small remaining probabilities to the fourth best and to the worst (fifth) of the known partitions. Since objects' memberships in clusters are not explicitly evaluated at each step, there can be cluster centroids placed in feature space such that no object is closer to them than to other centroids. These are empty clusters and indicate that there are fewer true clusters than estimated, as will be shown in what follows. There may also exist clusters with one, two or very few examples assigned to them; these are likely spurious if we expect approximately equal size clusters with sizes larger than some threshold, say thirty.
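The end-of-epoch bookkeeping described above can be sketched as follows. This is a simplified single-ant view: the dictionary layout and function name are ours, and the fall-back weights for the fourth and fifth remembered positions are placeholders (the text gives 0.6, 0.2 and 0.1 for the first three; the remaining two values did not survive in the source):

```python
import random

def end_of_epoch(ant, new_quality):
    """End-of-epoch bookkeeping for one ant: remember a good position,
    or fall back to a remembered one (lower Xie-Beni value is better)."""
    memory = ant["memory"]  # list of (quality, position), best (lowest) first
    if len(memory) < 5 or new_quality < memory[-1][0]:
        # The new position beats something in memory (or memory has room):
        # remember it and keep only the mem = 5 best locations.
        memory.append((new_quality, ant["position"]))
        memory.sort(key=lambda qp: qp[0])
        del memory[5:]
        return
    if random.random() < ant["p_continue_current"]:  # P_ContinueCurrent = 0.20
        return  # keep searching from the current position
    # Otherwise resume from a remembered position: 0.6 for the best, 0.2 for
    # the second, 0.1 for the third (from the text); 0.06/0.04 over the
    # fourth and fifth are placeholder values, not from the source.
    weights = [0.6, 0.2, 0.1, 0.06, 0.04][:len(memory)]
    _, ant["position"] = random.choices(memory, weights=weights)[0]
```

The finite sorted memory is what prevents a run of bad random moves from erasing a previously found good partition.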
However, one of the classes is clearly linearly separable from the other two, and many partition validity metrics will prefer a partition with two classes. Figure 1 shows a two-dimensional projection onto the first 2 principal components of the Iris data, which has been normalized so all feature values are between 0 and 1. The first 2 principal components are strongly correlated with the petal length and petal width features. For this data set, a reasonable argument may be made for two or three clusters.

Fig. 1. Iris data set (normalized): first 2 principal components.

The artificial dataset had 2 attributes, 5 classes and 1000 examples. It was generated using a Gaussian distribution and is shown in Figure 2. The classes are slightly unequally sized [12] (248, 132, 217, 192 and 211 examples, respectively).

A. Experimental parameters

The parameters used in the experiments are shown in Table I. Essentially, 30 different partitions were utilized in each epoch. As there is significant randomness in the process, each experiment was run 30 times. Each experiment was tried with the known number of clusters or more. For the Iris data set, we also tried two classes

Fig. 2. Gauss-1 dataset (normalized).

TABLE I
PARAMETER VALUES

  Parameter              Value
  Number of ants         30 partitions
  Memory per ant         5
  Iterations per epoch   50
  Epochs                 1000
  P_rest                 0.01
  P_continue             0.75
  P_ContinueCurrent      0.20
  D_min
  D_max                  0.01
  m                      2

TABLE II
NUMBER OF CLUSTERS SEARCHED FOR AND AVERAGE NUMBER FOUND FOR THE IRIS DATA, WITH THE MINIMUM P, OVER 30 TRIALS

  Clusters searched | Ave. clusters found | P

because in feature space an argument can be made for this number of classes.

B. Results

We will look at the results from the Iris data set first. When we tried to cluster into three classes, a partition with 50 examples from class 1 and 100 examples from class 2/class 3 was found 10 of 30 times. In the remaining trials, a cluster with one example was found four times, and in the other experiments the cluster with class 1 had a few examples from another class. So, the results seem to clearly indicate that there are two classes. However, we wanted a repeatable method that could objectively determine how many classes existed. We used a threshold on the number of examples in a cluster. The FCM functional has a bias towards producing approximately equal size clusters; it is not the right functional to use for widely different sized clusters. Hence, we used a threshold which was a percentage of the number of examples each cluster would contain if all clusters were the same size. If a cluster had fewer examples than the threshold, it indicated that there was no cluster there and that it should be merged with another. We did not, in these experiments, try to merge the clusters. The equation is

T = nP / c,    (5)

where n is the number of examples, c is the number of clusters searched for and P is the percentage. Any percentage of 2 or greater will lead to the conclusion that there are only 2 clusters in the Iris data when we search for 3. Results are summarized for different c in Table II. Next, we searched for four clusters in the Iris data.
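The threshold test of Eq. (5) amounts to a few lines of code. A sketch with our own helper name, treating P as a fraction of the equal-share cluster size:

```python
def surviving_clusters(cluster_sizes, P):
    """Apply T = n * P / c (Eq. 5): a cluster with fewer than T examples
    is treated as spurious; the survivors estimate the true number of
    clusters in the data."""
    n = sum(cluster_sizes)           # total number of examples
    c = len(cluster_sizes)           # number of clusters searched for
    T = n * P / c                    # minimum size for a "real" cluster
    return sum(size >= T for size in cluster_sizes)
```

For instance, for an Iris run searching for three clusters with resulting sizes (50, 100, 0), P = 0.9 gives T = 45, so two clusters survive, matching the conclusion drawn in the text.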
A partition with 50 examples from class 1 and the other two classes perfectly mixed occurred three times. There was always one empty cluster, and the largest spurious cluster size was 9 in the cases where three clusters were found. So, any threshold above 30 percent will lead to the conclusion that there are only two clusters. With five clusters there were typically two or three empty clusters, and the perfect partition into two clusters occurred twice. If a percentage of 90 or above is used, the conclusion will be that two clusters exist. This search space is significantly larger and no more epochs were utilized, so we feel the result is a strong one. We also tried six clusters, where there were typically two or three empty clusters. In this case, with a percentage of 90 or above, the average number of classes was 2.5. There were a number of cases in which the linearly separable class would get discovered as one cluster and the other two classes would be split into two (maybe 67/33 or 74/26, for example). Again, in this large search space this seems to be a very reasonable result, and one would probably not guess double the number of actual classes. In order to evaluate whether a more complete search might result in the discovery of 2 clusters more often when we initially searched for 6, we changed the number of epochs to 4000 and the number of iterations per epoch to 25. This causes the ants to move less during an epoch and have more opportunities (epochs) to find good partitions. With these parameters and a percentage of 90, just 2 clusters were found in all thirty trials. The examples in the linearly separable class were assigned, by themselves, to one cluster nine times. Finally, we report the results when searching for only 2 clusters. In this case, two clusters were always found (for P < 0.65). In 14/30 trials a partition with the linearly separable class and the other two classes mixed was found.
In the other experiments a few examples were assigned with the linearly separable class, making its size between 51 and 54. So, a very reasonable partition was obtained when searching for two classes. For the artificial data we did experiments with 5, 6, 7, 8 and 9 clusters. Results are summarized for different c in Table III. The ant based clustering always found five clusters when it was given five to search for. In fact, it found the exact original partition 15 times. When it was incorrect, it had some small

confusion between class two and class five.

TABLE III
NUMBER OF CLUSTERS SEARCHED FOR AND AVERAGE NUMBER FOUND FOR THE ARTIFICIAL DATA, WITH THE MINIMUM P, OVER 30 TRIALS

  Clusters searched | Ave. clusters found | P

A typical partition that did not match the original was (248, 133, 217, 192, 210), in which one example had switched between class 2 and class 5. This seems to be a pretty reasonable clustering result given the larger search space of the ants. When it searched for six classes, it always found five for a percentage of 30 or greater. The sixth cluster typically had between zero and two examples assigned to it. When searching for seven classes, it found five classes for a percentage of 30 or greater 29 times. One time it found six classes; in that case there was an empty cluster, and class 4 was split into two clusters. For eight classes, exactly five were found over a range of percentages; making the percentage larger would occasionally cause 4 to be found when cluster 5 was split exactly into 2 chunks. For nine classes, five classes were always found for percentages from 80 up to about 90. There might be two or three empty clusters, and the other non-clusters were very lightly populated, with fewer than 15 examples closest to their centroid in the usual case. As the percentage got too high, it would cause a class split into two to occasionally be missed, resulting in four clusters. For example, with P = 1, Eq. (5) gives T = 1000/9, about 111, and class 4 is split into two clusters with 107 and 86 examples, respectively.

V. SUMMARY AND DISCUSSION

A swarm based approach to clustering is used to optimize a fuzzy partition validity metric. A group of ants was assigned as a team to produce a partition of the data by positioning cluster centroids. Each ant was assigned to a particular feature of a particular cluster in a particular partition. The assignment was fixed. The ants utilized memory to keep track of the best locations they had visited. Thirty partitions were simultaneously explored.
It was found that an overestimate of the number of clusters that exist in the data would result in a best partition with the optimal number of clusters. An overestimate of the number of clusters allows the ant based algorithm the freedom to make groups of two or more clusters share approximately the same centroid, thereby reducing the total number of clusters in a partition. The ability to choose a smaller set of clusters than initially hypothesized allows for a better optimized value of the partition validity function. After minimal post-processing to remove spurious clusters, the natural substructure of the data, in terms of clusters, was discovered. The Xie-Beni fuzzy clustering validity metric was used to evaluate the goodness of each partition. It is based on the fuzzy c-means clustering algorithm; a minor modification was made to it so that a membership matrix did not need to be computed. A threshold was applied to cluster size to eliminate very small clusters, which would not be discovered utilizing the FCM functional, which has a strong bias towards approximately equal size clusters. Small clusters here mean clusters of 1 to 20 elements, or less than 40% of the expected size of a class (given that we knew the approximate class size). Two data sets, the Iris data and a five cluster artificial data set, were used to evaluate the approach. For both data sets, the number of clusters in the feature space describing the data set was discovered, even when guessing there were more than twice as many clusters as in the original data set. Open questions remain about how to set the threshold that indicates a cluster is spurious (too small to be real), and about what to do with spurious clusters. They could certainly be merged into the closest non-spurious cluster. Alternatively, if the threshold is too high, a cluster that is split into two or more chunks could be left undiscovered, as all of its sub-clusters could be deemed spurious.
The search can be parallelized to make it significantly faster: each ant can certainly move independently. The final partitions produced by the swarm based clustering algorithm typically matched, or were quite close to, what would be obtained from FCM with the same number of cluster centers, and matched the actual data quite well. Hence, it is a promising approach for finding a partition with the number of clusters actually resident in the data, as long as some heuristic overestimate of the cluster number can be made.

ACKNOWLEDGEMENTS

This research was partially supported by The National Institutes of Health via a bioengineering research partnership under grant number 1 R01 EB

REFERENCES

[1] X. Xie and G. Beni, "A validity measure for fuzzy clustering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 8, pp. 841-847, 1991.
[2] N. Pal and J. Bezdek, "On cluster validity for the fuzzy c-means model," IEEE Transactions on Fuzzy Systems, vol. 3, no. 3, pp. 370-379, 1995.
[3] S. Ouadfel and M. Batouche, "Unsupervised image segmentation using a colony of cooperating ants," in Biologically Motivated Computer Vision, Second International Workshop, BMCV 2002, Lecture Notes in Computer Science, vol. 2525, 2002.
[4] N. Labroche, N. Monmarché, and G. Venturini, "A new clustering algorithm based on the chemical recognition system of ants," in Proceedings of the European Conference on Artificial Intelligence, 2002.
[5] N. Monmarché, M. Slimane, and G. Venturini, "On improving clustering in numerical databases with artificial ants," in 5th European Conference on Artificial Life (ECAL 99), Lecture Notes in Artificial Intelligence, D. Floreano, J. Nicoud, and F. Mondada, Eds. Lausanne, Switzerland: Springer-Verlag, Sep. 1999.
[6] P. M. Kanade and L. O. Hall, "Fuzzy ants clustering with centroids," in Proc. FUZZ-IEEE'04, 2004.
[7] J. Handl, J. Knowles, and M. Dorigo, "On the performance of ant-based clustering: Design and application of hybrid intelligent systems," in Frontiers in Artificial Intelligence and Applications, vol. 104, 2003.
[8] J. Handl, J. Knowles, and M. Dorigo, "Strategies for the increased robustness of ant-based clustering," in Self-Organising Applications: Issues, Challenges and Trends, LNCS, vol. 2977, 2003.
[9] J. Bezdek, J. Keller, R. Krishnapuram, and N. Pal, Fuzzy Models and Algorithms for Pattern Recognition and Image Processing. Boston, MA: Kluwer, 1999.
[10] C. Blake and C. Merz, "UCI repository of machine learning databases." [Online]. Available: mlearn/mlrepository.html
[11] R. J. Hathaway and J. C. Bezdek, "Optimization of clustering criteria by reformulation," IEEE Transactions on Fuzzy Systems, vol. 3, no. 2, pp. 241-245, May 1995.
[12] P. Kanade, "Fuzzy ants as a clustering concept," Master's thesis, University of South Florida, Tampa, FL, 2004.


BRACE: A Paradigm For the Discretization of Continuously Valued Data Proceedings of the Seventh Florida Artificial Intelligence Research Symposium, pp. 7-2, 994 BRACE: A Paradigm For the Discretization of Continuously Valued Data Dan Ventura Tony R. Martinez Computer Science

More information

The Application of K-medoids and PAM to the Clustering of Rules

The Application of K-medoids and PAM to the Clustering of Rules The Application of K-medoids and PAM to the Clustering of Rules A. P. Reynolds, G. Richards, and V. J. Rayward-Smith School of Computing Sciences, University of East Anglia, Norwich Abstract. Earlier research

More information

Automatic Fatigue Detection System

Automatic Fatigue Detection System Automatic Fatigue Detection System T. Tinoco De Rubira, Stanford University December 11, 2009 1 Introduction Fatigue is the cause of a large number of car accidents in the United States. Studies done by

More information

Performance Measure of Hard c-means,fuzzy c-means and Alternative c-means Algorithms

Performance Measure of Hard c-means,fuzzy c-means and Alternative c-means Algorithms Performance Measure of Hard c-means,fuzzy c-means and Alternative c-means Algorithms Binoda Nand Prasad*, Mohit Rathore**, Geeta Gupta***, Tarandeep Singh**** *Guru Gobind Singh Indraprastha University,

More information

Efficient Pairwise Classification

Efficient Pairwise Classification Efficient Pairwise Classification Sang-Hyeun Park and Johannes Fürnkranz TU Darmstadt, Knowledge Engineering Group, D-64289 Darmstadt, Germany Abstract. Pairwise classification is a class binarization

More information

Multi-Modal Data Fusion: A Description

Multi-Modal Data Fusion: A Description Multi-Modal Data Fusion: A Description Sarah Coppock and Lawrence J. Mazlack ECECS Department University of Cincinnati Cincinnati, Ohio 45221-0030 USA {coppocs,mazlack}@uc.edu Abstract. Clustering groups

More information

An Efficient Analysis for High Dimensional Dataset Using K-Means Hybridization with Ant Colony Optimization Algorithm

An Efficient Analysis for High Dimensional Dataset Using K-Means Hybridization with Ant Colony Optimization Algorithm An Efficient Analysis for High Dimensional Dataset Using K-Means Hybridization with Ant Colony Optimization Algorithm Prabha S. 1, Arun Prabha K. 2 1 Research Scholar, Department of Computer Science, Vellalar

More information

Design and Performance Improvements for Fault Detection in Tightly-Coupled Multi-Robot Team Tasks

Design and Performance Improvements for Fault Detection in Tightly-Coupled Multi-Robot Team Tasks Design and Performance Improvements for Fault Detection in Tightly-Coupled Multi-Robot Team Tasks Xingyan Li and Lynne E. Parker Distributed Intelligence Laboratory, Department of Electrical Engineering

More information

Clustering. Informal goal. General types of clustering. Applications: Clustering in information search and analysis. Example applications in search

Clustering. Informal goal. General types of clustering. Applications: Clustering in information search and analysis. Example applications in search Informal goal Clustering Given set of objects and measure of similarity between them, group similar objects together What mean by similar? What is good grouping? Computation time / quality tradeoff 1 2

More information

EE368 Project Report CD Cover Recognition Using Modified SIFT Algorithm

EE368 Project Report CD Cover Recognition Using Modified SIFT Algorithm EE368 Project Report CD Cover Recognition Using Modified SIFT Algorithm Group 1: Mina A. Makar Stanford University mamakar@stanford.edu Abstract In this report, we investigate the application of the Scale-Invariant

More information

AN IMPROVED HYBRIDIZED K- MEANS CLUSTERING ALGORITHM (IHKMCA) FOR HIGHDIMENSIONAL DATASET & IT S PERFORMANCE ANALYSIS

AN IMPROVED HYBRIDIZED K- MEANS CLUSTERING ALGORITHM (IHKMCA) FOR HIGHDIMENSIONAL DATASET & IT S PERFORMANCE ANALYSIS AN IMPROVED HYBRIDIZED K- MEANS CLUSTERING ALGORITHM (IHKMCA) FOR HIGHDIMENSIONAL DATASET & IT S PERFORMANCE ANALYSIS H.S Behera Department of Computer Science and Engineering, Veer Surendra Sai University

More information

Accelerating Unique Strategy for Centroid Priming in K-Means Clustering

Accelerating Unique Strategy for Centroid Priming in K-Means Clustering IJIRST International Journal for Innovative Research in Science & Technology Volume 3 Issue 07 December 2016 ISSN (online): 2349-6010 Accelerating Unique Strategy for Centroid Priming in K-Means Clustering

More information

Information Retrieval and Web Search Engines

Information Retrieval and Web Search Engines Information Retrieval and Web Search Engines Lecture 7: Document Clustering December 4th, 2014 Wolf-Tilo Balke and José Pinto Institut für Informationssysteme Technische Universität Braunschweig The Cluster

More information

A Study on Clustering Method by Self-Organizing Map and Information Criteria

A Study on Clustering Method by Self-Organizing Map and Information Criteria A Study on Clustering Method by Self-Organizing Map and Information Criteria Satoru Kato, Tadashi Horiuchi,andYoshioItoh Matsue College of Technology, 4-4 Nishi-ikuma, Matsue, Shimane 90-88, JAPAN, kato@matsue-ct.ac.jp

More information

Fuzzy clustering with volume prototypes and adaptive cluster merging

Fuzzy clustering with volume prototypes and adaptive cluster merging Fuzzy clustering with volume prototypes and adaptive cluster merging Kaymak, U and Setnes, M http://dx.doi.org/10.1109/tfuzz.2002.805901 Title Authors Type URL Fuzzy clustering with volume prototypes and

More information

An adjustable p-exponential clustering algorithm

An adjustable p-exponential clustering algorithm An adjustable p-exponential clustering algorithm Valmir Macario 12 and Francisco de A. T. de Carvalho 2 1- Universidade Federal Rural de Pernambuco - Deinfo Rua Dom Manoel de Medeiros, s/n - Campus Dois

More information

Fuzzy C-means Clustering with Temporal-based Membership Function

Fuzzy C-means Clustering with Temporal-based Membership Function Indian Journal of Science and Technology, Vol (S()), DOI:./ijst//viS/, December ISSN (Print) : - ISSN (Online) : - Fuzzy C-means Clustering with Temporal-based Membership Function Aseel Mousa * and Yuhanis

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervised Learning and Clustering Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2009 CS 551, Spring 2009 c 2009, Selim Aksoy (Bilkent University)

More information

1. Introduction. 2. Motivation and Problem Definition. Volume 8 Issue 2, February Susmita Mohapatra

1. Introduction. 2. Motivation and Problem Definition. Volume 8 Issue 2, February Susmita Mohapatra Pattern Recall Analysis of the Hopfield Neural Network with a Genetic Algorithm Susmita Mohapatra Department of Computer Science, Utkal University, India Abstract: This paper is focused on the implementation

More information

Associative Cellular Learning Automata and its Applications

Associative Cellular Learning Automata and its Applications Associative Cellular Learning Automata and its Applications Meysam Ahangaran and Nasrin Taghizadeh and Hamid Beigy Department of Computer Engineering, Sharif University of Technology, Tehran, Iran ahangaran@iust.ac.ir,

More information

How do microarrays work

How do microarrays work Lecture 3 (continued) Alvis Brazma European Bioinformatics Institute How do microarrays work condition mrna cdna hybridise to microarray condition Sample RNA extract labelled acid acid acid nucleic acid

More information

Research Article Using the ACS Approach to Solve Continuous Mathematical Problems in Engineering

Research Article Using the ACS Approach to Solve Continuous Mathematical Problems in Engineering Mathematical Problems in Engineering, Article ID 142194, 7 pages http://dxdoiorg/101155/2014/142194 Research Article Using the ACS Approach to Solve Continuous Mathematical Problems in Engineering Min-Thai

More information

Adaptive Metric Nearest Neighbor Classification

Adaptive Metric Nearest Neighbor Classification Adaptive Metric Nearest Neighbor Classification Carlotta Domeniconi Jing Peng Dimitrios Gunopulos Computer Science Department Computer Science Department Computer Science Department University of California

More information

A Naïve Soft Computing based Approach for Gene Expression Data Analysis

A Naïve Soft Computing based Approach for Gene Expression Data Analysis Available online at www.sciencedirect.com Procedia Engineering 38 (2012 ) 2124 2128 International Conference on Modeling Optimization and Computing (ICMOC-2012) A Naïve Soft Computing based Approach for

More information

Exploiting the Scale-free Structure of the WWW

Exploiting the Scale-free Structure of the WWW Exploiting the Scale-free Structure of the WWW Niina Päivinen Department of Computer Science, University of Kuopio P.O. Box 1627, FIN-70211 Kuopio, Finland email niina.paivinen@cs.uku.fi tel. +358-17-16

More information

A Comparative study of Clustering Algorithms using MapReduce in Hadoop

A Comparative study of Clustering Algorithms using MapReduce in Hadoop A Comparative study of Clustering Algorithms using MapReduce in Hadoop Dweepna Garg 1, Khushboo Trivedi 2, B.B.Panchal 3 1 Department of Computer Science and Engineering, Parul Institute of Engineering

More information

http://www.cmplx.cse.nagoya-u.ac.jp/~fuzzdata/ Professor Takeshi Furuhashi Associate Professor Tomohiro Yoshikawa A student in doctor course 6 students in master course 2 undergraduate students For contact

More information

APPLICATION OF MULTIPLE RANDOM CENTROID (MRC) BASED K-MEANS CLUSTERING ALGORITHM IN INSURANCE A REVIEW ARTICLE

APPLICATION OF MULTIPLE RANDOM CENTROID (MRC) BASED K-MEANS CLUSTERING ALGORITHM IN INSURANCE A REVIEW ARTICLE APPLICATION OF MULTIPLE RANDOM CENTROID (MRC) BASED K-MEANS CLUSTERING ALGORITHM IN INSURANCE A REVIEW ARTICLE Sundari NallamReddy, Samarandra Behera, Sanjeev Karadagi, Dr. Anantha Desik ABSTRACT: Tata

More information

Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques

Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques 24 Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques Ruxandra PETRE

More information

S. Sreenivasan Research Scholar, School of Advanced Sciences, VIT University, Chennai Campus, Vandalur-Kelambakkam Road, Chennai, Tamil Nadu, India

S. Sreenivasan Research Scholar, School of Advanced Sciences, VIT University, Chennai Campus, Vandalur-Kelambakkam Road, Chennai, Tamil Nadu, India International Journal of Civil Engineering and Technology (IJCIET) Volume 9, Issue 10, October 2018, pp. 1322 1330, Article ID: IJCIET_09_10_132 Available online at http://www.iaeme.com/ijciet/issues.asp?jtype=ijciet&vtype=9&itype=10

More information

Introduction to Artificial Intelligence

Introduction to Artificial Intelligence Introduction to Artificial Intelligence COMP307 Machine Learning 2: 3-K Techniques Yi Mei yi.mei@ecs.vuw.ac.nz 1 Outline K-Nearest Neighbour method Classification (Supervised learning) Basic NN (1-NN)

More information

Solving the Traveling Salesman Problem using Reinforced Ant Colony Optimization techniques

Solving the Traveling Salesman Problem using Reinforced Ant Colony Optimization techniques Solving the Traveling Salesman Problem using Reinforced Ant Colony Optimization techniques N.N.Poddar 1, D. Kaur 2 1 Electrical Engineering and Computer Science, University of Toledo, Toledo, OH, USA 2

More information

Unsupervised Learning : Clustering

Unsupervised Learning : Clustering Unsupervised Learning : Clustering Things to be Addressed Traditional Learning Models. Cluster Analysis K-means Clustering Algorithm Drawbacks of traditional clustering algorithms. Clustering as a complex

More information

A Formal Approach to Score Normalization for Meta-search

A Formal Approach to Score Normalization for Meta-search A Formal Approach to Score Normalization for Meta-search R. Manmatha and H. Sever Center for Intelligent Information Retrieval Computer Science Department University of Massachusetts Amherst, MA 01003

More information

Cluster Analysis. Mu-Chun Su. Department of Computer Science and Information Engineering National Central University 2003/3/11 1

Cluster Analysis. Mu-Chun Su. Department of Computer Science and Information Engineering National Central University 2003/3/11 1 Cluster Analysis Mu-Chun Su Department of Computer Science and Information Engineering National Central University 2003/3/11 1 Introduction Cluster analysis is the formal study of algorithms and methods

More information

A Bounded Index for Cluster Validity

A Bounded Index for Cluster Validity A Bounded Index for Cluster Validity Sandro Saitta, Benny Raphael, and Ian F.C. Smith Ecole Polytechnique Fédérale de Lausanne (EPFL) Station 18, 1015 Lausanne, Switzerland sandro.saitta@epfl.ch,bdgbr@nus.edu.sg,ian.smith@epfl.ch

More information

Ranking Clustered Data with Pairwise Comparisons

Ranking Clustered Data with Pairwise Comparisons Ranking Clustered Data with Pairwise Comparisons Alisa Maas ajmaas@cs.wisc.edu 1. INTRODUCTION 1.1 Background Machine learning often relies heavily on being able to rank the relative fitness of instances

More information

INF 4300 Classification III Anne Solberg The agenda today:

INF 4300 Classification III Anne Solberg The agenda today: INF 4300 Classification III Anne Solberg 28.10.15 The agenda today: More on estimating classifier accuracy Curse of dimensionality and simple feature selection knn-classification K-means clustering 28.10.15

More information

10701 Machine Learning. Clustering

10701 Machine Learning. Clustering 171 Machine Learning Clustering What is Clustering? Organizing data into clusters such that there is high intra-cluster similarity low inter-cluster similarity Informally, finding natural groupings among

More information

A Comparison of Global and Local Probabilistic Approximations in Mining Data with Many Missing Attribute Values

A Comparison of Global and Local Probabilistic Approximations in Mining Data with Many Missing Attribute Values A Comparison of Global and Local Probabilistic Approximations in Mining Data with Many Missing Attribute Values Patrick G. Clark Department of Electrical Eng. and Computer Sci. University of Kansas Lawrence,

More information

C-NBC: Neighborhood-Based Clustering with Constraints

C-NBC: Neighborhood-Based Clustering with Constraints C-NBC: Neighborhood-Based Clustering with Constraints Piotr Lasek Chair of Computer Science, University of Rzeszów ul. Prof. St. Pigonia 1, 35-310 Rzeszów, Poland lasek@ur.edu.pl Abstract. Clustering is

More information

Weighting and selection of features.

Weighting and selection of features. Intelligent Information Systems VIII Proceedings of the Workshop held in Ustroń, Poland, June 14-18, 1999 Weighting and selection of features. Włodzisław Duch and Karol Grudziński Department of Computer

More information

Cluster Analysis. Angela Montanari and Laura Anderlucci

Cluster Analysis. Angela Montanari and Laura Anderlucci Cluster Analysis Angela Montanari and Laura Anderlucci 1 Introduction Clustering a set of n objects into k groups is usually moved by the aim of identifying internally homogenous groups according to a

More information

A Framework for adaptive focused web crawling and information retrieval using genetic algorithms

A Framework for adaptive focused web crawling and information retrieval using genetic algorithms A Framework for adaptive focused web crawling and information retrieval using genetic algorithms Kevin Sebastian Dept of Computer Science, BITS Pilani kevseb1993@gmail.com 1 Abstract The web is undeniably

More information

A GENETIC ALGORITHM FOR CLUSTERING ON VERY LARGE DATA SETS

A GENETIC ALGORITHM FOR CLUSTERING ON VERY LARGE DATA SETS A GENETIC ALGORITHM FOR CLUSTERING ON VERY LARGE DATA SETS Jim Gasvoda and Qin Ding Department of Computer Science, Pennsylvania State University at Harrisburg, Middletown, PA 17057, USA {jmg289, qding}@psu.edu

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervised Learning and Clustering Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2008 CS 551, Spring 2008 c 2008, Selim Aksoy (Bilkent University)

More information

1 Case study of SVM (Rob)

1 Case study of SVM (Rob) DRAFT a final version will be posted shortly COS 424: Interacting with Data Lecturer: Rob Schapire and David Blei Lecture # 8 Scribe: Indraneel Mukherjee March 1, 2007 In the previous lecture we saw how

More information

Review of the Robust K-means Algorithm and Comparison with Other Clustering Methods

Review of the Robust K-means Algorithm and Comparison with Other Clustering Methods Review of the Robust K-means Algorithm and Comparison with Other Clustering Methods Ben Karsin University of Hawaii at Manoa Information and Computer Science ICS 63 Machine Learning Fall 8 Introduction

More information

6. Dicretization methods 6.1 The purpose of discretization

6. Dicretization methods 6.1 The purpose of discretization 6. Dicretization methods 6.1 The purpose of discretization Often data are given in the form of continuous values. If their number is huge, model building for such data can be difficult. Moreover, many

More information

University of Florida CISE department Gator Engineering. Clustering Part 2

University of Florida CISE department Gator Engineering. Clustering Part 2 Clustering Part 2 Dr. Sanjay Ranka Professor Computer and Information Science and Engineering University of Florida, Gainesville Partitional Clustering Original Points A Partitional Clustering Hierarchical

More information

Statistical Analysis of Metabolomics Data. Xiuxia Du Department of Bioinformatics & Genomics University of North Carolina at Charlotte

Statistical Analysis of Metabolomics Data. Xiuxia Du Department of Bioinformatics & Genomics University of North Carolina at Charlotte Statistical Analysis of Metabolomics Data Xiuxia Du Department of Bioinformatics & Genomics University of North Carolina at Charlotte Outline Introduction Data pre-treatment 1. Normalization 2. Centering,

More information

Fast Associative Memory

Fast Associative Memory Fast Associative Memory Ricardo Miguel Matos Vieira Instituto Superior Técnico ricardo.vieira@tagus.ist.utl.pt ABSTRACT The associative memory concept presents important advantages over the more common

More information

Unsupervised Learning

Unsupervised Learning Unsupervised Learning Unsupervised learning Until now, we have assumed our training samples are labeled by their category membership. Methods that use labeled samples are said to be supervised. However,

More information

Two-step Modified SOM for Parallel Calculation

Two-step Modified SOM for Parallel Calculation Two-step Modified SOM for Parallel Calculation Two-step Modified SOM for Parallel Calculation Petr Gajdoš and Pavel Moravec Petr Gajdoš and Pavel Moravec Department of Computer Science, FEECS, VŠB Technical

More information

CHAPTER 6 HYBRID AI BASED IMAGE CLASSIFICATION TECHNIQUES

CHAPTER 6 HYBRID AI BASED IMAGE CLASSIFICATION TECHNIQUES CHAPTER 6 HYBRID AI BASED IMAGE CLASSIFICATION TECHNIQUES 6.1 INTRODUCTION The exploration of applications of ANN for image classification has yielded satisfactory results. But, the scope for improving

More information

ORT EP R RCH A ESE R P A IDI! " #$$% &' (# $!"

ORT EP R RCH A ESE R P A IDI!  #$$% &' (# $! R E S E A R C H R E P O R T IDIAP A Parallel Mixture of SVMs for Very Large Scale Problems Ronan Collobert a b Yoshua Bengio b IDIAP RR 01-12 April 26, 2002 Samy Bengio a published in Neural Computation,

More information

A Software Testing Optimization Method Based on Negative Association Analysis Lin Wan 1, Qiuling Fan 1,Qinzhao Wang 2

A Software Testing Optimization Method Based on Negative Association Analysis Lin Wan 1, Qiuling Fan 1,Qinzhao Wang 2 International Conference on Automation, Mechanical Control and Computational Engineering (AMCCE 2015) A Software Testing Optimization Method Based on Negative Association Analysis Lin Wan 1, Qiuling Fan

More information

Classification of Soil and Vegetation by Fuzzy K-means Classification and Particle Swarm Optimization

Classification of Soil and Vegetation by Fuzzy K-means Classification and Particle Swarm Optimization Classification of Soil and Vegetation by Fuzzy K-means Classification and Particle Swarm Optimization M. Chapron ETIS, ENSEA, UCP, CNRS, 6 avenue du ponceau 95014 Cergy-Pontoise, France chapron@ensea.fr

More information

Information Retrieval and Web Search Engines

Information Retrieval and Web Search Engines Information Retrieval and Web Search Engines Lecture 7: Document Clustering May 25, 2011 Wolf-Tilo Balke and Joachim Selke Institut für Informationssysteme Technische Universität Braunschweig Homework

More information

Multivariate Analysis

Multivariate Analysis Multivariate Analysis Cluster Analysis Prof. Dr. Anselmo E de Oliveira anselmo.quimica.ufg.br anselmo.disciplinas@gmail.com Unsupervised Learning Cluster Analysis Natural grouping Patterns in the data

More information

Ant Colonies, Self-Organizing Maps, and A Hybrid Classification Model

Ant Colonies, Self-Organizing Maps, and A Hybrid Classification Model Proceedings of Student/Faculty Research Day, CSIS, Pace University, May 7th, 2004 Ant Colonies, Self-Organizing Maps, and A Hybrid Classification Model Michael L. Gargano, Lorraine L. Lurie, Lixin Tao,

More information

Texture Image Segmentation using FCM

Texture Image Segmentation using FCM Proceedings of 2012 4th International Conference on Machine Learning and Computing IPCSIT vol. 25 (2012) (2012) IACSIT Press, Singapore Texture Image Segmentation using FCM Kanchan S. Deshmukh + M.G.M

More information

MITOCW ocw f99-lec07_300k

MITOCW ocw f99-lec07_300k MITOCW ocw-18.06-f99-lec07_300k OK, here's linear algebra lecture seven. I've been talking about vector spaces and specially the null space of a matrix and the column space of a matrix. What's in those

More information

L9: Hierarchical Clustering

L9: Hierarchical Clustering L9: Hierarchical Clustering This marks the beginning of the clustering section. The basic idea is to take a set X of items and somehow partition X into subsets, so each subset has similar items. Obviously,

More information