HARD, SOFT AND FUZZY C-MEANS CLUSTERING TECHNIQUES FOR TEXT CLASSIFICATION
|
|
- Mark Bridges
- 5 years ago
- Views:
Transcription
1 HARD, SOFT AND FUZZY C-MEANS CLUSTERING TECHNIQUES FOR TEXT CLASSIFICATION 1 M.S.Rekha, 2 S.G.Nawaz 1 PG SCALOR, CSE, SRI KRISHNADEVARAYA ENGINEERING COLLEGE, GOOTY 2 ASSOCIATE PROFESSOR, SRI KRISHNADEVARAYA ENGINEERING COLLEGE, GOOTY Abstract:-The Multiple Prototype Fuzzy Clustering Model (FCMP), introduced by Nascimento, Mirkin and Moura-Pires (1999), proposes a framework for partitional fuzzy clustering which suggests a model of how the data are generated from a cluster structure to be identified. In the model, it is assumed that the membership of each entity to a cluster expresses a part of the cluster prototype re ected in the entity. In this paper we extend the FCMP framework to a number of clustering criteria, and study the FCMP properties on fitting the underlying proposed model from which data is generated. A comparative study with the Fuzzy c-means algorithm is also presented. This paper presents Centre-based clustering approaches for clustering Y-STR data. The main goal is to investigate and observe the performance of the fundamental clustering approaches when partitioning Y-STR data. Two fundamental Centre-based hard clustering approaches, k-means and k-modes algorithms, and two fundamental Centre-based soft clustering approaches, fuzzy k-means and fuzzy k-modes algorithms were chosen for evaluation of Y-STR haplogroup and Y-STR Surname datasets. The results show that the soft k-means clustering algorithm produces the best average of the clustering accuracy (99.62%) for Y-STR haplogroup data as well Y-STR surname data (97.61%). The overall results show that the soft clustering approach is better (92.11%) than the hard clustering approach (81.20%) in clustering Y-STR data. However, the approach for clustering Y-STR data should be further investigated to find the best way of achieving 100% of the clustering results. Keywords: Soft Clustering, Hard Clustering, Fuzzy c-means Clustering, and text classification. 1.Introduction: Partitional clustering essentially deals with the task of partitioning a set of entities into a number of homogeneous clusters, with respect to a suitable similarity measure. Due to the fuzzy nature of many practical problems, a number of fuzzy clustering methods have been developed following the general fuzzy set theory strategies outlined by Zadeh, [1]. The main difference between the traditional hard clustering and fuzzy clustering can be stated as follows. While in hard clustering an entity belongs only to one cluster, in fuzzy clustering entities are allowed to belong to many clusters with different degrees of membership. The most known method of fuzzy clustering is the Fuzzy c- Means method (FCM), initially proposed by Dunn [2] and generalized by Bezdek [3],[4] and other authors [5],[6] (see [7] for an overview). Usually, membership functions are de ned based on a distance function, such that membership degrees express proximities of entities to cluster centers (i.e. prototypes). By choosing a suitable distance function (see [6],[8], [9]) different cluster shapes can be identified. However, these approaches typically fail to explicitly describe how the fuzzy cluster structure relates to the data from which it is derived. Nascimento, Mirkin and Moura-Pires [10] proposed a framework for fuzzy clustering based on a model of how the data is generated from a cluster structure to be identified. In this approach, the underlying fuzzy c-partition is supposed to be de ned in such a way that the membership of an entity to a cluster expresses a part of the cluster s prototype re ected in the entity. This way, an entity may bear 60% of a prototype A and 40% of prototype B, which simultaneously expresses the entity s membership to the respective clusters. The prototypes are considered as offered by the knowledge domain. This idea was initially proposed by Mirkin and Satarov as the socalled ideal type fuzzy clustering model [11] (see also [12]), such that each observed entity is defined as a convex combination of the prototypes, and the coe cients are the entity membership values. In our work, we consider a different way for pertaining observed entities to the prototypes: any entity may independently relate to any prototype, which is similar to the assumption in FCM criterion. The model is called the Fuzzy Clustering Multiple Prototype (FCMP) model. In this paper we extend the FCMP framework to a number of clustering criteria and present the main results of the study of the FCMP as a model-based 213
2 approach for clustering as well as its comparison with the Hard, Soft and FCM algorithm. 1. Centre Based Hard and Soft Clustering Approaches The centre-based clustering has been evolving significantly, even though the results have not reached 100% of the clustering accuracy for all benchmarks datasets yet. The trend now has shifted from the hard clustering approaches to the soft clustering approaches. The soft clustering approaches seem to be a promising approach particularly in dealing with categorical data (See Ng and Li (2009) and Kim et al., (2007)). The hard clustering is sometimes called non-fuzzy clustering, whereas the soft clustering is referred to fuzzy clustering. From a general perspective, the hard and soft clustering can be seen as differing in the assigning of values for a partition matrix. The hard clustering approach only assigns a value of 1 or 0. In contrast, the soft clustering is more relaxed, allowing the values to be part of more than one cluster. The higher the value is, the higher the degree of confidence that the objects belong to that cluster. The Centre-based clustering can be described as follows: Let us suppose that the objective is to partition a data set, D into cluster, C. Suppose that k is known as a priori. Let X ={X1, X2,, Xn} be set of data with set of attributes A ={A1,A2,, Am}. The partition of D, whether hard or soft partition is to minimize the cost function as Equation (1), and subject to Equation (2), (3) and (4). Where k( n ) is a known number of clusters, W is a (k x n) partition matrix, Z is [z 1,z 2, z k ] R mk and d (z l, x i ) is a dissimilarity measure between z l and x i. The algorithm can be generalized as: Step 1: Choose an initial point Z (1) R mk Determine W (1) such that F( W,Z (1) ) is minimized. Set t=1. Step 2: Determine Z (t=1) such that F( W t,z (t+1) ) is minimized. If F( W t,z (t+1) )= F( W t,z (t) ) then stop; otherwise goto step 3 Step 3: Determine W (t+1) such that F( W( t+1),z (t+1) ) minimized. If that F( W( t+1),z (t+1) )= that F( W( t),z (t+1) ) then stop; otherwise t=t+1 and goto step 2 From an optimization perspective, the main focus is to solve problem P as described by [13]. The problem P can be solved by iteratively solving the following two problems [14]: Problem P1: Fix Z = Z and solve the reduced problem P(W,Z ) Problem P2: Fix W = Ŵ and solve the reduced problem P( W, Z). Thus, the differences between the hard clustering and the soft clustering are as follows: In the hard clustering, the problem P1 is minimized by Equation (5). 214
3 whereas, in the soft clustering, the problem P1 is minimized by Equation (6). However, in the problem P2, the hard clustering is minimized according to the k-means and k-mode respectively. The k-means minimize as in Equation (7). The main difference between the k-means and k-modes algorithms is that, the k-means handles numerical data, whereas the k-mode handles categorical data. Thus, mean is a mechanism for k-means algorithm to update its centroids and mode for k -Modes algorithm. Consequently, the k-means uses Euclidean distance as in Equation (12) and the k-modes uses a simple dissimilarity measure, introduced by [15] as in Equation (13) and (14). whereas, in the k-modes minimize as in Equation (8). z lj =a j (r) (8) (r) where a j is the mode of attribute values of A j in cluster C l such that Further, in the soft clustering, for the Fuzzy k -Means, the minimize z is given by Equation (10). 2. FUZZY C-MEANS ALGORITHM The fuzzy c-means (FCM) algorithm [3] is one of the most widely used methods in fuzzy clustering. It is based on the concept of fuzzy c-partition, introduced by Ruspini [13], summarized as follows. LetX = [x 1..x n ] be a set of given data, where each data point x k (k = 1..n) is a vector in R p, Ucn be a set of real c x n matrices, and c be an integer, 2 c < n. Then, the fuzzy c-partition space for X is the set Where α [1, ) is a weighting exponent. In the Fuzzy k -Modes, the minimize z is given by Equation (11). (15) 215
4 where u ik is the membership value of x k in cluster i (i =1,, c). The aim of the FCM algorithm is to find an optimal fuzzy c-partition and corresponding prototypes minimizing the objective function (16) In (16), V = (v 1, v 2,., v c ) is a matrix of unknown cluster centers (prototypes) vi 2 <p, k k is the Euclidean norm, and the weighting exponent m in [1, ) is a constant that influences the membership values. To minimize criterion Jm, under the fuzzy constraints defined in (15), the FCM algorithm is de ned as an alternating minimization algorithm (cf.[3] for the derivations), as follows. Choose a value for c; m and ", a small positive constant; then, generate randomly a fuzzy c-partition U0 and set iteration number t = 0. A two-step iterative process works as follows. Given the membership values u(t) ik, the cluster centers v(t) i ( i = 1,, c ) are calculated by (17) Given the new cluster centers v (t) i, update membership values u (t) ik: According to the discussion about the traditional FCM algorithm, the initial condition of cluster centers influences the performance of the algorithm. The best choice of the original cluster centers needs to consider the features of the data set. In this paper, meteorological data is chosen as our experimental data. Meteorological data is different from other experiment data. If we just use the traditional FCM algorithm to deal with the meteorological data, there will be a large error when clustering a certain object. To solve the initialization problem, we put forward an improved FCM algorithm in term of selecting the initial cluster centers. Nowadays, there are several methods to select the original cluster centers. In the following section we will go through some commonly used methods. (1) Randomly The traditional FCM algorithm determines initial cluster centers randomly. This method is simple and generally applicable to all data but usually causes local minima. (2) user-specified Normally, users decide original cluster centers by some priori knowledge. According to the understanding of the data, users always can obtain logical cluster centers to achieve the purpose of the global optimum. (3) Randomly classify objects into several clusters, compute the center of each cluster and determine them as cluster centers More time consumption is spent when randomly classifying objects in this method. When the number of objects in data sets is very small, the cost of time can be ignored. Nevertheless, as the number of objects increases, the speed of the increasing cost of time can be largely rapid. (18) The process stops when U (t+1) - U (t) = ", or a predefined number of iterations is reached. 3. The Improved FCM Algorithm for Meteorological Data (4) Select the farthest points as cluster centers Generally speaking, this method selects initial cluster centers following to the maximum distance principle. It can achieve high efficiency if there are no outliers or noisy points in data sets. But if the data sets contain some outliers, outliers are easier to be chosen as the cluster centers. 216
5 (5) Select points with the maximum density The number of objects whose distance is less than the given radius from the observed object is defined as the density of the observed object. After computing the density of each object, the object whose density is the largest is chosen as the cluster center. Then compute densities of objects whose distances are larger than the given distance from the selected center centers, also choose the object whose density is the largest as the second cluster center. And so forth until the number of cluster centers reaches the given number. This method ensures that cluster centers are far away from each other to avoid the objective function into local minima. r d In our paper, we adopt a new method to determine cluster centers which is based on the fifth method as mentioned above. In our method, we first randomly select the observed object and compute the density of the observed object. If the density of the observed object is not less than the given density parameter, the observed object can be seen as the cluster center. Secondly we keep selecting the second cluster center satisfying the above constraints in the data set which excludes the objects which are cluster centers or objects whose distances are less than the given distance parameter. Finally we obtain the given number of cluster centers after repeating the above process. The distance parameter and the density parameter are decided by users according to the characteristics of the data sets and the priori knowledge. This selection strategy spends less time than the fifth method because time of computing densities of all objects in the data set is saved, while this method maintains the advantage of avoiding the object function into local minima. 4. Conclusion: The hard, soft and FCMP framework proposes a model of how the data are generated from a cluster structure to be identified. This implies direct interpretability of the fuzzy membership values, which should be considered a motivation for introducing the model-based methods. Based on the experimental results obtained in this research, the following can be stated. The FCMP-2 algorithm is able to restore the original prototypes from which data have been generated, and FCMP-0 can be viewed as a device for estimating the number of clusters in the underlying structure to be found. For small dimension spaces, FCMP-1 is an intermediate model between FCMP-0 and FCMP-2, and can be viewed as a model based parallel to FCM. On the high dimension data, FCMP-1 degenerates in a hard clustering approach. Also, FCM drastically decreases the number of prototypes in the high dimension spaces (at least with the proposed data generator). This model-based clustering approach seems appealing in the sense that, on doing cluster analysis, the experts of a knowledge domain usually have conceptual understanding of how the domain is organized in terms of tentative prototypes. This knowledge may well serve as the initial setting for data based structurization of the domain. In such a case, the belongingness of data entities to clusters are based on how much they share the features of corresponding prototypes. This seems useful in such application areas as mental disorders in psychiatry or consumer behavior in marketing. However, the effective utility of the multiple prototypes model still remains to be demonstrated with real data. References: [1] L. Zadeh, Fuzzy sets, Information and Control, 8, pp , [2] J. Dunn, A fuzzy relative of the Isodata process and its use in detecting compact, well-separated clusters, Journal of Cybernetics, 3(3), pp , [3] J. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, New York, [4] J. Bezdek and R. Hathaway, Recent convergence results for the fuzzy c-means clustering algorithms, Journal of Classified- cation, 5(2), pp , [5] G. Gustafson and W. Kessel, Fuzzy clustering with a fuzzy covariance matrix, in Proc. IEEE CDC, pp , San Diego, [6] F. Klawonn and A. Keller, Fuzzy clustering based on modified distance measures, J. Kok, D. Hand and M. Berthold (eds.), Advances in Intelligent Data Analysis, Third International Symposium,(IDA 99), Lecture Notes in Computer Science, 1642, Springer-Verlag, pp , [7] R. Kruse, F. Hoppner, F. Klawonn and T. Runkler, Fuzzy Cluster Analysis, John Wiley and Sons, [8] R. Dave, Fuzzy shell-clustering and applications to circle detection of digital images, International Journal of General Systems, 16(4), pp ,
6 [9] L. Bobrowski and J. Bezdek, c-means with l1 and l1 norms, IEEE Transactions on Systems, Man and Cybernetics, 21(3), pp , 1991 [10] S. Nascimento, B. Mirkin and F. Moura-Pires, Multiple prototypes model for fuzzy clustering, in J. Kok, D. Hand and M. Berthold, (eds.), Advances in Intelligent Data Analysis. Third International Symposium,(IDA 99), Lecture Notes in Computer Science, 1642, Springer-Verlag, pp , [11] B. Mirkin and G. Satarov, Method of fuzzy additive types for analysis of multidimensional data: I, II, Automation and Remote Control, 51(5, 6), pp , pp , [13] Bobrowski L, Bezdek JC (1991) c-means clustering with the l1 and l norms. IEEE Transactions on Systems, Man and Cybernetics. 21(3): Chaturvedi A, Foods K, Green JE (2001) K-modes Clustering. Journal of Classification, 18: [14] Huang Z (1998) Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Mining and Knowledge Discovery, 2: [15] Kaufman L and Rousseeuw PJ (1987) Clustering by means of medoids. Elsevier, [12] B. Mirkin, Mathematical Classi cation and Clustering, Kluwer Academic Publishers,
Fuzzy Clustering: Insights and a New Approach
Mathware & Soft Computing 11 2004) 125-142 Fuzzy Clustering: Insights and a New Approach F. Klawonn Department of Computer Science University of Applied Sciences Braunschweig/Wolfenbuettel Salzdahlumer
More informationEqui-sized, Homogeneous Partitioning
Equi-sized, Homogeneous Partitioning Frank Klawonn and Frank Höppner 2 Department of Computer Science University of Applied Sciences Braunschweig /Wolfenbüttel Salzdahlumer Str 46/48 38302 Wolfenbüttel,
More informationCluster Tendency Assessment for Fuzzy Clustering of Incomplete Data
EUSFLAT-LFA 2011 July 2011 Aix-les-Bains, France Cluster Tendency Assessment for Fuzzy Clustering of Incomplete Data Ludmila Himmelspach 1 Daniel Hommers 1 Stefan Conrad 1 1 Institute of Computer Science,
More informationAn Improved Fuzzy K-Medoids Clustering Algorithm with Optimized Number of Clusters
An Improved Fuzzy K-Medoids Clustering Algorithm with Optimized Number of Clusters Akhtar Sabzi Department of Information Technology Qom University, Qom, Iran asabzii@gmail.com Yaghoub Farjami Department
More informationA Fuzzy C-means Clustering Algorithm Based on Pseudo-nearest-neighbor Intervals for Incomplete Data
Journal of Computational Information Systems 11: 6 (2015) 2139 2146 Available at http://www.jofcis.com A Fuzzy C-means Clustering Algorithm Based on Pseudo-nearest-neighbor Intervals for Incomplete Data
More informationModeling Proportional Membership in Fuzzy Clustering
IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 11, NO. 2, APRIL 2003 173 Modeling Proportional Membership in Fuzzy Clustering Susana Nascimento, Boris Mirkin, and Fernando Moura-Pires Abstract To provide feedback
More informationCan Fuzzy Clustering Avoid Local Minima and Undesired Partitions?
Can Fuzzy Clustering Avoid Local Minima and Undesired Partitions? Balasubramaniam Jayaram and Frank Klawonn Abstract. Empirical evaluations and experience seem to provide evidence that fuzzy clustering
More informationKeywords - Fuzzy rule-based systems, clustering, system design
CHAPTER 7 Application of Fuzzy Rule Base Design Method Peter Grabusts In many classification tasks the final goal is usually to determine classes of objects. The final goal of fuzzy clustering is also
More informationCollaborative Rough Clustering
Collaborative Rough Clustering Sushmita Mitra, Haider Banka, and Witold Pedrycz Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India {sushmita, hbanka r}@isical.ac.in Dept. of Electrical
More informationFUZZY KERNEL K-MEDOIDS ALGORITHM FOR MULTICLASS MULTIDIMENSIONAL DATA CLASSIFICATION
FUZZY KERNEL K-MEDOIDS ALGORITHM FOR MULTICLASS MULTIDIMENSIONAL DATA CLASSIFICATION 1 ZUHERMAN RUSTAM, 2 AINI SURI TALITA 1 Senior Lecturer, Department of Mathematics, Faculty of Mathematics and Natural
More informationColour Image Segmentation Using K-Means, Fuzzy C-Means and Density Based Clustering
Colour Image Segmentation Using K-Means, Fuzzy C-Means and Density Based Clustering Preeti1, Assistant Professor Kompal Ahuja2 1,2 DCRUST, Murthal, Haryana (INDIA) DITM, Gannaur, Haryana (INDIA) Abstract:
More informationSingle Cluster Visualization to Optimize Air Traffic Management
Single Cluster Visualization to Optimize Air Traffic Management Frank Rehm, Frank Klawonn 2, and Rudolf Kruse 3 German Aerospace Center Lilienthalplatz 7 3808 Braunschweig, Germany frank.rehm@dlr.de 2
More informationEfficiency of k-means and K-Medoids Algorithms for Clustering Arbitrary Data Points
Efficiency of k-means and K-Medoids Algorithms for Clustering Arbitrary Data Points Dr. T. VELMURUGAN Associate professor, PG and Research Department of Computer Science, D.G.Vaishnav College, Chennai-600106,
More informationOverlapping Clustering: A Review
Overlapping Clustering: A Review SAI Computing Conference 2016 Said Baadel Canadian University Dubai University of Huddersfield Huddersfield, UK Fadi Thabtah Nelson Marlborough Institute of Technology
More informationFuzzy Segmentation. Chapter Introduction. 4.2 Unsupervised Clustering.
Chapter 4 Fuzzy Segmentation 4. Introduction. The segmentation of objects whose color-composition is not common represents a difficult task, due to the illumination and the appropriate threshold selection
More informationComparative Study of Different Clustering Algorithms
Comparative Study of Different Clustering Algorithms A.J.Patil 1, C.S.Patil 2, R.R.Karhe 3, M.A.Aher 4 Department of E&TC, SGDCOE (Jalgaon), Maharashtra, India 1,2,3,4 ABSTRACT:This paper presents a detailed
More informationA fuzzy k-modes algorithm for clustering categorical data. Citation IEEE Transactions on Fuzzy Systems, 1999, v. 7 n. 4, p.
Title A fuzzy k-modes algorithm for clustering categorical data Author(s) Huang, Z; Ng, MKP Citation IEEE Transactions on Fuzzy Systems, 1999, v. 7 n. 4, p. 446-452 Issued Date 1999 URL http://hdl.handle.net/10722/42992
More informationClustering. Supervised vs. Unsupervised Learning
Clustering Supervised vs. Unsupervised Learning So far we have assumed that the training samples used to design the classifier were labeled by their class membership (supervised learning) We assume now
More informationSupervised vs. Unsupervised Learning
Clustering Supervised vs. Unsupervised Learning So far we have assumed that the training samples used to design the classifier were labeled by their class membership (supervised learning) We assume now
More informationFuzzy clustering with volume prototypes and adaptive cluster merging
Fuzzy clustering with volume prototypes and adaptive cluster merging Kaymak, U and Setnes, M http://dx.doi.org/10.1109/tfuzz.2002.805901 Title Authors Type URL Fuzzy clustering with volume prototypes and
More informationUnsupervised Learning : Clustering
Unsupervised Learning : Clustering Things to be Addressed Traditional Learning Models. Cluster Analysis K-means Clustering Algorithm Drawbacks of traditional clustering algorithms. Clustering as a complex
More informationPerformance Measure of Hard c-means,fuzzy c-means and Alternative c-means Algorithms
Performance Measure of Hard c-means,fuzzy c-means and Alternative c-means Algorithms Binoda Nand Prasad*, Mohit Rathore**, Geeta Gupta***, Tarandeep Singh**** *Guru Gobind Singh Indraprastha University,
More informationKeywords Clustering, Goals of clustering, clustering techniques, clustering algorithms.
Volume 3, Issue 5, May 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com A Survey of Clustering
More informationMethods for Intelligent Systems
Methods for Intelligent Systems Lecture Notes on Clustering (II) Davide Eynard eynard@elet.polimi.it Department of Electronics and Information Politecnico di Milano Davide Eynard - Lecture Notes on Clustering
More informationA SURVEY ON CLUSTERING ALGORITHMS Ms. Kirti M. Patil 1 and Dr. Jagdish W. Bakal 2
Ms. Kirti M. Patil 1 and Dr. Jagdish W. Bakal 2 1 P.G. Scholar, Department of Computer Engineering, ARMIET, Mumbai University, India 2 Principal of, S.S.J.C.O.E, Mumbai University, India ABSTRACT Now a
More informationAnalyzing Outlier Detection Techniques with Hybrid Method
Analyzing Outlier Detection Techniques with Hybrid Method Shruti Aggarwal Assistant Professor Department of Computer Science and Engineering Sri Guru Granth Sahib World University. (SGGSWU) Fatehgarh Sahib,
More informationS. Sreenivasan Research Scholar, School of Advanced Sciences, VIT University, Chennai Campus, Vandalur-Kelambakkam Road, Chennai, Tamil Nadu, India
International Journal of Civil Engineering and Technology (IJCIET) Volume 9, Issue 10, October 2018, pp. 1322 1330, Article ID: IJCIET_09_10_132 Available online at http://www.iaeme.com/ijciet/issues.asp?jtype=ijciet&vtype=9&itype=10
More informationCluster Analysis. Ying Shen, SSE, Tongji University
Cluster Analysis Ying Shen, SSE, Tongji University Cluster analysis Cluster analysis groups data objects based only on the attributes in the data. The main objective is that The objects within a group
More informationINF4820 Algorithms for AI and NLP. Evaluating Classifiers Clustering
INF4820 Algorithms for AI and NLP Evaluating Classifiers Clustering Erik Velldal & Stephan Oepen Language Technology Group (LTG) September 23, 2015 Agenda Last week Supervised vs unsupervised learning.
More informationINF4820. Clustering. Erik Velldal. Nov. 17, University of Oslo. Erik Velldal INF / 22
INF4820 Clustering Erik Velldal University of Oslo Nov. 17, 2009 Erik Velldal INF4820 1 / 22 Topics for Today More on unsupervised machine learning for data-driven categorization: clustering. The task
More informationNovel Intuitionistic Fuzzy C-Means Clustering for Linearly and Nonlinearly Separable Data
Novel Intuitionistic Fuzzy C-Means Clustering for Linearly and Nonlinearly Separable Data PRABHJOT KAUR DR. A. K. SONI DR. ANJANA GOSAIN Department of IT, MSIT Department of Computers University School
More informationFuzzy Clustering Methods in Multispectral Satellite Image Segmentation
Fuzzy Clustering Methods in Multispectral Satellite Image Segmentation Rauf Kh. Sadykhov 1, Andrey V. Dorogush 1 and Leonid P. Podenok 2 1 Computer Systems Department, Belarusian State University of Informatics
More informationUnsupervised Learning
Outline Unsupervised Learning Basic concepts K-means algorithm Representation of clusters Hierarchical clustering Distance functions Which clustering algorithm to use? NN Supervised learning vs. unsupervised
More informationEE 589 INTRODUCTION TO ARTIFICIAL NETWORK REPORT OF THE TERM PROJECT REAL TIME ODOR RECOGNATION SYSTEM FATMA ÖZYURT SANCAR
EE 589 INTRODUCTION TO ARTIFICIAL NETWORK REPORT OF THE TERM PROJECT REAL TIME ODOR RECOGNATION SYSTEM FATMA ÖZYURT SANCAR 1.Introductıon. 2.Multi Layer Perception.. 3.Fuzzy C-Means Clustering.. 4.Real
More informationSimilarity Measures of Pentagonal Fuzzy Numbers
Volume 119 No. 9 2018, 165-175 ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version) url: http://www.ijpam.eu ijpam.eu Similarity Measures of Pentagonal Fuzzy Numbers T. Pathinathan 1 and
More informationFuzzy C-MeansC. By Balaji K Juby N Zacharias
Fuzzy C-MeansC By Balaji K Juby N Zacharias What is Clustering? Clustering of data is a method by which large sets of data is grouped into clusters of smaller sets of similar data. Example: The balls of
More informationCHAPTER 4 FUZZY LOGIC, K-MEANS, FUZZY C-MEANS AND BAYESIAN METHODS
CHAPTER 4 FUZZY LOGIC, K-MEANS, FUZZY C-MEANS AND BAYESIAN METHODS 4.1. INTRODUCTION This chapter includes implementation and testing of the student s academic performance evaluation to achieve the objective(s)
More informationGeneralized Fuzzy Clustering Model with Fuzzy C-Means
Generalized Fuzzy Clustering Model with Fuzzy C-Means Hong Jiang 1 1 Computer Science and Engineering, University of South Carolina, Columbia, SC 29208, US jiangh@cse.sc.edu http://www.cse.sc.edu/~jiangh/
More informationTexture Image Segmentation using FCM
Proceedings of 2012 4th International Conference on Machine Learning and Computing IPCSIT vol. 25 (2012) (2012) IACSIT Press, Singapore Texture Image Segmentation using FCM Kanchan S. Deshmukh + M.G.M
More informationHFCT: A Hybrid Fuzzy Clustering Method for Collaborative Tagging
007 International Conference on Convergence Information Technology HFCT: A Hybrid Fuzzy Clustering Method for Collaborative Tagging Lixin Han,, Guihai Chen Department of Computer Science and Engineering,
More informationTable of Contents. Recognition of Facial Gestures... 1 Attila Fazekas
Table of Contents Recognition of Facial Gestures...................................... 1 Attila Fazekas II Recognition of Facial Gestures Attila Fazekas University of Debrecen, Institute of Informatics
More informationInternational Journal Of Engineering And Computer Science ISSN: Volume 5 Issue 11 Nov. 2016, Page No.
www.ijecs.in International Journal Of Engineering And Computer Science ISSN: 2319-7242 Volume 5 Issue 11 Nov. 2016, Page No. 19054-19062 Review on K-Mode Clustering Antara Prakash, Simran Kalera, Archisha
More informationSYDE Winter 2011 Introduction to Pattern Recognition. Clustering
SYDE 372 - Winter 2011 Introduction to Pattern Recognition Clustering Alexander Wong Department of Systems Design Engineering University of Waterloo Outline 1 2 3 4 5 All the approaches we have learned
More informationRedefining and Enhancing K-means Algorithm
Redefining and Enhancing K-means Algorithm Nimrat Kaur Sidhu 1, Rajneet kaur 2 Research Scholar, Department of Computer Science Engineering, SGGSWU, Fatehgarh Sahib, Punjab, India 1 Assistant Professor,
More informationOn Sample Weighted Clustering Algorithm using Euclidean and Mahalanobis Distances
International Journal of Statistics and Systems ISSN 0973-2675 Volume 12, Number 3 (2017), pp. 421-430 Research India Publications http://www.ripublication.com On Sample Weighted Clustering Algorithm using
More informationMachine Learning & Statistical Models
Astroinformatics Machine Learning & Statistical Models Neural Networks Feed Forward Hybrid Decision Analysis Decision Trees Random Decision Forests Evolving Trees Minimum Spanning Trees Perceptron Multi
More informationComponent grouping for GT applications - a fuzzy clustering approach with validity measure
Loughborough University Institutional Repository Component grouping for GT applications - a fuzzy clustering approach with validity measure This item was submitted to Loughborough University's Institutional
More informationQUALITATIVE MODELING FOR MAGNETIZATION CURVE
Journal of Marine Science and Technology, Vol. 8, No. 2, pp. 65-70 (2000) 65 QUALITATIVE MODELING FOR MAGNETIZATION CURVE Pei-Hwa Huang and Yu-Shuo Chang Keywords: Magnetization curve, Qualitative modeling,
More informationCHAPTER 4: CLUSTER ANALYSIS
CHAPTER 4: CLUSTER ANALYSIS WHAT IS CLUSTER ANALYSIS? A cluster is a collection of data-objects similar to one another within the same group & dissimilar to the objects in other groups. Cluster analysis
More informationINF4820 Algorithms for AI and NLP. Evaluating Classifiers Clustering
INF4820 Algorithms for AI and NLP Evaluating Classifiers Clustering Murhaf Fares & Stephan Oepen Language Technology Group (LTG) September 27, 2017 Today 2 Recap Evaluation of classifiers Unsupervised
More informationCSE 5243 INTRO. TO DATA MINING
CSE 5243 INTRO. TO DATA MINING Cluster Analysis: Basic Concepts and Methods Huan Sun, CSE@The Ohio State University 09/25/2017 Slides adapted from UIUC CS412, Fall 2017, by Prof. Jiawei Han 2 Chapter 10.
More informationANALYSIS AND REASONING OF DATA IN THE DATABASE USING FUZZY SYSTEM MODELLING
ANALYSIS AND REASONING OF DATA IN THE DATABASE USING FUZZY SYSTEM MODELLING Dr.E.N.Ganesh Dean, School of Engineering, VISTAS Chennai - 600117 Abstract In this paper a new fuzzy system modeling algorithm
More informationCSE 5243 INTRO. TO DATA MINING
CSE 5243 INTRO. TO DATA MINING Cluster Analysis: Basic Concepts and Methods Huan Sun, CSE@The Ohio State University Slides adapted from UIUC CS412, Fall 2017, by Prof. Jiawei Han 2 Chapter 10. Cluster
More informationThe Use of Biplot Analysis and Euclidean Distance with Procrustes Measure for Outliers Detection
Volume-8, Issue-1 February 2018 International Journal of Engineering and Management Research Page Number: 194-200 The Use of Biplot Analysis and Euclidean Distance with Procrustes Measure for Outliers
More informationPattern Clustering with Similarity Measures
Pattern Clustering with Similarity Measures Akula Ratna Babu 1, Miriyala Markandeyulu 2, Bussa V R R Nagarjuna 3 1 Pursuing M.Tech(CSE), Vignan s Lara Institute of Technology and Science, Vadlamudi, Guntur,
More informationClustering and Dissimilarity Measures. Clustering. Dissimilarity Measures. Cluster Analysis. Perceptually-Inspired Measures
Clustering and Dissimilarity Measures Clustering APR Course, Delft, The Netherlands Marco Loog May 19, 2008 1 What salient structures exist in the data? How many clusters? May 19, 2008 2 Cluster Analysis
More informationData Mining Approaches to Characterize Batch Process Operations
Data Mining Approaches to Characterize Batch Process Operations Rodolfo V. Tona V., Antonio Espuña and Luis Puigjaner * Universitat Politècnica de Catalunya, Chemical Engineering Department. Diagonal 647,
More informationFuzzy Ant Clustering by Centroid Positioning
Fuzzy Ant Clustering by Centroid Positioning Parag M. Kanade and Lawrence O. Hall Computer Science & Engineering Dept University of South Florida, Tampa FL 33620 @csee.usf.edu Abstract We
More informationFuzzy C-means Clustering with Temporal-based Membership Function
Indian Journal of Science and Technology, Vol (S()), DOI:./ijst//viS/, December ISSN (Print) : - ISSN (Online) : - Fuzzy C-means Clustering with Temporal-based Membership Function Aseel Mousa * and Yuhanis
More informationClustering. CS294 Practical Machine Learning Junming Yin 10/09/06
Clustering CS294 Practical Machine Learning Junming Yin 10/09/06 Outline Introduction Unsupervised learning What is clustering? Application Dissimilarity (similarity) of objects Clustering algorithm K-means,
More informationECM A Novel On-line, Evolving Clustering Method and Its Applications
ECM A Novel On-line, Evolving Clustering Method and Its Applications Qun Song 1 and Nikola Kasabov 2 1, 2 Department of Information Science, University of Otago P.O Box 56, Dunedin, New Zealand (E-mail:
More informationECLT 5810 Clustering
ECLT 5810 Clustering What is Cluster Analysis? Cluster: a collection of data objects Similar to one another within the same cluster Dissimilar to the objects in other clusters Cluster analysis Grouping
More informationClustering and Visualisation of Data
Clustering and Visualisation of Data Hiroshi Shimodaira January-March 28 Cluster analysis aims to partition a data set into meaningful or useful groups, based on distances between data points. In some
More informationInternational Journal of Advanced Research in Computer Science and Software Engineering
Volume 3, Issue 3, March 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Special Issue:
More informationOBJECT-CENTERED INTERACTIVE MULTI-DIMENSIONAL SCALING: ASK THE EXPERT
OBJECT-CENTERED INTERACTIVE MULTI-DIMENSIONAL SCALING: ASK THE EXPERT Joost Broekens Tim Cocx Walter A. Kosters Leiden Institute of Advanced Computer Science Leiden University, The Netherlands Email: {broekens,
More informationProblems of Fuzzy c-means Clustering and Similar Algorithms with High Dimensional Data Sets
Problems of Fuzzy c-means Clustering and Similar Algorithms with High Dimensional Data Sets Roland Winkler, Frank Klawonn, and Rudolf Kruse Abstract Fuzzy c-means clustering and its derivatives are very
More informationUnsupervised Learning and Clustering
Unsupervised Learning and Clustering Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2009 CS 551, Spring 2009 c 2009, Selim Aksoy (Bilkent University)
More informationImplementation of Fuzzy C-Means and Possibilistic C-Means Clustering Algorithms, Cluster Tendency Analysis and Cluster Validation
Implementation of Fuzzy C-Means and Possibilistic C-Means Clustering Algorithms, Cluster Tendency Analysis and Cluster Validation Md. Abu Bakr Siddiue *, Rezoana Bente Arif #, Mohammad Mahmudur Rahman
More informationKapitel 4: Clustering
Ludwig-Maximilians-Universität München Institut für Informatik Lehr- und Forschungseinheit für Datenbanksysteme Knowledge Discovery in Databases WiSe 2017/18 Kapitel 4: Clustering Vorlesung: Prof. Dr.
More informationUnderstanding Clustering Supervising the unsupervised
Understanding Clustering Supervising the unsupervised Janu Verma IBM T.J. Watson Research Center, New York http://jverma.github.io/ jverma@us.ibm.com @januverma Clustering Grouping together similar data
More informationAn adjustable p-exponential clustering algorithm
An adjustable p-exponential clustering algorithm Valmir Macario 12 and Francisco de A. T. de Carvalho 2 1- Universidade Federal Rural de Pernambuco - Deinfo Rua Dom Manoel de Medeiros, s/n - Campus Dois
More informationRPKM: The Rough Possibilistic K-Modes
RPKM: The Rough Possibilistic K-Modes Asma Ammar 1, Zied Elouedi 1, and Pawan Lingras 2 1 LARODEC, Institut Supérieur de Gestion de Tunis, Université de Tunis 41 Avenue de la Liberté, 2000 Le Bardo, Tunisie
More informationIntroduction to Computer Science
DM534 Introduction to Computer Science Clustering and Feature Spaces Richard Roettger: About Me Computer Science (Technical University of Munich and thesis at the ICSI at the University of California at
More informationNORMALIZATION INDEXING BASED ENHANCED GROUPING K-MEAN ALGORITHM
NORMALIZATION INDEXING BASED ENHANCED GROUPING K-MEAN ALGORITHM Saroj 1, Ms. Kavita2 1 Student of Masters of Technology, 2 Assistant Professor Department of Computer Science and Engineering JCDM college
More informationApplied Fuzzy C-means Clustering to Operation Evaluation for Gastric Cancer Patients
Applied Fuzzy C-means Clustering to Operation Evaluation for Gastric Cancer Patients Hang Zettervall, Elisabeth Rakus-Andersson Department of Mathematics and Science Blekinge Institute of Technology 3779
More informationOne-mode Additive Clustering of Multiway Data
One-mode Additive Clustering of Multiway Data Dirk Depril and Iven Van Mechelen KULeuven Tiensestraat 103 3000 Leuven, Belgium (e-mail: dirk.depril@psy.kuleuven.ac.be iven.vanmechelen@psy.kuleuven.ac.be)
More informationUnsupervised Learning
Unsupervised Learning Unsupervised learning Until now, we have assumed our training samples are labeled by their category membership. Methods that use labeled samples are said to be supervised. However,
More informationCHAPTER 6 MODIFIED FUZZY TECHNIQUES BASED IMAGE SEGMENTATION
CHAPTER 6 MODIFIED FUZZY TECHNIQUES BASED IMAGE SEGMENTATION 6.1 INTRODUCTION Fuzzy logic based computational techniques are becoming increasingly important in the medical image analysis arena. The significant
More informationA novel firing rule for training Kohonen selforganising
A novel firing rule for training Kohonen selforganising maps D. T. Pham & A. B. Chan Manufacturing Engineering Centre, School of Engineering, University of Wales Cardiff, P.O. Box 688, Queen's Buildings,
More informationApplication of fuzzy set theory in image analysis. Nataša Sladoje Centre for Image Analysis
Application of fuzzy set theory in image analysis Nataša Sladoje Centre for Image Analysis Our topics for today Crisp vs fuzzy Fuzzy sets and fuzzy membership functions Fuzzy set operators Approximate
More informationSaudi Journal of Engineering and Technology. DOI: /sjeat ISSN (Print)
DOI:10.21276/sjeat.2016.1.4.6 Saudi Journal of Engineering and Technology Scholars Middle East Publishers Dubai, United Arab Emirates Website: http://scholarsmepub.com/ ISSN 2415-6272 (Print) ISSN 2415-6264
More informationFUZZY C-MEANS ALGORITHM BASED ON PRETREATMENT OF SIMILARITY RELATIONTP
Dynamics of Continuous, Discrete and Impulsive Systems Series B: Applications & Algorithms 14 (2007) 103-111 Copyright c 2007 Watam Press FUZZY C-MEANS ALGORITHM BASED ON PRETREATMENT OF SIMILARITY RELATIONTP
More informationOUTLIER DETECTION FOR DYNAMIC DATA STREAMS USING WEIGHTED K-MEANS
OUTLIER DETECTION FOR DYNAMIC DATA STREAMS USING WEIGHTED K-MEANS DEEVI RADHA RANI Department of CSE, K L University, Vaddeswaram, Guntur, Andhra Pradesh, India. deevi_radharani@rediffmail.com NAVYA DHULIPALA
More informationUnsupervised Learning and Clustering
Unsupervised Learning and Clustering Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2008 CS 551, Spring 2008 c 2008, Selim Aksoy (Bilkent University)
More informationSwarm Based Fuzzy Clustering with Partition Validity
Swarm Based Fuzzy Clustering with Partition Validity Lawrence O. Hall and Parag M. Kanade Computer Science & Engineering Dept University of South Florida, Tampa FL 33620 @csee.usf.edu Abstract
More informationThe Pyramid Fuzzy C-means Algorithm
International International Journal of Journal Computational of Computational Intelligence Intelligence in Control, in 2(2), Control, 2010, pp. 8(2), 65-77 2016 The Pyramid Fuzzy C-means Algorithm Dan
More informationA Bounded Index for Cluster Validity
A Bounded Index for Cluster Validity Sandro Saitta, Benny Raphael, and Ian F.C. Smith Ecole Polytechnique Fédérale de Lausanne (EPFL) Station 18, 1015 Lausanne, Switzerland sandro.saitta@epfl.ch,bdgbr@nus.edu.sg,ian.smith@epfl.ch
More informationCluster Analysis. Angela Montanari and Laura Anderlucci
Cluster Analysis Angela Montanari and Laura Anderlucci 1 Introduction Clustering a set of n objects into k groups is usually moved by the aim of identifying internally homogenous groups according to a
More informationK-Means Clustering With Initial Centroids Based On Difference Operator
K-Means Clustering With Initial Centroids Based On Difference Operator Satish Chaurasiya 1, Dr.Ratish Agrawal 2 M.Tech Student, School of Information and Technology, R.G.P.V, Bhopal, India Assistant Professor,
More informationA Fuzzy Rule Based Clustering
A Fuzzy Rule Based Clustering Sachin Ashok Shinde 1, Asst.Prof.Y.R.Nagargoje 2 Student, Computer Science & Engineering Department, Everest College of Engineering, Aurangabad, India 1 Asst.Prof, Computer
More informationClustering part II 1
Clustering part II 1 Clustering What is Cluster Analysis? Types of Data in Cluster Analysis A Categorization of Major Clustering Methods Partitioning Methods Hierarchical Methods 2 Partitioning Algorithms:
More informationClustering Algorithms In Data Mining
2017 5th International Conference on Computer, Automation and Power Electronics (CAPE 2017) Clustering Algorithms In Data Mining Xiaosong Chen 1, a 1 Deparment of Computer Science, University of Vermont,
More informationECLT 5810 Clustering
ECLT 5810 Clustering What is Cluster Analysis? Cluster: a collection of data objects Similar to one another within the same cluster Dissimilar to the objects in other clusters Cluster analysis Grouping
More informationOn-Lib: An Application and Analysis of Fuzzy-Fast Query Searching and Clustering on Library Database
On-Lib: An Application and Analysis of Fuzzy-Fast Query Searching and Clustering on Library Database Ashritha K.P, Sudheer Shetty 4 th Sem M.Tech, Dept. of CS&E, Sahyadri College of Engineering and Management,
More informationColor based segmentation using clustering techniques
Color based segmentation using clustering techniques 1 Deepali Jain, 2 Shivangi Chaudhary 1 Communication Engineering, 1 Galgotias University, Greater Noida, India Abstract - Segmentation of an image defines
More information8. Clustering: Pattern Classification by Distance Functions
CEE 6: Digital Image Processing Topic : Clustering/Unsupervised Classification - W. Philpot, Cornell University, January 0. Clustering: Pattern Classification by Distance Functions The premise in clustering
More informationApplication of genetic algorithms and Kohonen networks to cluster analysis
Application of genetic algorithms and Kohonen networks to cluster analysis Marian B. Gorza lczany and Filip Rudziński Department of Electrical and Computer Engineering Kielce University of Technology Al.
More informationFuzzy Systems. Fuzzy Clustering 2
Fuzzy Systems Fuzzy Clustering 2 Prof. Dr. Rudolf Kruse Christoph Doell {kruse,doell}@iws.cs.uni-magdeburg.de Otto-von-Guericke University of Magdeburg Faculty of Computer Science Department of Knowledge
More informationA Graph Based Approach for Clustering Ensemble of Fuzzy Partitions
Journal of mathematics and computer Science 6 (2013) 154-165 A Graph Based Approach for Clustering Ensemble of Fuzzy Partitions Mohammad Ahmadzadeh Mazandaran University of Science and Technology m.ahmadzadeh@ustmb.ac.ir
More informationK-means algorithm and its application for clustering companies listed in Zhejiang province
Data Mining VII: Data, Text and Web Mining and their Business Applications 35 K-means algorithm and its application for clustering companies listed in Zhejiang province Y. Qian School of Finance, Zhejiang
More information