A Fuzzy Clustering and Fuzzy Merging Algorithm


Carl G. Looney
Computer Science Department/171, University of Nevada, Reno, Reno, NV

Abstract. Some major problems in clustering are: i) finding the optimal number K of clusters; ii) assessing the validity of a given clustering; iii) permitting the classes to form natural shapes rather than forcing them into normed balls of the distance function; iv) preventing the order in which the feature vectors are read from affecting the clustering; and v) preventing the order of merging from affecting the clustering. The k-means algorithm is the most efficient and easiest to implement and has known convergence, but it suffers from all of the above deficiencies. We employ a relatively large number K of uniformly randomly distributed initial prototypes and then thin them by deleting any prototype that is too close to another, in a manner that leaves fewer uniformly distributed prototypes. We then employ the k-means algorithm, eliminate empty and very small clusters, and iterate a process of computing a new type of fuzzy prototype and reassigning the feature vectors until the prototypes become fixed. At that point we merge the K small clusters into a smaller number of larger classes. The algorithm is tested on both simple and difficult data sets. We modify the Xie-Beni validity measure to determine the goodness of the clustering for multiple values of K.

Keywords. fuzzy clustering, fuzzy merging, classification, pattern recognition

1. Introduction

The classification of a set of entities by a learning system is a powerful tool for acquiring knowledge from observations stored in data files. Given a set of feature vectors, a self-organizing process clusters them into classes with similar feature values, either by a top-down (divisive) or a bottom-up (agglomerative) process. Classes can also be learned under a supervised training process where each input vector is labeled and the process parameters are adjusted until the process output matches the codeword for the correct label. Nearest neighbor methods use a labeled reference set of exemplar vectors and put any novel input vector into the cluster to which a majority of its nearest labeled neighbors belong. Artificial neural networks [14] also learn this way. Here we investigate an agglomerative self-organizing (clustering) approach.

The major problems in clustering are: i) how do we find the number K of clusters for a given set of vectors {x^(q): q = 1,...,Q} when K is unknown? ii) how do we assess the validity of a given clustering of a data set into K clusters? iii) how do we permit clusters to take their own natural shapes rather than forcing them into the shapes of normed unit balls? iv) how do we assure that the clustering is independent of the order in which the feature vectors are input? and v) how do we assure independence of the order in which clusters are merged?

A ball of uniformly distributed vectors has no cluster structure [22]. But if a set of vectors is partitioned into multiple subsets that are compactly distributed about their centers and the centers are relatively far apart, then there is strong cluster structure. We say that such clusters are compact and well-separated. These are the criteria for well clustered data sets when the clusters are contained in nonintersecting normed balls. Not all data sets have strong cluster structure.

Section 2 reviews some important clustering algorithms, although none of them solves the major problems given above.
Section 3 develops our new type of fuzzy clustering algorithm, while Section 4 extends this agglomerative process by merging an excess of small clusters into fewer larger clusters.

Together, the fuzzy clustering and fuzzy merging overcome to a significant extent the problems listed above. Section 5 provides the results of computer runs on various data sets. Section 6 gives some conclusions.

2. Clustering Algorithms

In what follows, {x^(q): q = 1,...,Q} is a set of Q feature vectors that is to be partitioned (clustered) into K clusters {C_k: k = 1,...,K}. For each kth cluster C_k its prototype, or center, is designated by c^(k). The dimension of the feature vectors x^(q) = (x_1^(q),...,x_N^(q)) and of the centers c^(k) = (c_1^(k),...,c_N^(k)) is N. The feature vectors are standardized independently in each component so that the vectors belong to the N-cube [0,1]^N.

2.1 The k-means Algorithm (Forgy [10], 1965 and MacQueen [15], 1967). This is the most heavily used clustering algorithm because it is used as an initial process for many other algorithms [18]. The number K of clusters is input first and K centers are initialized by taking K feature vectors as seeds via c^(k) = x^(k) for k = 1,...,K. The main loop assigns each x^(q) to the C_k with the closest prototype (minimum distance assignment); averages the members of each C_k to obtain a new prototype c^(k); reassigns all x^(q) to clusters by minimum distance assignment; and stops if no prototypes have changed, else it repeats the loop. This is the Forgy version. MacQueen's version recomputes a new prototype every time a feature vector is assigned to a cluster. Selim and Ismail [20] show that convergence is guaranteed to a local minimum of the total sum-squared error SSE. Letting clust[q] = k designate that x^(q) is in C_k, we can write

   SSE = Σ(k=1,K) Σ{q: clust[q] = k} ||x^(q) - c^(k)||²                                        (1)

The disadvantages are: i) K must be given (it is often unknown); ii) the clusters formed can be different for different input orders of the feature vectors; iii) no improper clusters are merged or broken up; iv) the cluster averages are not always the most representative vectors for the clusters; and v) the cluster boundaries have the shape of the normed unit ball (usually the Euclidean or Mahalanobis norm, see [14]), which may be unnatural. Peña et al. [18] showed in 1999 that random drawing of the seed (initial) centers is more independent of the ordering than is the Forgy-MacQueen selection method. While the Kaufman method is slightly better than random selection (see [18]), it is computationally much more expensive. Snarey et al. [21] proposed a maximin method for selecting a subset of the original feature vectors as seeds.

2.2 The ISODATA. The ISODATA method of Ball and Hall [2] in 1967 employs rules with parameters for clustering, merging, splitting, creating new clusters and eliminating old ones (Fu [11] abstracted this process into a generalized algorithm in 1982). K is ultimately determined, starting from some user-given initial value K_0. A cluster is broken into smaller ones if the maximal spread in a single dimension is too great. Some disadvantages are: i) multiple parameters must be given by the user, although they are not known a priori; ii) a considerable amount of experimentation may be required to get reasonable values; iii) the clusters are ball shaped as determined by the distance function; iv) the value determined for K depends on the parameters given by the user and is not necessarily the best value; and v) a cluster average is often not the best prototype for a cluster.
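
For concreteness, here is a minimal sketch of the Forgy k-means loop of Section 2.1 together with the SSE of Equation (1). It is illustrative only: the random seeding, the fixed iteration cap and the function names are simplifications introduced here, not the implementation used later in this paper.

    import numpy as np

    def kmeans_forgy(X, K, iters=100, seed=0):
        """Minimal Forgy k-means. X is a (Q, N) array of feature vectors."""
        rng = np.random.default_rng(seed)
        c = X[rng.choice(len(X), size=K, replace=False)]        # seed the K prototypes with feature vectors
        for _ in range(iters):
            d = np.linalg.norm(X[:, None, :] - c[None, :, :], axis=2)   # Q x K distances
            clust = d.argmin(axis=1)                                     # minimum distance assignment
            new_c = np.array([X[clust == k].mean(axis=0) if np.any(clust == k) else c[k]
                              for k in range(K)])                        # average the members of each cluster
            if np.allclose(new_c, c):                                    # stop when no prototype changes
                break
            c = new_c
        clust = np.linalg.norm(X[:, None, :] - c[None, :, :], axis=2).argmin(axis=1)
        sse = sum(np.sum((X[clust == k] - c[k]) ** 2) for k in range(K)) # Equation (1)
        return c, clust, sse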

2.3 Other Basic Methods. Graphical methods (e.g., see [14]) connect any pair of feature vectors if the distance between them is less than a given threshold, which can result in long dendritic clusters. A variation of this constructs a minimal spanning tree and then removes the longest branches to leave connected clusters. Such methods are generally inefficient relative to k-means types of algorithms, and different thresholds can yield very different clusterings. Syntactic methods (e.g., see [14]) hold promise but are not widely applicable.
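
The threshold-graph idea of Section 2.3 can be sketched as follows: connect two vectors whenever their distance is below a threshold t and take connected components as clusters. This is a hedged illustration (the threshold t and the depth-first search are choices made here for brevity), not a recommended implementation.

    import numpy as np

    def threshold_graph_clusters(X, t):
        """Connect vectors closer than t and return connected-component labels."""
        Q = len(X)
        d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
        adj = d < t                               # an edge wherever the distance is below the threshold
        labels = -np.ones(Q, dtype=int)
        current = 0
        for start in range(Q):
            if labels[start] >= 0:
                continue
            labels[start] = current
            stack = [start]                       # depth-first search over the threshold graph
            while stack:
                i = stack.pop()
                for j in np.where(adj[i] & (labels < 0))[0]:
                    labels[j] = current
                    stack.append(j)
            current += 1
        return labels                             # note: long dendritic clusters are possible

Different thresholds t can yield very different clusterings, which is the weakness noted above.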

2.4 Fuzzy Clustering Algorithms. The fuzzy c-means algorithm of Bezdek [3,5] in 1973 (also called the fuzzy ISODATA) uses fuzzy weighting with positive weights {w_qk} to determine the prototypes {c^(k)} for the K clusters, where K must be given. The weights minimize the constrained functional J, where

   J(w_qk, c^(k)) = Σ(q=1,Q) Σ(k=1,K) (w_qk)^p ||x^(q) - c^(k)||²                              (2)

   Σ(k=1,K) w_qk = 1 for each q,   0 < Σ(q=1,Q) w_qk < Q for each k                            (3)

Upon employing Lagrange multipliers {λ_q} in Equations (2) and (3), the resulting J of Equation (4) is minimized by setting its partial derivatives equal to zero (necessary conditions) for p > 1.

   J(w_qk, c^(k)) = Σ(q=1,Q) Σ(k=1,K) (w_qk)^p ||x^(q) - c^(k)||² + Σ(q=1,Q) λ_q (Σ(k=1,K) w_qk - 1)     (4)

   w_qk = (1/||x^(q) - c^(k)||²)^(1/(p-1)) / Σ(r=1,K) (1/||x^(q) - c^(r)||²)^(1/(p-1)),  k = 1,...,K,  q = 1,...,Q     (5)

   c^(k) = Σ(q=1,Q) W_qk x^(q), k = 1,...,K;   W_qk = w_qk / Σ(r=1,Q) w_rk, q = 1,...,Q        (6a,b)

As p approaches unity, the close neighbors are weighted much more heavily, but as p increases beyond 2, the weightings tend to become more uniform. The fuzzy c-means algorithm allows each feature vector to belong to multiple clusters with various fuzzy membership values. The algorithm is: i) input K (2 ≤ K ≤ Q) and draw initial fuzzy weights {w_qk} at random in [0,1]; ii) standardize the initial weights for each qth feature vector over all K clusters via

   w_qk = w_qk / Σ(r=1,K) w_qr                                                                 (7)

iii) compute J_old from Equation (2); iv) standardize the weights over k = 1,...,K for each q via Equation (6b) to obtain W_qk; v) use Equation (6a) to compute new prototypes c^(k), k = 1,...,K; vi) update the weights {w_qk} from Equation (5); vii) compute J_new from Equation (2); and viii) if |J_new - J_old| < ε, then stop, else set J_old = J_new and go to Step iv above. Convergence to a fuzzy weight set that minimizes the functional is not assured in general for the fuzzy clustering algorithms, due to local minima and saddle points of Equation (4).

A fuzzy version of the ISODATA algorithm was developed by Dunn [9]. Keller et al. [12] published a fuzzy k-nearest neighbor algorithm in 1985 (see [7], 1968, for nearest neighbor rules). The crisp (nonfuzzy) algorithm uses a set S = {x^(q): q = 1,...,Q} of labeled feature vectors and assigns any novel input vector x to a class based upon the class to which a majority of its k nearest neighbors belong. The fuzzification of the process utilizes a fuzzy weight w_qk for each labeled qth feature vector and each kth class such that for each q the weights sum to unity over the K classes.
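
One iteration of the fuzzy c-means updates in Equations (5) and (6a,b) can be sketched as follows; the function name, the small eps guard against division by zero, and the absence of a convergence loop are simplifications made for illustration only.

    import numpy as np

    def fcm_step(X, c, p=2.0, eps=1e-12):
        """One fuzzy c-means update. X is (Q, N), c is (K, N), p > 1 is the fuzzifier."""
        d2 = np.sum((X[:, None, :] - c[None, :, :]) ** 2, axis=2) + eps   # ||x^(q) - c^(k)||^2
        inv = (1.0 / d2) ** (1.0 / (p - 1.0))
        w = inv / inv.sum(axis=1, keepdims=True)      # Equation (5): memberships sum to 1 over k
        W = w / w.sum(axis=0, keepdims=True)          # Equation (6b): standardize over q
        c_new = W.T @ X                               # Equation (6a): new prototypes
        J = np.sum((w ** p) * d2)                     # Equation (2): objective value
        return w, c_new, J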

The gradient based fuzzy c-means algorithm (1994) of Park and Dagher [17] minimizes the partial functionals and constraints given below (also see [14, p. 253]). A single feature vector x^(q) is input at a time and any nearby c^(k) are moved toward x^(q) by a small amount controlled by the learning rate η and the fuzzy weight w_qk, unlike the fuzzy c-means algorithm.

   J^(q)(w_qk, c^(k)) = Σ(k=1,K) w_qk ||x^(q) - c^(k)||², q = 1,...,Q;   Σ(k=1,K) w_qk = 1, for each q     (8a,b)

   ∂J^(q)(w_qk, c^(k))/∂w_qk = 0;   w_qk = {1/||x^(q) - c^(k)||²} / Σ(r=1,K) {1/||x^(q) - c^(r)||²}        (9a,b)

   c_n^(k) <- c_n^(k) - η{∂J^(q)/∂c_n^(k)} = c_n^(k) + η(w_qk)² [x_n^(q) - c_n^(k)], n = 1,...,N           (10)

The unsupervised fuzzy competitive learning (UFCL) algorithm (Chung and Lee [8], 1994, or see [14]) extends the unsupervised competitive learning (UCL) algorithm. The constrained functional to be minimized and the resulting optimal weights and component adjustments are

   J(w_qk, c^(k)) = Σ(q=1,Q) Σ(k=1,K) (w_qk)^p ||x^(q) - c^(k)||², 0 < w_qk < 1;   Σ(k=1,K) w_qk = 1 (for each q)     (11a,b)

   w_qk = {1/||x^(q) - c^(k)||²}^(1/(p-1)) / Σ(r=1,K) {1/||x^(q) - c^(r)||²}^(1/(p-1))                     (12)

   c_n^(k) <- c_n^(k) + η(w_qk)^p [x_n^(q) - c_n^(k)]   (n = 1,...,N; k = 1,...,K)                         (13)

A set of prototypes is selected and each feature vector is input, one at a time, and the nearest (winning) prototype is moved a small step toward it according to Equation (13). After all feature vectors have been put through this training, the process is repeated until the prototypes become stable (η decreases over the iterations).

The mountain method of Yager and Filev [25] (1994) sums Gaussian functions centered on the feature vectors to build a combined mountain function over the more densely located points. The peaks are the cluster centers. While clever, this method is very computationally expensive. Chiu's method [6] in 1994 reduces the computation somewhat. The mountain function is actually a fuzzy set membership function.

2.5 Clustering Validity. A clustering validity measure provides a goodness-of-clustering value. Some general measures for fuzzy clusters [16] are the partition coefficient (PC), the classification entropy (CE) (invented by Bezdek [4]), and the proportional exponent (PE). The uniform distribution functional (UDF) [22] is an improved measure proposed by Windham. These are not as independent of K as is the Xie-Beni [24] clustering validity measure, which combines compactness and separation measures. The compactness-to-separation ratio is defined by

   ξ = {(1/K) Σ(k=1,K) σ_k²} / {D_min}²;   σ_k² = Σ(q=1,Q) w_qk ||x^(q) - c^(k)||², k = 1,...,K            (14a,b)

where D_min is the minimum distance between the prototypes (cluster centers). A larger D_min value lowers the ratio. Each σ_k² is a fuzzy weighted mean-square error for the kth cluster, which is smaller for more compact clusters. Thus a lower value of ξ means more compactness and greater separation. We modify the Xie-Beni fuzzy validity measure by summing over only the members of each cluster rather than over all Q exemplars for each cluster. We also take the reciprocal κ = 1/ξ, so that a larger value of κ indicates a better clustering, and we call κ the modified Xie-Beni clustering validity measure.
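
A small sketch of the modified Xie-Beni measure κ = 1/ξ follows, with each cluster's fuzzy weighted squared error summed over its own members only; the array layout and names are assumptions made for illustration, not the code used for the experiments below.

    import numpy as np

    def modified_xie_beni(X, c, clust, w):
        """X: (Q, N) vectors, c: (K, N) prototypes, clust: (Q,) labels, w: (Q, K) fuzzy weights."""
        K = len(c)
        sigma2 = np.zeros(K)
        for k in range(K):
            members = (clust == k)                     # sum only over the members of cluster k
            diff2 = np.sum((X[members] - c[k]) ** 2, axis=1)
            sigma2[k] = np.sum(w[members, k] * diff2)  # Equation (14b), restricted to C_k
        d = np.linalg.norm(c[:, None, :] - c[None, :, :], axis=2)
        d_min = d[np.triu_indices(K, 1)].min()         # minimum distance between prototypes
        xi = sigma2.mean() / (d_min ** 2)              # Equation (14a): compactness-to-separation ratio
        return 1.0 / xi                                # reciprocal: larger means a better clustering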

3. A New Fuzzy Clustering Algorithm

The weighted fuzzy expected value (WFEV) of a set of real values x_1,...,x_P was defined by Schneider and Craig [19] to be a prototypical value that is more representative of the set than either the average or the median. It uses a two-sided decaying exponential, but we use the bell-shaped Gaussian function instead in Equation (16) below because it is a canonical fuzzy set membership function for the linguistic variable CLOSE_TO_CENTER. After computing the arithmetic mean μ^(0) of the set of real values x_1,...,x_P as the initial center value, we employ Picard iterations to obtain the new values

   μ^(r+1) = Σ(p=1,P) α_p^(r) x_p,    (σ²)^(r+1) = Σ(p=1,P) α_p^(r) (x_p - μ^(r))²             (15a,b)

   α_p^(r) = exp[-(x_p - μ^(r))²/(2(σ²)^(r))] / Σ(m=1,P) exp[-(x_m - μ^(r))²/(2(σ²)^(r))]      (16)

The value μ = μ^(∞) to which this process converges is our modified weighted fuzzy expected value (MWFEV). Usually 5 to 8 iterations are sufficient for 4 or more digits of accuracy. For vectors the MWFEV is computed componentwise. The σ² in Equations (15b) and (16) is the weighted fuzzy variance (WFV). It also changes over the iterations, and an initial value for the spread parameter σ can be set at about 1/4 of the average distance between the cluster centers. An open question currently is that of guaranteed convergence of the MWFEV and WFV. Our preliminary investigation shows empirically that the WFV can oscillate with increasing iterations, but that a subsequence converges (this is also true of the fuzzy c-means weights). Therefore we substitute a fixed nonfuzzy mean-square error for the WFV in our algorithm.

Table 1. Number of Initial Cluster Centers.

   N = Dimension of Inputs   Q = Number of Feature Vectors   K = max{6N + 12 log2(Q), Q}
   2, 2, 2                   16, 32, 64                      60, 72, 84
   4, 4, 4                   32, 64, 128                     84, 96, 108
   8, 8, 8                   32, 64, 128                     108, 120, 132
   16, 16, 16                64, 128, 256                    168, 180, 192
   32, 32, 32                128, 256, 512                   276, 288, 512

The MWFEV method weights outliers less but weights the central and more densely distributed vectors more heavily. It provides a more typical vector to represent a cluster. Figure 1 shows an example of the MWFEV versus the mean and the median for a simple 2-dimensional example. The 5 vectors are (1,2), (2,2), (1,3), (2,3) and (5,1). The outlier (5,1) influences the mean (2.2,2.2) and the median (2,2) too strongly. The MWFEV vector (1.503,2.282), however, is influenced only slightly by the outlier in the y-direction and even less in the x-direction.

We first standardize the Q sample feature vectors by affinely mapping all Q values of the same component into the interval [0,1] (we do this independently for each nth component over n = 1,...,N). The standardized vectors are thus in the unit N-cube [0,1]^N. We circumvent the problem that the order of the input vectors can affect the clustering by using a large number K of uniformly distributed initial prototypes to spread out the initial cluster centers. To cluster the sample {x^(q): q = 1,...,Q} we use a relatively large initial number K of uniformly randomly drawn prototypes, thin out the prototypes, assign the vectors to clusters via k-means, delete empty and very small clusters, and then do our special type of fuzzy clustering followed by fuzzy merging. The initial K is derived empirically as shown in Table 1 by K = max{6N + 12 log2(Q), Q}. For example, for N = 32 and Q = 256 the computed K would be 288.
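
A minimal sketch of the Picard iteration in Equations (15a,b) and (16) for one component is given below; starting the spread from the sample variance rather than from the inter-center distances is a simplification made here only for illustration. Applied componentwise to the five vectors of Figure 1, it converges to a center near (1.50, 2.28).

    import numpy as np

    def mwfev(x, iters=8):
        """Modified weighted fuzzy expected value of a 1-D sample (Equations 15-16)."""
        x = np.asarray(x, dtype=float)
        mu = x.mean()                             # start from the arithmetic mean
        var = max(x.var(), 1e-12)                 # illustrative initial spread
        for _ in range(iters):                    # 5 to 8 Picard iterations usually suffice
            a = np.exp(-(x - mu) ** 2 / (2.0 * var))
            a /= a.sum()                          # Gaussian fuzzy weights, Equation (16)
            mu, var = np.sum(a * x), max(np.sum(a * (x - mu) ** 2), 1e-12)   # Equations (15a,b)
        return mu

    pts = np.array([[1, 2], [2, 2], [1, 3], [2, 3], [5, 1]], dtype=float)
    center = [mwfev(pts[:, n]) for n in range(pts.shape[1])]   # componentwise MWFEV, about (1.50, 2.28)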

In the main loop we compute the fuzzy weights and the MWFEV (componentwise) for each kth cluster to obtain the prototype c^(k). We also compute the mean-squared error σ_k² for each kth cluster. Then we again assign all of the feature vectors to clusters by minimum distance assignment. This process is repeated until the fuzzy centers do not change. The fuzzy clustering algorithm follows.

A Fuzzy Clustering Algorithm

Step 1: I = 0; input I_0;                   /* Set the current and maximum iteration numbers. */
        input N and Q;                      /* N = no. of components; Q = no. of feature vectors. */
        compute K;                          /* K = max{6N + 12 log2(Q), Q}. */
        for q = 1 to Q do input x^(q);      /* Read in the Q feature vectors to be clustered. */
        for k = 1 to K do                   /* Generate uniform random prototypes in [0,1]^N. */
            c^(k) = random-vector();
        for k = 1 to K do count[k] = 0;     /* Initialize the number of vectors in each cluster. */
        δ = 1.0/K^(1/N);                    /* Threshold for prototypes being too close. */

Step 2: repeat
            dmin = infinity;                /* Smallest distance found so far between centers. */
            for k = 1 to K-1 do             /* Check all distances between prototypes */
                for kk = k+1 to K do        /* to obtain the minimum such distance. */
                    if (||c^(k) - c^(kk)|| < dmin) then
                        dmin = ||c^(k) - c^(kk)||; kmin = kk;   /* Index for possible deletion. */
            if (dmin < δ) then              /* If the deletion criterion is met, then */
                for k = kmin to K-1 do      /* remove the prototype that is too close to another */
                    c^(k) = c^(k+1);        /* by moving all higher prototypes down one place. */
                K = K - 1;                  /* Decrement K upon each removal. */
                again = true;               /* Repeat the process if a prototype was too close. */
            else again = false;             /* Do not repeat if the smallest distance is large enough. */
        until (again = false);

Step 3: for i = 1 to 20 do                  /* Do 20 k-means iterations. */
            for q = 1 to Q do               /* Assign each x^(q) to the nearest prototype c^(k). */
                k* = nearest(q);            /* c^(k*) is the nearest prototype to x^(q). */
                clust[q] = k*;              /* Assign vector x^(q) to cluster k*. */
                count[k*] = count[k*] + 1;  /* Update the count of cluster k*. */
            for k = 1 to K do               /* Average each cluster to get its new prototype. */
                c^(k) = (0,...,0);          /* Zero out the prototype (center) for averaging. */
                for q = 1 to Q do           /* For each feature vector, check whether it belongs */
                    if (clust[q] = k) then  /* to cluster k, and if so add it to the sum */
                        c^(k) = c^(k) + x^(q);   /* (componentwise) for averaging. */
                c^(k) = c^(k)/count[k];

Step 4: eliminate(p);                       /* Eliminate clusters of count <= p (user-given small p). */

Step 5: for k = 1 to K do                   /* Begin fuzzy clustering. */
            fuzzyweights();                 /* Use Eqn. (16) to compute the fuzzy weights in the kth cluster. */
            σ_k² = variance();              /* Get the variance (mean-square error) of each cluster. */
            c^(k) = fuzaverage();           /* Fuzzy prototypes (Eqn. (15a), componentwise MWFEV). */
        for k = 1 to K do count[k] = 0;     /* Zero out the counts for reassignment. */

Step 6: for q = 1 to Q do                   /* Assign each x^(q) to a C_k via minimum distance. */
            k* = nearest(q);                /* Assign x^(q) to the nearest c^(k*). */
            clust[q] = k*;                  /* Record this with the index clust[q] = k*. */
            count[k*] = count[k*] + 1;      /* Increment the count for the assigned vector. */
        I = I + 1;                          /* Increment the iteration number. */

Step 7: if (I > I_0) then                   /* Check the stopping criterion. */
            compute κ;                      /* Compute the modified Xie-Beni validity measure. */
            merge();                        /* Call the function to merge and eliminate empty results. */
            stop;                           /* Stop if no prototypes change (stop_criterion = TRUE) */
        else go to Step 5;                  /* or if the number of iterations is exceeded; else repeat the loop. */

Elimination Algorithm (eliminate(p))

        for k = 1 to K-1 do                       /* For each cluster k, eliminate it if its */
            if (count[k] <= p) then               /* count <= p (empty clusters also). */
                for i = k to K-1 do               /* Move all count indices down by 1. */
                    count[i] = count[i+1];
                    for q = 1 to Q do             /* Move the assignment indices down by 1 */
                        if (clust[q] = i+1) then  /* for any vector in the (i+1)th cluster. */
                            clust[q] = i;
                K = K - 1;                        /* Decrement the number of clusters on elimination. */
        if (count[K] <= p) then K = K - 1;        /* Check whether to eliminate the last cluster center. */

4. Fuzzy Merging of Clusters

We first find the R = K(K-1)/2 distances {d(r): r = 1,...,R} between unique pairs of prototypes. For each such distance, the indices k1(r) and k2(r) record the two cluster indices for which the distance d(r) was computed, ordered so that k1(r) < k2(r). Each rth prototypical pair k1(r), k2(r) is merged only if the following condition is met: d(r) < τD, where D is the MWFEV of all of the distances {d(r): r = 1,...,R} and τ satisfies 0.0 < τ < 1.0 (τ = 0.5 is a good first trial value). An optional criterion is: the ball of radius (σ_k1(r) + σ_k2(r))/2 centered on the midpoint vector y = (1/2)[c^(k1(r)) + c^(k2(r))] between the cluster centers contains at least 25% of the vectors in one cluster and at least 15% of the other (these default percentage values can be changed).
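
The distance-based merge test just described can be sketched as follows: find the closest pair of prototypes and merge the higher-indexed cluster into the lower-indexed one only if their distance is below τ times the MWFEV D of all pairwise prototype distances. The optional ball criterion is omitted, and the inlined Picard iteration and the names are assumptions for illustration; this is a sketch, not the merge() routine listed next.

    import numpy as np
    from itertools import combinations

    def merge_closest_pair(c, clust, tau=0.5, iters=8):
        """c: (K, N) prototypes, clust: (Q,) assignments. One pass of the merge test."""
        pairs = list(combinations(range(len(c)), 2))                  # the R = K(K-1)/2 unique pairs
        d = np.array([np.linalg.norm(c[i] - c[j]) for i, j in pairs])
        D, var = d.mean(), max(d.var(), 1e-12)                        # MWFEV of the pairwise distances
        for _ in range(iters):                                        # (Picard iteration of Section 3)
            a = np.exp(-(d - D) ** 2 / (2.0 * var))
            a /= a.sum()
            D, var = float(np.sum(a * d)), max(float(np.sum(a * (d - D) ** 2)), 1e-12)
        r_star = int(d.argmin())                                      # pair with the least distance
        k1, k2 = pairs[r_star]                                        # k1 < k2 by construction
        if d[r_star] < tau * D:                                       # merge criterion d(r*) < tau * D
            return np.where(clust == k2, k1, clust)                   # reassign C_k2 into C_k1
        return None                                                   # no merge: stop merging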

Algorithm to Merge Clusters (merge())

Step 1: r = 1;                                     /* For every one of the K(K-1)/2 unique */
        for k = 1 to K-1 do                        /* pairs of prototypes, get the distance */
            for kk = k+1 to K do                   /* between them and record the indices of */
                d[r] = ||c^(k) - c^(kk)||;         /* the prototypical pair (k1(r), k2(r)). */
                k1[r] = k; k2[r] = kk;
                r = r + 1;                         /* Increment r (rth pair), 1 <= r <= R = K(K-1)/2. */
        for k = 1 to K do Merge[k] = 0;            /* Zero out the merge indices. */

Step 2: D = MWFEV({d[r]});                         /* Find the MWFEV of the d(r) values. */

Step 3: r* = 1;
        for r = 2 to K(K-1)/2 do                   /* Over all inter-prototype distances, */
            if (d[r] < d[r*]) then r* = r;         /* find the pair with the least distance for the merge test. */

[Step 4 (optional): K1 = 0; K2 = 0;                /* Zero out counters 1 and 2. */
        y = (1/2)[c^(k1[r*]) + c^(k2[r*])];        /* Compute the midway vector between the prototypes. */
        ρ = σ_k1[r*] + σ_k2[r*];                   /* Compute the radius of the ball centered at y. */
        for q = 1 to Q do                          /* Find the feature vectors in the overlapping ball. */
            if (||x^(q) - y|| < ρ) then            /* x^(q) must be close to y and also be in either */
                if (clust[q] = k1[r*]) then K1 = K1 + 1;   /* cluster k1(r*) or cluster k2(r*). */
                if (clust[q] = k2[r*]) then K2 = K2 + 1;
        P1 = K1/count[k1[r*]];                     /* Proportion of C_k1(r*) points in the ball, */
        P2 = K2/count[k2[r*]];                     /* proportion of C_k2(r*) points in the ball (optional). */]

Step 5: if (d[r*] < τD)                            /* Test the merging criterion (try τ = 0.5 first) */
            [AND (P1 >= 0.2 AND P2 >= 0.2)] then   /* (optional criteria for merging). */
            for q = 1 to Q do                      /* Check each qth feature vector: is it in C_k2(r*)? */
                if (clust[q] = k2[r*]) then        /* If so, then reassign it to C_k1(r*) (class k1(r*)). */
                    clust[q] = k1[r*];             /* k2(r) > k1(r) is always true for the same r. */
                    count[k1[r*]] = count[k1[r*]] + 1;   /* Increment the count of class k1(r*). */
                    count[k2[r*]] = count[k2[r*]] - 1;   /* Decrement the count of class k2(r*). */
            Merge[k2[r*]] = k1[r*];                /* Record the merger. */
            z^(k2[r*]) = c^(k2[r*]);               /* Save the old prototype. */
            stop_criterion = false;
        else stop_criterion = true;

Step 6: eliminate(0);                              /* Eliminate empty clusters (p = 0). */
        if (stop_criterion) then stop;             /* Stop if no merging occurred on this pass, */
        else go to Step 1;                         /* else repeat the merge test. */

This algorithm merges successfully without the optional criteria in Steps 4 and 5. When cluster k2(r) is merged with cluster k1(r) we record that as Merge(k2(r)) = k1(r). It is always true that k1(r) < k2(r). A variation is to save the prototype of the merged cluster that was eliminated (see Step 5). If new feature vectors are to be classified, they can be checked for closeness to each current prototype c^(k), but they can also be checked against the saved prototypes of the form z^(s). If a feature vector is closest to a saved prototype z^(s), then the index Merge(s) yields the next cluster to which it belongs. This means that the index Merge(Merge(s)) is to be checked, and so forth. This allows a string of small clusters to be merged to form a longer one, provided that the stringent merge criteria are met, with a final shape that can be different from that of a normed ball.

5. Computer Results

5.1 Tests on Data. To show the effects of ordering on the usual (Forgy [10]) k-means algorithm we used the data file testfz1a.dta shown in Figure 2 and the file testfz1b.dta formed from it by exchanging feature vector number 1 with feature vector number 8 in the ordering. With K = 5 the different results are shown in Figures 3 (testfz1a.dta) and 4 (testfz1b.dta). Other changes in the ordering caused yet other clusterings. The clusterings shown are not intuitive. Our merging algorithm applied to the clusters in both Figures 3 and 4 with τ = 0.5 yielded the identical results shown in Figure 5, which agrees more with our intuition of the classes. Figures 6 and 7 show the respective results of our new fuzzy clustering and fuzzy merging algorithms on the file testfz1a.dta. The results were the same for testfz1b.dta.

The modified Xie-Beni validity values were also identical for the two files. This, and other tests, showed that the strategy of drawing a large number of uniformly distributed prototypes and then thinning them out to achieve a smaller number of more sparsely uniformly distributed prototypes prevented the order of the inputs from affecting the final clustering in these cases. The final merged clusters with K = 3 were also intuitive. The initial K was 59, which was reduced to 8 by eliminating empty clusters and then to 5 (very intuitive, as seen in Figure 6) by elimination with p = 1. K was reduced to 3 by merging with τ = 0.5. Table 2 summarizes the results.

Table 2. Clustering/Eliminating/Merging via New K-means Algorithm.

   Data File      No. Clusters K   Clusters                  Validity   Comments
   testfz1a.dta   5                {3, 8, 14, 15},                      Initial K = 59, deleted to K = 8,
                                   {5, 7, 13}, {2, 9, 11},              eliminated with p = 1 to K = 5
                                   {1, 4, 6, 10}, {12}                  (no merging).
   testfz1a.dta   3                {2, 5, 7, 9, 11, 13},                Initial K = 59, deleted to K = 8,
                                   {3, 8, 12, 14, 15},                  eliminated with p = 1 to K = 5,
                                   {1, 4, 6, 10}                        then merged to K = 3 (τ = 0.5).

The second data set is Anderson's [1] well-known set of 150 iris feature vectors that are known to be noisy and nonseparable [12]. The data are labeled as K = 3 classes that represent 3 subspecies (Setosa, Versicolor and Virginica) of the iris species. The given sample contains 50 labeled feature vectors from each class for a total of Q = 150. The feature vectors have N = 4 features: i) sepal length; ii) sepal width; iii) petal length; and iv) petal width. Table 3 presents the results of our complete fuzzy clustering and fuzzy merging algorithm. Our modified Xie-Beni clustering validity measure shows that K = 2 classes are best, which coincides with the PC and CE validity values found by Lin and Lee [13]. Lin and Lee also used the PE, which gave the best value for K = 3 rather than K = 2. This yields 3 votes to 1 that K = 2 is the best number of clusters for the iris data set. We note that in each clustering of Table 3 the Setosa class contained the correct number of 50 feature vectors and the same weighted fuzzy variance. The second class therefore has two subclasses that are not well separated.

Initially, our value for K was 150, but after deletions (due to close prototypes) it was reduced to 57. K was further reduced to 16, 9 and then 7 by eliminations with p = 2, 6 and 10, respectively. After 40 fuzzy clustering iterations, fuzzy merging with τ = 0.5 reduced K to 4, as shown in Table 3. This was followed by further fuzzy merging with τ = 0.66 to yield K = 3 clusters, and then with τ = 0.8 to yield K = 2. The modified Xie-Beni validities are also shown in Table 3.

Table 3. Fuzzy Clustering/Merging of the Standardized Iris Data.

   Data File      No. Clusters K   Validity   Cluster Sizes     WFV σ_k
   iris150.dta    4                           50, 40, 29, 31    ..., 0.174, 0.182, ...
   iris150.dta    3                           50, 51, 49        ..., 0.248, ...
   iris150.dta    2                           50, 100           ...

Our third data set is taken from Wolberg and Mangasarian [23] at the University of Wisconsin Medical School. We randomly selected 200 of the more than 500 feature vectors. As usual, we standardized each of the 30 features separately to be in [0,1]. The vectors are labeled for two classes that we think are benign and malignant. One label is attached to 121 vectors while the other is attached to 79 vectors. Our radial basis functional link net (neural network) learned the two groups of 121 and 79 vectors correctly. However, the vectors do not naturally fall precisely into these groups because of noise and/or mislabeling (a danger in supervised learning with assigned labels). Table 4 shows that K = 2 (with sizes of 128 and 72) gives the best clustering. The original K was 272, which was not reduced by deletion of close clusters with the computed threshold δ (N = 30 features provides a large cube). Eliminations with p = 1, 7 and 11, followed by fuzzy clustering and then fuzzy mergings with τ = 0.5, 0.66, 0.8 and 0.88, brought K down to 5, 4, 3 and 2, respectively. We conclude from the low modified Xie-Beni validity values that the data contain extraneous noise and do not separate into compact clusters.

Table 4. Results on the Wolberg-Mangasarian Wisconsin Breast Cancer Data.

   Data File      No. Clusters K   Validity   Cluster Sizes
   wbcd200.dta    5                           ..., 53, 69, 22, ...
   wbcd200.dta    4                           ..., 117, 22, ...
   wbcd200.dta    3                           ..., 109, ...
   wbcd200.dta    2                           128, 72

The fourth data set was from geological data that is labeled for K = 2 classes. Each of the Q = 70 feature vectors has N = 4 features. The data were noisy and the labels were estimates by humans that gave 35 vectors to one class and 35 to the other. Our neural network learned the labeled classes correctly. Table 5 shows some of the results. The initial K was 98, but this was reduced to 33 by deletion of close prototypes and then reduced to 13 and 8 by elimination with p = 1 and p = 4, respectively. After 40 fuzzy clustering iterations, the fuzzy mergings with τ = 0.5, 0.66, 0.8 and 0.88 brought it down to the respective K values of 7, 5, 3 and 2. Other paths through deleting, eliminating and merging also yielded the same results.

Table 5. Results on the Geological Labeled Data.

   Data File    No. Clusters K   Validity   Cluster Sizes
   geo70.dta    ...                         ..., 26, 7, 16, ...
   geo70.dta    ...                         ..., 13, ...
   geo70.dta    ...                         ...

6. Conclusions

Of the many methods available for clustering feature vectors, the k-means algorithm is the quickest and simplest, but it depends upon the order of the feature vectors and requires K to be given. We have investigated an approach that makes the clustering independent of the ordering. Our method of deletion to thin out a dense initial set of prototypes leaves fewer but still uniformly distributed prototypes for the initial clusters. Our method of elimination is a continuing process of removing clusters that are either empty or very small and reassigning their vectors to the remaining larger clusters.

Before merging, we apply a few tens of iterations of our simplified fuzzy clustering, which moves centers into regions of more densely situated feature vectors. These centers more accurately represent the clusters. From these prototypical centers we do our fuzzy merging. This is accomplished by first finding the modified weighted fuzzy expected value (MWFEV) of the distances between the fuzzy centers and using a user-selected value τ times this MWFEV as a threshold. If the distance between two fuzzy centers is less than the threshold, then the clusters are merged into the first of the two (the pairs of clusters are ordered). Optional criteria can be used in the decision-making to merge or not by forming a temporary center at the midpoint of the line between the pair of centers being considered, taking the summed mean-squared error of the two clusters as a radius, and checking whether sufficiently many points from each cluster fall into this new overlapping ball. If so, and if the centers are sufficiently close, then the merging of the two clusters is done.

In our experiments we have had good success when deleting (thinning) using the threshold δ computed by the fuzzy clustering algorithm. When eliminating empty and small clusters, we find that p = 1 gives a good first pass because there are so many clusters that most are either empty or very small, and we do not want to eliminate too many clusters that could have prototypes in good locations. If there are still too many clusters, then we can use p = 2 or greater after observing the cluster sizes on the screen. The merging should use a value of τ = 0.5 on the first try to see what the result is on the screen. The merging can be continued until the modified Xie-Beni validity stops increasing or K = 2 is reached. The user can learn from and employ the information output to the screen concerning the intermediate results and the modified Xie-Beni validity values, so that a particular data set can be processed multiple times to achieve a very good result. This makes the combined fuzzy clustering and fuzzy merging algorithm with user interaction very useful for analyzing data.

References

[1] E. Anderson, "The iris of the Gaspe peninsula," Bulletin of the American Iris Society, vol. 59, 2-5.

[2] G. H. Ball and D. J. Hall, "ISODATA: an iterative method of multivariate analysis and pattern classification," Proc. IEEE Int. Communications Conf., 1967.

[3] J. C. Bezdek, Fuzzy Mathematics in Pattern Classification, Ph.D. thesis, Center for Applied Mathematics, Cornell University, Ithaca, NY, 1973.

[4] J. C. Bezdek, "Cluster validity with fuzzy sets," J. Cybernetics, vol. 3, no. 3, 58-73.

[5] J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, NY.

[6] S. L. Chiu, "Fuzzy model identification based on cluster estimation," J. Intelligent and Fuzzy Systems, vol. 2, no. 3, 1994.

[7] T. M. Cover, "Estimates by the nearest neighbor rule," IEEE Trans. Information Theory, vol. 14, no. 1, 50-55, 1968.

[8] F. L. Chung and T. Lee, "Fuzzy competitive learning," Neural Networks, vol. 7, no. 3, 1994.

[9] J. C. Dunn, "A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters," J. Cybernetics, vol. 3, no. 3, 32-57.

[10] E. Forgy, "Cluster analysis of multivariate data: efficiency versus interpretability of classifications," Biometrics 21, 768, 1965.

[11] K. S. Fu, "Introduction," in Applications of Pattern Recognition, edited by K. S. Fu, CRC Press, Boca Raton, 1982.

[12] J. M. Keller, M. R. Gray and J. A. Givens, Jr., "A fuzzy k-nearest neighbor algorithm," IEEE Trans. Systems, Man and Cybernetics, vol. 15, no. 4, 1985.

[13] C. T. Lin and C. S. George Lee, Neural Fuzzy Systems, Prentice-Hall, Upper Saddle River, NJ.

[14] Carl Looney, Pattern Recognition Using Neural Networks, Oxford University Press, NY.

[15] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," Proc. 5th Berkeley Symp. on Probability and Statistics, University of California Press, Berkeley, 1967.

[16] N. R. Pal and J. C. Bezdek, "On cluster validity for the c-means model," IEEE Trans. Fuzzy Systems, vol. 3, no. 3.

[17] D. C. Park and I. Dagher, "Gradient based fuzzy c-means (GBFCM) algorithm," Proc. IEEE Int. Conf. Neural Networks, 1994.

[18] J. M. Peña, J. A. Lozano and P. Larrañaga, "An empirical comparison of four initialization methods for the K-means algorithm," Pattern Recognition Letters 20, Jul. 1999.

[19] M. Schneider and M. Craig, "On the use of fuzzy sets in histogram equalization," Fuzzy Sets and Systems, vol. 45.

[20] S. Z. Selim and M. A. Ismail, "K-means type algorithms: a generalized convergence theorem and characterization of local optimality," IEEE Trans. Pattern Analysis and Machine Intelligence, 6, 81-87.

[21] M. Snarey, N. K. Terrett, P. Willet and D. J. Wilton, "Comparison of algorithms for dissimilarity-based compound selection," J. Mol. Graphics and Modelling 15.

[22] M. P. Windham, "Cluster validity for the fuzzy c-means clustering algorithm," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 4, no. 4.

[23] W. H. Wolberg and O. L. Mangasarian, "Multisurface method of pattern separation for medical diagnosis applied to breast cytology," Proc. of the Nat. Academy of Sciences, U.S.A., vol. 87, December 1990.

[24] X. L. Xie and G. Beni, "A validity measure for fuzzy clustering," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 13, no. 8.

[25] R. R. Yager and D. P. Filev, "Approximate clustering via the mountain method," IEEE Trans. Systems, Man and Cybernetics, vol. 24, 1994.

Figure 1. MWFEV vs. mean and median.
Figure 2. Original sample.

Figure 3. K-means, original order, K = 5.
Figure 4. K-means, vectors 1 and 8 exchanged, K = 5.

Figure 5. Fuzzy merging of k-means clusters.
Figure 6. New method, thinned to K = 5.

Figure 7. Fuzzy clustering/merging.

A fuzzy k-modes algorithm for clustering categorical data. Citation IEEE Transactions on Fuzzy Systems, 1999, v. 7 n. 4, p.

A fuzzy k-modes algorithm for clustering categorical data. Citation IEEE Transactions on Fuzzy Systems, 1999, v. 7 n. 4, p. Title A fuzzy k-modes algorithm for clustering categorical data Author(s) Huang, Z; Ng, MKP Citation IEEE Transactions on Fuzzy Systems, 1999, v. 7 n. 4, p. 446-452 Issued Date 1999 URL http://hdl.handle.net/10722/42992

More information

FUZZY KERNEL K-MEDOIDS ALGORITHM FOR MULTICLASS MULTIDIMENSIONAL DATA CLASSIFICATION

FUZZY KERNEL K-MEDOIDS ALGORITHM FOR MULTICLASS MULTIDIMENSIONAL DATA CLASSIFICATION FUZZY KERNEL K-MEDOIDS ALGORITHM FOR MULTICLASS MULTIDIMENSIONAL DATA CLASSIFICATION 1 ZUHERMAN RUSTAM, 2 AINI SURI TALITA 1 Senior Lecturer, Department of Mathematics, Faculty of Mathematics and Natural

More information

Bagging for One-Class Learning

Bagging for One-Class Learning Bagging for One-Class Learning David Kamm December 13, 2008 1 Introduction Consider the following outlier detection problem: suppose you are given an unlabeled data set and make the assumptions that one

More information

Fuzzy Ant Clustering by Centroid Positioning

Fuzzy Ant Clustering by Centroid Positioning Fuzzy Ant Clustering by Centroid Positioning Parag M. Kanade and Lawrence O. Hall Computer Science & Engineering Dept University of South Florida, Tampa FL 33620 @csee.usf.edu Abstract We

More information

K-Means Clustering With Initial Centroids Based On Difference Operator

K-Means Clustering With Initial Centroids Based On Difference Operator K-Means Clustering With Initial Centroids Based On Difference Operator Satish Chaurasiya 1, Dr.Ratish Agrawal 2 M.Tech Student, School of Information and Technology, R.G.P.V, Bhopal, India Assistant Professor,

More information

A New Online Clustering Approach for Data in Arbitrary Shaped Clusters

A New Online Clustering Approach for Data in Arbitrary Shaped Clusters A New Online Clustering Approach for Data in Arbitrary Shaped Clusters Richard Hyde, Plamen Angelov Data Science Group, School of Computing and Communications Lancaster University Lancaster, LA1 4WA, UK

More information

Clustering and Visualisation of Data

Clustering and Visualisation of Data Clustering and Visualisation of Data Hiroshi Shimodaira January-March 28 Cluster analysis aims to partition a data set into meaningful or useful groups, based on distances between data points. In some

More information

Lecture 10: Semantic Segmentation and Clustering

Lecture 10: Semantic Segmentation and Clustering Lecture 10: Semantic Segmentation and Clustering Vineet Kosaraju, Davy Ragland, Adrien Truong, Effie Nehoran, Maneekwan Toyungyernsub Department of Computer Science Stanford University Stanford, CA 94305

More information

Clustering & Classification (chapter 15)

Clustering & Classification (chapter 15) Clustering & Classification (chapter 5) Kai Goebel Bill Cheetham RPI/GE Global Research goebel@cs.rpi.edu cheetham@cs.rpi.edu Outline k-means Fuzzy c-means Mountain Clustering knn Fuzzy knn Hierarchical

More information

Artificial Intelligence. Programming Styles

Artificial Intelligence. Programming Styles Artificial Intelligence Intro to Machine Learning Programming Styles Standard CS: Explicitly program computer to do something Early AI: Derive a problem description (state) and use general algorithms to

More information

CHAPTER 3 A FAST K-MODES CLUSTERING ALGORITHM TO WAREHOUSE VERY LARGE HETEROGENEOUS MEDICAL DATABASES

CHAPTER 3 A FAST K-MODES CLUSTERING ALGORITHM TO WAREHOUSE VERY LARGE HETEROGENEOUS MEDICAL DATABASES 70 CHAPTER 3 A FAST K-MODES CLUSTERING ALGORITHM TO WAREHOUSE VERY LARGE HETEROGENEOUS MEDICAL DATABASES 3.1 INTRODUCTION In medical science, effective tools are essential to categorize and systematically

More information
