Hybridization of Expectation-Maximization and K-Means Algorithms for Better Clustering Performance


BULGARIAN ACADEMY OF SCIENCES
CYBERNETICS AND INFORMATION TECHNOLOGIES, Volume 16, No 2, Sofia, 2016
Print ISSN: ; Online ISSN: ; DOI: /cait

D. Raja Kishor 1, N. B. Venkateswarlu 2
1 Dept. of CSE, JNTU, Hyderabad, Telangana, India
2 Dept. of CSE, AITAM, Tekkali, Andhra Pradesh, India
Emails: rajakishor@gmail.com, venkat_ritch@yahoo.com

Abstract: The present work proposes the hybridization of the Expectation-Maximization (EM) and K-means techniques as an attempt to speed up the clustering process. Even though the K-means and EM techniques look into different areas, K-means can be viewed as an approximate way to obtain maximum likelihood estimates for the means. Along with the proposed algorithm for hybridization, the present work also experiments with the standard EM algorithm. Six different datasets, three of which are synthetic, are used for the experiments. Clustering Fitness and the Sum of Squared Errors (SSE) are computed for measuring the clustering performance. In all the experiments it is observed that the proposed algorithm for the hybridization of the EM and K-means techniques consistently takes less execution time, with acceptable Clustering Fitness values and lower SSE, than the standard EM algorithm. It is also observed that the proposed algorithm produces better clustering results than the Cluster package of Purdue University.

Keywords: Hybridization, clustering, K-means, mixture models, expectation maximization, clustering fitness, sum of squared errors.

1. Introduction

The Expectation-Maximization (EM) algorithm is a model-based clustering technique, which attempts to optimize the fit between the given data and some mathematical model. Such methods are often based on the assumption that the data is generated by a mixture of underlying probability distributions [1]. EM is an effective, popular technique for estimating mixture model parameters, like cluster weights and means [7-9]. When compared to other clustering algorithms, the EM algorithm demands more computational effort, although it produces exceptionally good results [20-22].

Many researchers have experimented with variants, like the Generalized EM (GEM), Expectation Conditional Maximization (ECM), Sparse EM (SpEM), Lazy EM (LEM), Expectation-Conditional Maximization Either (ECME) and Space Alternating Generalized Expectation (SAGE) maximization algorithms, in order to reduce the execution time of the EM algorithm [17, 18]. In [19], the use of Winograd's algorithm is proposed to reduce the computational effort of the E-step and M-step of the standard EM algorithm. In [15], the use of multi-criteria models is proposed to design clusters with the aim of improved clustering performance. All these experiments aim at speeding up the EM algorithm while yielding the same results as the standard EM algorithm, or better ones, without sacrificing its simplicity and stability.

As an attempt to speed up the clustering process, the present work proposes the hybridization of the EM and K-means algorithms. The K-means algorithm is a very popular algorithm for data clustering, which aims at a local minimum of the distortion [2, 23]. EM is a model-based approach, which aims at finding clusters such that the maximum likelihood of each cluster's parameters is obtained. In EM, each observation belongs to each cluster with a certain probability [2]. K-means is the 2nd and EM the 5th most dominantly used data mining algorithm [3, 4, 24]. Though the K-means and EM techniques look into different areas [2, 23], K-means can be viewed as an approximate way to obtain maximum likelihood estimates for the means, which is the goal of density estimation in EM [23, 24].

In the present work, along with the proposed algorithm for the hybridization of the EM and K-means techniques, experiments are carried out with the standard EM algorithm. In all the experiments, it is observed that the proposed algorithm for the hybridization of the EM and K-means techniques consistently takes less execution time to produce the clustering results, with acceptable clustering fitness values and less SSE, in comparison to the standard EM algorithm. The proposed algorithm is also observed to produce clustering results with better performance than the Cluster package of Purdue University [26].

2. The Standard EM (StEM) algorithm

The EM algorithm partitions the given data according to the maximum a posteriori principle, using the conditional probabilities [17]. Given a guess for the parameter values, the EM algorithm calculates the probability that each point belongs to each distribution and then uses these probabilities to compute a new estimate for the parameters. The EM algorithm iteratively refines initial mixture model parameter estimates to better fit the data and terminates at a locally optimal solution. The standard EM [10, 11] for Gaussian Mixture Models (GMM) assumes that the algorithm will estimate k class distributions C_i, i = 1, ..., k. For each of the input vectors X_j, j = 1, ..., N, the algorithm calculates the probability P(C_i | X_j). The highest probability points to the vector's class.

The EM algorithm works iteratively by applying two steps: the Expectation step (E-step) and the Maximization step (M-step). Formally, \hat{\Theta}(t) = \{\mu_i(t), \Sigma_i(t), W_i(t)\}, i = 1, ..., k, stands for the successive parameter estimates. Given a dataset of N d-dimensional vectors, the EM algorithm has to cluster them into k groups. The multi-dimensional Gaussian distribution for the cluster C_i, parameterized by the d-dimensional mean column vector μ_i and the d×d covariance matrix Σ_i, is given as follows [10]:

(1) P(X_j | C_i) = \frac{1}{\sqrt{(2\pi)^d |\Sigma_i|}} \exp\left(-\frac{1}{2}(X_j - \mu_i)^T \Sigma_i^{-1}(X_j - \mu_i)\right),

where X_j is a sample column vector, the superscript T indicates the transpose of a column vector, |Σ_i| is the determinant of Σ_i, and Σ_i^{-1} is the matrix inverse of the covariance matrix. The mixture model probability density function [10] is

(2) P(X_j) = \sum_{l=1}^{k} W_l P(X_j | C_l),

where W_l is the weight of cluster C_l.

2.1. Termination condition

As the termination condition, the percentage change is computed using the following formula:

(3) Percentage change = \frac{|\delta_t - \delta_{t+1}|}{\delta_t} \times 100,

where δ_t is the number of vectors assigned to new clusters in the t-th iteration, and δ_{t+1} is the number of vectors assigned to new clusters in the (t+1)-th iteration. The symbol × indicates multiplication. The algorithm terminates when the percentage change is < 3.

The EM algorithm for the Gaussian Mixture Model [10] proceeds as follows:

Step 1. Initialize the mixture model parameters: set the current iteration t = 0; set the initial weights W_i to 1/k for all k clusters; select k vectors randomly from the dataset as the initial cluster means μ_i; compute the global covariance matrix of the dataset and set it to be the initial covariance matrix Σ_i for all clusters.

Step 2 (E-step). Estimate the probability of each class C_i, i = 1, 2, ..., k, given a certain vector X_j, j = 1, 2, ..., N, for the current iteration t using the following formula, and assign X_j to the cluster with the maximum probability:

(4) P(C_i | X_j) = \frac{W_i(t) P(X_j | C_i)}{P(X_j)} = \frac{W_i(t) |\Sigma_i(t)|^{-1/2} \exp(\alpha_i)}{\sum_{l=1}^{k} W_l(t) |\Sigma_l(t)|^{-1/2} \exp(\alpha_l)},

where

\alpha_i = -\frac{1}{2}(X_j - \mu_i(t))^T \Sigma_i^{-1}(t)(X_j - \mu_i(t)),
\alpha_l = -\frac{1}{2}(X_j - \mu_l(t))^T \Sigma_l^{-1}(t)(X_j - \mu_l(t)).

Each of the k clusters has its mean μ_i(t) and covariance Σ_i(t), i = 1, 2, ..., k; W_i is the weight of the i-th cluster.
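For illustration, the following is a minimal NumPy sketch of how the density of equation (1) and the E-step responsibilities of equation (4) could be computed. It is not the authors' implementation; the function names (gaussian_pdf, e_step) and the array layout are assumptions made only for this sketch.

import numpy as np

def gaussian_pdf(X, mu, cov):
    """Multivariate Gaussian density of eq. (1), evaluated for every row of X."""
    d = X.shape[1]
    diff = X - mu                                        # (N, d)
    inv = np.linalg.inv(cov)                             # Sigma^{-1}
    quad = np.einsum('nd,de,ne->n', diff, inv, diff)     # (X_j - mu)^T Sigma^{-1} (X_j - mu)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm                    # (N,)

def e_step(X, weights, means, covs):
    """E-step of eq. (4): posterior P(C_i | X_j) for every point j and cluster i."""
    k = len(weights)
    dens = np.column_stack([gaussian_pdf(X, means[i], covs[i]) for i in range(k)])
    num = dens * weights                                 # W_i(t) * P(X_j | C_i)
    return num / num.sum(axis=1, keepdims=True)          # divide by P(X_j) of eq. (2)

# Hard assignment, as in Step 2:
# labels = e_step(X, weights, means, covs).argmax(axis=1)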

Step 3 (M-step). Here, for the i-th cluster, update the parameter estimates for the iteration t+1 as follows:

(5) \mu_i(t+1) = \frac{\sum_{j=1}^{N} P(C_i | X_j) X_j}{\sum_{j=1}^{N} P(C_i | X_j)},

(6) \Sigma_i(t+1) = \frac{\sum_{j=1}^{N} P(C_i | X_j)(X_j - \mu_i(t))(X_j - \mu_i(t))^T}{\sum_{j=1}^{N} P(C_i | X_j)},

(7) W_i(t+1) = \frac{1}{N} \sum_{j=1}^{N} P(C_i | X_j).

Step 4. Compute the percentage change using (3).

Step 5. Stop the process if the percentage change is < 3. Otherwise, set t = t+1 and repeat Steps 2 up to 4 with the updated parameters.

3. Hybridization of the EM and K-means algorithms (HbEMKM)

Though an effectively used algorithm, EM suffers from slow convergence, as it relies heavily on the computational effort involved in the repeated computation of many parameters, like the covariance matrices, means and weights of the clusters, and the repeated computation of the inverses of the cluster covariance matrices [3, 5, 24, 25]. On the other hand, the K-means algorithm can be used to simplify the computation and accelerate convergence, as it requires only one parameter to compute, i.e., the cluster means [23, 24]. While assigning points to the clusters, EM maximizes the likelihood and K-means minimizes the distortion with respect to the clusters [23]. The algorithm for conventional K-means is given below [12].

Algorithm K-means
Step 1. Select k vectors randomly from the dataset as the initial cluster means μ_i. Set the current iteration t = 0.
Step 2. Repeat.
Step 3. Assign each vector X_j from the dataset to its closest cluster mean using the Euclidean distance,

(8) dist(X_j, \mu_i) = \sqrt{\sum_{l=1}^{d} (x_{jl} - \mu_{il})^2},

where X_j is the j-th vector in the dataset, μ_i is the mean of the cluster C_i, and d is the number of dimensions of a data point.
Step 4. Re-compute the cluster means and set t = t+1.
Step 5. Compute the percentage change using (3).
Step 6. Until the percentage change is < 3.
Step 7. End of K-means.
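To make the pseudocode above concrete, the following NumPy sketch shows one possible rendering of the K-means loop with the percentage-change stopping rule of equation (3). It is an illustrative sketch, not the code used in the paper; the names kmeans and tol, the empty-cluster handling and the small guard against division by zero are assumptions.

import numpy as np

def kmeans(X, k, tol=3.0, rng=np.random.default_rng(0)):
    """Conventional K-means of Section 3 with the percentage-change rule of eq. (3)."""
    N = X.shape[0]
    means = X[rng.choice(N, size=k, replace=False)]      # Step 1: random initial means
    labels = np.full(N, -1)
    delta_prev = None
    while True:
        dist = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)   # eq. (8)
        new_labels = dist.argmin(axis=1)                 # Step 3: closest cluster mean
        delta = int(np.sum(new_labels != labels))        # vectors assigned to new clusters
        labels = new_labels
        means = np.array([X[labels == i].mean(axis=0) if np.any(labels == i) else means[i]
                          for i in range(k)])            # Step 4: re-compute the means
        if delta_prev is not None:
            change = abs(delta_prev - delta) / max(delta_prev, 1) * 100    # eq. (3)
            if change < tol:                             # Step 6: stop when change < 3
                break
        delta_prev = delta
    return labels, means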

The present work, as an attempt to speed up the clustering process, experiments with the Hybridization of the EM and K-Means algorithms (HbEMKM). Though the EM and K-means techniques look into different areas [2, 23], K-means can be viewed as an approximate way to obtain maximum likelihood estimates for the means, which is the goal of density estimation in EM [23, 24]. Furthermore, K-means is formally equivalent to EM, as K-means is a limiting case of fitting the data by a mixture of k Gaussians with identical, isotropic covariance matrices (Σ = σ²I), when the soft assignments of data points to mixture components are hardened to allocate each data point solely to the most likely component [3, 23]. A random space is isotropic if its covariance function depends on distance alone [25]. In practice, there is often some conflation of the two algorithms, in that K-means is sometimes used in density estimation applications due to its more rapid convergence [23]. Also, the selection of initial values is critical for EM, since it most likely converges to local maxima around the initial values, as EM uses maximum likelihood [2]. It may therefore be a good practice to use the results of K-means as initial parameter values for a subsequent execution of EM for the more exact computations [23, 24]. The present work also experiments with running the EM algorithm on the results of the K-means algorithm (KMEM).

Along with the proposed algorithm for the hybridization of the EM and K-means techniques, experiments are carried out with the standard EM algorithm, and finally a performance comparison is made among the results of all the experiments. In all the experiments the same termination condition, discussed in Section 2.1, is used.
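The KMEM idea mentioned above, i.e., running K-means first and then handing its results to EM as initial parameter values, can be sketched as follows. The helper name init_from_kmeans, the ridge term and the fallback for very small clusters are assumptions of this sketch, not part of the paper.

import numpy as np

def init_from_kmeans(X, labels, k):
    """Turn a hard K-means partition into initial GMM parameters (W_i, mu_i, Sigma_i)."""
    N, d = X.shape
    weights = np.array([np.sum(labels == i) / N for i in range(k)])
    means, covs = [], []
    for i in range(k):
        pts = X[labels == i]
        means.append(pts.mean(axis=0) if len(pts) else X.mean(axis=0))
        # Fall back to the global covariance for clusters that are too small,
        # and add a small ridge so Sigma_i stays invertible in eq. (1).
        cov = np.cov(pts, rowvar=False) if len(pts) > d else np.cov(X, rowvar=False)
        covs.append(cov + 1e-6 * np.eye(d))
    return weights, np.array(means), np.array(covs)

# KMEM usage sketch:
# labels, _ = kmeans(X, k)
# weights, means, covs = init_from_kmeans(X, labels, k)
# ...then run the standard EM of Section 2 from these parameters instead of a random start.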

The pseudocode for the algorithm is given below. This algorithm performs clustering using the EM and K-means techniques in alternative iterations till termination. As part of the maximization step for EM, the cluster weights, means and covariance matrices are calculated using the results of the K-means step.

Algorithm HbEMKM
N = number of samples in the data
n_i = number of samples in the i-th cluster
X_j = j-th sample in the data
k = number of clusters
W_i = weight of the i-th cluster
μ_i = mean of the i-th cluster
Σ_i = covariance matrix of the i-th cluster

Select k vectors randomly from the input dataset as the initial cluster means μ_i. First, assign each data vector X_j to the closest cluster with mean μ_i, using the Euclidean distance in formula (8). Set isProgress = true.
Repeat while (isProgress == true)
  M-Step. Compute the means μ_i and covariance matrices Σ_i for i = 1, ..., k, based on the results of the K-means step. Compute the cluster weights W_i = n_i/N for i = 1, ..., k.
  E-Step. For each given data vector X_j (j = 1, 2, ..., N), compute the cluster probability P(C_i | X_j) for i = 1, ..., k, using (4). Assign X_j to the cluster with Max{P(C_i | X_j); i = 1, ..., k}.
  Compute the percentage change using (3).
  IF (percentage change >= 3)
    Compute the cluster means μ_i for i = 1, ..., k, using (5).
    K-Means Step. Assign each data vector X_j to the closest cluster with mean μ_i, using the Euclidean distance in formula (8).
    Compute the percentage change using (3).
    IF (percentage change >= 3)
      Set isProgress = true
    ELSE
      Set isProgress = false
    End of inner IF
  ELSE
    Set isProgress = false
  End of outer IF
End of Repeat Loop
End of HbEMKM
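A compact NumPy rendering of this alternating scheme might look as follows. It reuses the e_step, kmeans-style assignment and init_from_kmeans helpers from the earlier sketches and collapses the two percentage-change tests into a small pct_change helper; it is a sketch of the idea under those assumptions, not the published implementation.

import numpy as np

def pct_change(delta_prev, delta):
    """Percentage change of eq. (3) between two successive (re)assignment steps."""
    return abs(delta_prev - delta) / max(delta_prev, 1) * 100

def hbemkm(X, k, tol=3.0, rng=np.random.default_rng(0)):
    """Alternate EM (soft) and K-means (hard) steps, in the spirit of Algorithm HbEMKM.
    Assumes e_step() and init_from_kmeans() from the earlier sketches are in scope."""
    N = X.shape[0]
    means = X[rng.choice(N, size=k, replace=False)]
    labels = np.linalg.norm(X[:, None] - means[None], axis=2).argmin(axis=1)  # initial hard step
    delta_prev = N
    while True:
        # M-step from the K-means partition: weights, means and covariance matrices.
        weights, means, covs = init_from_kmeans(X, labels, k)
        # E-step: probabilities of eq. (4), hardened to the most likely cluster.
        resp = e_step(X, weights, means, covs)
        new_labels = resp.argmax(axis=1)
        delta = int(np.sum(new_labels != labels))
        labels = new_labels
        if pct_change(delta_prev, delta) < tol:
            break
        delta_prev = delta
        # K-means step: soft means of eq. (5), then hard re-assignment by eq. (8).
        means = resp.T @ X / resp.sum(axis=0)[:, None]
        new_labels = np.linalg.norm(X[:, None] - means[None], axis=2).argmin(axis=1)
        delta = int(np.sum(new_labels != labels))
        labels = new_labels
        if pct_change(delta_prev, delta) < tol:
            break
        delta_prev = delta
    return labels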

4. Clustering performance measure

As a measure of clustering performance, the Clustering Fitness [13] is computed. The calculation of the Clustering Fitness involves the intra-cluster similarity, the inter-cluster similarity, and the experiential knowledge λ. The main objective of any clustering algorithm is to generate clusters with higher intra-cluster similarity and lower inter-cluster similarity [16], so both measures are taken into consideration when computing the Clustering Fitness. The computation of the Clustering Fitness results in higher values when the inter-cluster similarity is low and in lower values when the inter-cluster similarity is high. To make the computation of the Clustering Fitness unbiased, the value of λ is taken as 0.5 [13].

4.1. Intracluster similarity for the cluster C_i

It can be quantified via some function of the reciprocals of the intracluster radii within each of the resulting clusters. The intracluster similarity of a cluster C_i, 1 ≤ i ≤ k, denoted by S_intra(C_i), is defined by

(9) S_{intra}(C_i) = \frac{1 + n_i}{1 + \sum_{l=1}^{n_i} dist(I_l, Centroid_i)}.

Here, n_i is the number of items in cluster C_i, 1 ≤ l ≤ n_i, I_l is the l-th item in cluster C_i, and dist(I_l, Centroid_i) calculates the distance between I_l and the centroid of C_i, which is the intracluster radius of C_i. To smooth the value of S_intra(C_i) and allow for possible singleton clusters, 1 is added to the denominator and the numerator.

4.2. Intracluster similarity for one clustering result C

Denoted by S_intra(C), the intracluster similarity for one clustering result C is defined by

(10) S_{intra}(C) = \frac{\sum_{i=1}^{k} S_{intra}(C_i)}{k}.

Here, k is the number of resulting clusters in C.

4.3. Intercluster similarity

It can be quantified via some function of the reciprocals of the intercluster radii of the clustering centroids. The intercluster similarity for one of the possible clustering results C, denoted by S_inter(C), is defined by

(11) S_{inter}(C) = \frac{1 + k}{1 + \sum_{i=1}^{k} dist(Centroid_i, Centroid^2)}.

Here, k is the number of resulting clusters in C, 1 ≤ i ≤ k, Centroid_i is the centroid of the i-th cluster in C, and Centroid^2 is the centroid of all the centroids of the clusters in C. We compute the intercluster radius of Centroid_i by calculating dist(Centroid_i, Centroid^2), which is the distance between Centroid_i and Centroid^2. To smooth the value of S_inter(C) and allow for a possible all-inclusive clustering result, 1 is added to the denominator and the numerator.

4.4. Clustering fitness

The Clustering Fitness (CF) for one of the possible clustering results C is defined by

(12) CF = \lambda \cdot S_{intra}(C) + \frac{1 - \lambda}{S_{inter}(C)}.

Here, 0 < λ < 1 is an experiential weight. The present work considers λ = 0.5.

4.5. Sum of squared errors

In the present work, the Sum of Squared Errors (SSE) is also computed for all the clustering results to measure the clustering performance [6]. The clustering performance is considered to be good if the corresponding SSE is lower when compared to the other clustering techniques. The SSE is computed using the following formula:

(13) SSE = \frac{1}{N} \sum_{i=1}^{k} \sum_{X_j \in C_i} \|X_j - \mu_i\|^2.

Here, X_j is a vector from the dataset, μ_i is the mean of the cluster C_i, k is the number of clusters, and N is the number of vectors in the dataset. ||X_j - μ_i|| denotes the distance between X_j and μ_i. The objective of clustering is to minimize the within-cluster sum of squared errors: the lower the SSE, the better the goodness of fit.
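The two measures can be computed directly from a hard clustering; the sketch below follows equations (9)-(13) as reconstructed above. The function names, the λ default and the handling of the smoothing constants are illustrative assumptions of this sketch.

import numpy as np

def clustering_fitness(X, labels, lam=0.5):
    """Clustering Fitness of eq. (12) from intra- (eqs 9, 10) and inter-cluster (eq. 11) similarity."""
    ks = np.unique(labels)
    centroids = np.array([X[labels == i].mean(axis=0) for i in ks])
    # Intracluster similarity per cluster (eq. 9), averaged over clusters (eq. 10).
    s_intra = np.mean([(1 + np.sum(labels == i)) /
                       (1 + np.linalg.norm(X[labels == i] - c, axis=1).sum())
                       for i, c in zip(ks, centroids)])
    # Intercluster similarity (eq. 11): centroid distances to the centroid of centroids.
    c2 = centroids.mean(axis=0)
    s_inter = (1 + len(ks)) / (1 + np.linalg.norm(centroids - c2, axis=1).sum())
    return lam * s_intra + (1 - lam) / s_inter            # eq. (12)

def sse(X, labels):
    """Sum of Squared Errors of eq. (13), averaged over the N vectors."""
    total = 0.0
    for i in np.unique(labels):
        mu = X[labels == i].mean(axis=0)
        total += np.sum(np.linalg.norm(X[labels == i] - mu, axis=1) ** 2)
    return total / len(X)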

5. Experiment and results

Experiments are carried out on a system with an Intel(R) Core K processor with 3.50 GHz processor speed, 8 GB RAM with 1666 FSB, Windows 7 OS, using JDK 1.7.0_45. Separate modules are written for each algorithm to observe the CPU time for clustering any dataset, keeping the same cluster seeds for all methods. I/O operations are eliminated, and the time observed is strictly for the clustering of the data (Table 1).

The Magic Gamma, Poker Hand and Letter Recognition datasets from the UCI ML dataset repository [14] are used for the present work. An important issue in evaluating data analysis algorithms is the availability of representative data. When real-life data is hard to obtain, or when its properties are hard to modify for testing various algorithms, synthetic data becomes an appealing alternative. The present work therefore also uses three synthetic datasets that are generated by an algorithm for generating multivariate normal random variables [27]. The first synthetic dataset is generated assuming all clusters have different means and different covariance matrices. The second synthetic dataset is generated assuming some clusters have the same mean but different covariance matrices. The third synthetic dataset is generated assuming some clusters have the same covariance matrix but different means.

Table 1. Datasets
Data set                  Number of points    Number of dimensions
Letter Recognition data   20,000
Magic Gamma data          19,020
Poker Hand data           1,025,010
Synthetic data-1          50,000
Synthetic data-2          50,000
Synthetic data-3          50,000

All the algorithms are studied by executing them on each dataset with a varying number of clusters (i.e., k = 10, 11, 12, 13, 14, 15). The details of the execution time, clustering fitness and SSE of each algorithm are given separately in the tables below for each dataset.
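The synthetic datasets described above are mixtures of multivariate normal clusters. The sketch below shows a simple way to generate data of that kind with NumPy; it is a stand-in in the spirit of the FORTRAN generator of [27], not a reproduction of it, and the cluster count, dimensionality and sizes used in the example are invented for illustration.

import numpy as np

def make_gaussian_mixture(n_per_cluster, means, covs, seed=0):
    """Draw a synthetic dataset from k multivariate normal clusters."""
    rng = np.random.default_rng(seed)
    parts = [rng.multivariate_normal(m, c, size=n)
             for n, m, c in zip(n_per_cluster, means, covs)]
    X = np.vstack(parts)
    y = np.repeat(np.arange(len(means)), n_per_cluster)   # true cluster labels, for reference
    return X, y

# Example in the spirit of Synthetic data-1 (different means, different covariances).
rng = np.random.default_rng(1)
k, d = 10, 5                                              # illustrative values only
means = rng.uniform(-10, 10, size=(k, d))
covs = [np.diag(rng.uniform(0.5, 3.0, size=d)) for _ in range(k)]
X, y = make_gaussian_mixture([5000] * k, means, covs)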

5.1. Observations on the Letter Recognition dataset

Tables 2, 3 and 4 give the execution time, the clustering fitness and the Sum of Squared Errors (SSE), respectively, of the algorithms discussed in Sections 2 and 3 and of the Cluster package of Purdue University on the Letter Recognition dataset. The observations are also shown in Figs 1, 2 and 3.

Table 2. Execution time of each clustering method (s)
Fig. 1. Letter Recognition dataset: Execution times
Table 3. Clustering fitness of each clustering method
Fig. 2. Letter Recognition dataset: Clustering fitness
Table 4. SSE of each clustering method

Fig. 3. Letter Recognition dataset: Sum of squared errors

5.2. Observations on the Magic Gamma dataset

Tables 5, 6 and 7 give the execution time, the clustering fitness and the Sum of Squared Errors (SSE), respectively, of the algorithms discussed in Sections 2 and 3 and of the Cluster package of Purdue University on the Magic Gamma dataset. The observations are also shown in Figs 4, 5 and 6.

Table 5. Execution time of each clustering method (s)
Fig. 4. Magic Gamma dataset: Execution times
Table 6. Clustering fitness of each clustering method

Fig. 5. Magic Gamma dataset: Clustering fitness
Table 7. SSE of each clustering method
Fig. 6. Magic Gamma dataset: Sum of squared errors

5.3. Observations on the Poker Hand dataset

Tables 8, 9 and 10 give the execution time, the clustering fitness and the Sum of Squared Errors (SSE), respectively, of the algorithms discussed in Sections 2 and 3 and of the Cluster package of Purdue University on the Poker Hand dataset. The observations are also shown in Figs 7, 8 and 9.

Table 8. Execution time of each clustering method (s)

Fig. 7. Poker Hand dataset: Execution times
Table 9. Clustering fitness of each clustering method
Fig. 8. Poker Hand dataset: Clustering fitness
Table 10. SSE of each clustering method

Fig. 9. Poker Hand dataset: Sum of squared errors

5.4. Observations on Synthetic dataset-1

Tables 11, 12 and 13 give the execution time, the clustering fitness and the SSE, respectively, of the algorithms discussed in Sections 2 and 3 and of the Cluster package of Purdue University on Synthetic dataset-1. The observations are also shown in Figs 10, 11 and 12.

Table 11. Execution time of each clustering method (s)
Fig. 10. Synthetic dataset-1: Execution times
Table 12. Clustering fitness of each clustering method

Fig. 11. Synthetic dataset-1: Clustering fitness
Table 13. SSE of each clustering method
Fig. 12. Synthetic dataset-1: Sum of squared errors

5.5. Observations on Synthetic dataset-2

Tables 14, 15 and 16 give the execution time, the clustering fitness and the Sum of Squared Errors (SSE), respectively, of the algorithms discussed in Sections 2 and 3 and of the Cluster package of Purdue University on Synthetic dataset-2. The observations are also shown in Figs 13, 14 and 15.

Table 14. Execution time of each clustering method (s)

Fig. 13. Synthetic dataset-2: Execution times
Table 15. Clustering fitness of each clustering method
Fig. 14. Synthetic dataset-2: Clustering fitness
Table 16. SSE of each clustering method

Fig. 15. Synthetic dataset-2: Sum of squared errors

5.6. Observations on Synthetic dataset-3

Tables 17, 18 and 19 give the execution time, the clustering fitness and the SSE, respectively, of the algorithms discussed in Sections 2 and 3 and of the Cluster package of Purdue University on Synthetic dataset-3. The observations are also shown in Figs 16, 17 and 18.

Table 17. Execution time of each clustering method (s)
Fig. 16. Synthetic dataset-3: Execution times
Table 18. Clustering fitness of each clustering method

Fig. 17. Synthetic dataset-3: Clustering fitness
Table 19. SSE of each clustering method
Fig. 18. Synthetic dataset-3: Sum of squared errors

6. Conclusion

The proposed algorithm for the hybridization of EM and K-means consistently takes less computational time on all the tested datasets, and it also takes less computational time than the Cluster package from Purdue University. The proposed algorithm also produces results with higher clustering fitness values than the other algorithms, including the Cluster package, and it is observed to produce clustering results with lower SSE values than the other algorithms, including the Cluster package. Therefore, the present work proposes the hybridization of the EM and K-means algorithms as a faster clustering technique with improved performance.

References

1. Fraley, C., A. E. Raftery. Model-Based Clustering, Discriminant Analysis, and Density Estimation. Journal of the American Statistical Association, Vol. 97, 2002, No 458.
2. Adebisi, A. A., O. E. Olusayo, O. S. Olatunde. An Exploratory Study of K-Means and Expectation Maximization Algorithms. British Journal of Mathematics & Computer Science, Vol. 2, 2012, No 2.
3. Wu, X., V. Kumar, J. R. Quinlan, J. Ghosh, Q. Yang, H. Motoda, G. J. McLachlan, A. Ng, B. Liu, P. S. Yu, Z.-H. Zhou, M. Steinbach, D. J. Hand, D. Steinberg. Survey Paper: Top 10 Algorithms in Data Mining. Knowledge and Information Systems, Vol. 14, 2008.
4. MacQueen, J. Some Methods for Classification and Analysis of Multivariate Observations. In: Proc. of 5th Berkeley Symposium on Mathematics, Statistics and Probability, Vol. 1, 1967.
5. McLachlan, G. J., T. Krishnan. The EM Algorithm and Extensions, 2/e. John Wiley & Sons, Inc.
6. Han, J., M. Kamber. Data Mining: Concepts and Techniques, 2/e. New Delhi, India, Elsevier, Inc.
7. Tan, P.-N., M. Steinbach, V. Kumar. Introduction to Data Mining, 1/e. Pearson Education.
8. Yeung, K. Y., C. Fraley, A. Murua, A. E. Raftery, W. L. Ruzzo. Model-Based Clustering and Data Transformations for Gene Expression Data. Bioinformatics, Vol. 17, 2010, No 10.
9. Bradley, P. S., U. M. Fayyad, C. A. Reina. Scaling EM (Expectation-Maximization) Clustering to Large Databases. Technical Report MSR-TR-98-35, Microsoft Research.
10. Körting, T. S., L. V. Dutra, L. M. G. Fonseca, G. J. Erthal. Assessment of a Modified Version of the EM Algorithm for Remote Sensing Data Classification. In: Proc. of Iberoamerican Congress on Pattern Recognition (CIARP), São Paulo, Brazil, LNCS 6419, 2010.
11. Körting, T. S., L. V. Dutra, L. M. G. Fonseca, G. Erthal, F. C. da Silva. Improvements to the Expectation-Maximization Approach for Unsupervised Classification of Remote Sensing Data. GeoINFO, Campos do Jordão, SP, Brazil.
12. Aggarwal, N., K. Aggarwal. A Mid-Point Based K-Mean Clustering Algorithm for Data Mining. International Journal on Computer Science and Engineering, Vol. 4, 2012, No 6.
13. Han, X., T. Zhao. Auto-K Dynamic Clustering Algorithm. Journal of Animal and Veterinary Advances, Vol. 4, 2005, No 5.
14. UCI Machine Learning Repository.
15. Radeva, I. Multi-Criteria Models for Clusters Design. Cybernetics and Information Technologies, Vol. 13, 2013, No 1.
16. Rao, V. S. H., M. V. Jonnalagedda. Insurance Dynamics: A Data Mining Approach for Customer Retention in the Health Care Insurance Industry. Cybernetics and Information Technologies, Vol. 12, 2012, No 1.
17. Jollois, F.-X., M. Nadif. Speed-up for the Expectation-Maximization Algorithm for Clustering Categorical Data. Journal of Global Optimization, Vol. 37, 2007.
18. Meng, X.-L., D. Van Dyk. The EM Algorithm: An Old Folk-Song Sung to a Fast New Tune. Journal of the Royal Statistical Society, Vol. 59, 1997, No 3.
19. Nagendra, K. D. J., J. V. R. Murthy, N. B. Venkateswarlu. Fast Expectation Maximization Clustering Algorithm. International Journal of Computational Intelligence Research, Vol. 8, 2012, No 2.
20. Jollois, F.-X., M. Nadif. Speed-up for the Expectation-Maximization Algorithm for Clustering Categorical Data. Journal of Global Optimization, Vol. 37, 2007, No 4.

21. Neal, R., G. E. Hinton. A View of the EM Algorithm That Justifies Incremental, Sparse, and Other Variants. In: Learning in Graphical Models. MA, USA, Kluwer Academic Publishers.
22. Xu, R., D. Wunsch II. Survey of Clustering Algorithms. IEEE Transactions on Neural Networks, Vol. 16, 2005, No 3.
23. Kearns, M., Y. Mansour, A. Ng. An Information-Theoretic Analysis of Hard and Soft Assignment Methods for Clustering. In: Proc. of 13th Annual Conference on Uncertainty in Artificial Intelligence (UAI-97), San Francisco, CA, Morgan Kaufmann, 1997.
24. Duda, R. O., P. E. Hart, D. G. Stork. Pattern Classification, 2/e. New Delhi, Wiley-India Edition.
25. Porcu, E., J.-M. Montero, M. Schlather. Advances and Challenges in Space-Time Modelling of Natural Events. Lecture Notes in Statistics. Berlin, Heidelberg, Springer-Verlag.
26. Purdue University Cluster Software. software/cluster/
27. Amitava, G., H. S. W. K. Pinnaduwa. A FORTRAN Program for Generation of Multivariate Normally Distributed Random Variables. Computers & Geosciences, Vol. 13, 1987, No 3.
