Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence (2008)

From Comparing Clusterings to Combining Clusterings

Zhiwu Lu and Yuxin Peng and Jianguo Xiao
Institute of Computer Science and Technology, Peking University, Beijing 100871, China
{luzhiwu,pengyuxin,xjg}@icst.pku.edu.cn

Abstract

This paper presents a fast simulated annealing framework for combining multiple clusterings (i.e. clustering ensemble) based on some measures of agreement between partitions, which were originally used to compare two clusterings (the obtained clustering vs. a ground truth clustering) for the evaluation of a clustering algorithm. Though we can follow a greedy strategy to optimize these measures as objective functions of clustering ensemble, some local optima may be obtained and simultaneously the computational cost is too large. To avoid the local optima, we then consider a simulated annealing optimization scheme that operates through single label changes. Moreover, for those measures between partitions based on the relationship (joined or separated) of pairs of objects, such as the Rand index, we can update them incrementally for each label change, which makes sure the simulated annealing optimization scheme is computationally feasible. The simulation and real-life experiments then demonstrate that the proposed framework can achieve superior results.

Introduction

Comparing clusterings plays an important role in the evaluation of clustering algorithms. A number of criteria have been proposed to measure how close the obtained clustering is to a ground truth clustering, such as mutual information (MI) (Strehl and Ghosh 2002), the Rand index (Rand 1971; Hubert and Arabie 1985), the Jaccard index (Denoeud and Guénoche 2006), and the Wallace index (Wallace 1983). One important application of these measures is the objective evaluation of image segmentation algorithms (Unnikrishnan, Pantofaru, and Hebert 2007), since image segmentation can be considered as a clustering problem.
Since the major difficulty of clustering combination is just in finding a consensus partition from the ensemble of partitions, these measures for comparing clusterings can further be used as the objective functions of clustering ensemble. Here, it is only different in that the consensus partition has to be compared to multiple partitions. Such consensus functions have been developed in (Strehl and Ghosh 2002) based on MI. Though a greedy strategy can be used to maximize normalized MI via single label changes, the computational cost is too large. Hence, we resort to those measures between partitions based on the relationship (joined or separated) of pairs of objects, such as the Rand index, Jaccard index, and Wallace index, which can be updated incrementally for each single label change. Moreover, to resolve the local convergence problem, we follow a simulated annealing optimization scheme, which is computationally feasible due to the incremental update of the objective function. We have thus proposed a fast simulated annealing framework for clustering ensemble based on measures for comparing clusterings. There are three main advantages to the proposed framework: 1) it develops a series of consensus functions for clustering ensemble, not just one; 2) it avoids the local optima problem; 3) our consensus functions have low computational complexity: O(nkr) for n objects, k clusters in the target partition, and r clusterings in the ensemble. Our framework is readily applicable to large data sets, as opposed to other consensus functions which are based on the co-association of objects in clusters from an ensemble with quadratic complexity O(n²kr). Moreover, unlike those algorithms that search for a consensus partition via re-labeling and subsequent voting, this framework can operate with arbitrary partitions with varying numbers of clusters, not constrained to a predetermined number of clusters in the ensemble partitions. The rest of this paper is organized as follows.

∗Corresponding author. Copyright © 2008, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Section 2 describes relevant research on clustering combination. In section 3, we briefly introduce some measures for comparing clusterings and especially give three of them in detail. Section 4 then presents the simulated annealing framework for clustering ensemble based on the three measures. The experimental results on several data sets are presented in section 5, followed by the conclusions in section 6.

Motivation and Related Work

Approaches to combination of clusterings differ in two main respects, namely the way in which the contributing component clusterings are obtained and the method by which they are combined. One important consensus function is proposed by (Fred and Jain 2005) to summarize various clustering results in a co-association matrix. Co-association values represent the strength of association between objects by analyzing how often each pair of objects appears in the same cluster. Then the co-association matrix serves as a similarity matrix for the data items. The final clustering is formed from the co-association matrix by linking the objects whose co-association value exceeds a certain threshold. One drawback of the co-association consensus function is its quadratic computational complexity in the number of objects, O(n²). Moreover, experiments in (Topchy, Jain, and Punch 2005) show co-association methods are usually unreliable with the number of clusterings r < 50.

Some hypergraph-based consensus functions have also been developed in (Strehl and Ghosh 2002). All the clusters in the ensemble partitions can be represented as hyperedges on a graph with n vertices. Each hyperedge describes a set of objects belonging to the same cluster. A consensus function can then be formulated as a solution to the k-way min-cut hypergraph partitioning problem. One hypergraph-based method is the meta-clustering algorithm (MCLA), which also uses hyperedge collapsing operations to determine soft cluster membership values for each object. Hypergraph methods seem to work best for nearly balanced clusters.

A different consensus function has been developed in (Topchy, Jain, and Punch 2003) based on information-theoretic principles. An elegant solution can be obtained from a generalized definition of MI, namely Quadratic MI (QMI), which can be effectively maximized by the k-means algorithm in the space of specially transformed cluster labels of the given ensemble. However, it is sensitive to initialization due to the local optimization scheme of k-means.

In (Dudoit and Fridlyand 2003; Fischer and Buhmann 2003), a combination of partitions by re-labeling and voting is implemented. Their works pursue direct re-labeling approaches to the correspondence problem. A re-labeling can be done optimally between two clusterings using the Hungarian algorithm. After an overall consistent re-labeling, voting can be applied to determine cluster membership for each object. However, this voting method needs a very large number of clusterings to obtain a reliable result.
A probabilistic model of consensus is offered by (Topchy, Jain, and Punch 2005) using a finite mixture of multinomial distributions in the space of cluster labels. A combined partition is found as a solution to the corresponding maximum likelihood problem using the EM algorithm. Since the EM consensus function needs to estimate too many parameters, accuracy degradation will inevitably occur with an increasing number of partitions when the sample size is fixed.

To summarize, existing consensus functions suffer from a number of drawbacks that include complexity, the heuristic character of the objective function, and uncertain statistical status of the consensus solution. This paper aims to overcome these drawbacks by developing a fast simulated annealing framework for combining multiple clusterings based on those measures for comparing clusterings.

Measures for Comparing Clusterings

This section first presents the basic notations for comparing two clusterings, and then introduces three measures of agreement between partitions which will be used for combining multiple clusterings in the rest of the paper.

Notations and Problem Statement

Let λ_a and λ_b be two clusterings of the sample data set X = {x_t}_{t=1}^n, with k_a and k_b groups respectively. To compare these two clusterings, we first have to give a quantitative measure of agreement between them. In the case of evaluating a clustering algorithm, it means that we have to show how close the obtained clustering is to a ground truth clustering. Since these measures will further be used as objective functions of clustering ensemble, it is important that we can update them incrementally for a single label change. Computing the new objective function in this way can lead to much less computational cost. Hence, we focus on those measures which can be specified as:

    S(λ_a, λ_b) = f({n_i^a}_{i=1}^{k_a}, {n_j^b}_{j=1}^{k_b}, {n_ij}),    (1)

where n_i^a is the number of objects in cluster C_i according to λ_a, n_j^b is the number of objects in cluster C_j according to λ_b, and n_ij denotes the number of objects that are in cluster C_i according to λ_a as well as in group C_j according to λ_b.
When an object (which is in C_j according to λ_b) moves from cluster C_i to cluster C_i′ according to λ_a, only the following updates arise for this single label change:

    n̂_i^a = n_i^a − 1,    n̂_i′^a = n_i′^a + 1,    (2)
    n̂_ij = n_ij − 1,      n̂_i′j = n_i′j + 1.      (3)

According to (1), S(λ_a, λ_b) may then be updated incrementally. Though many measures for comparing clusterings can be represented as (1), we will focus in the following on one special type of measures based on the relationship (joined or separated) of pairs of objects, such as the Rand index, Jaccard index, and Wallace index. The comparison of partitions for this type of measures is based on the pairs of objects of X. Two partitions λ_a and λ_b agree on a pair of objects x_i and x_j if these objects are simultaneously joined or separated in them. On the other hand, there is a disagreement if x_i and x_j are joined in one of them and separated in the other. Let n_A be the number of pairs simultaneously joined, n_B the number of pairs joined in λ_a and separated in λ_b, n_C the number of pairs separated in λ_a and joined in λ_b, and n_D the number of pairs simultaneously separated. According to (Hubert and Arabie 1985), we have

    n_A = Σ_{i,j} C(n_ij, 2),    n_B = Σ_i C(n_i^a, 2) − n_A,    n_C = Σ_j C(n_j^b, 2) − n_A,

where C(m, 2) = m(m−1)/2. Moreover, we can easily obtain n_D = C(n, 2) − n_A − n_B − n_C.

Rand Index

The Rand index is a popular nonparametric measure in the statistics literature and works by counting pairs of objects that have compatible label relationships in the two clusterings to be compared. More formally, the Rand index (Rand 1971) can be computed as the ratio of the number of pairs of objects having the same label relationship in λ_a and λ_b:

    R(λ_a, λ_b) = (n_A + n_D) / C(n, 2),    (4)

where n_A + n_D = C(n, 2) + 2 Σ_{i,j} C(n_ij, 2) − Σ_i C(n_i^a, 2) − Σ_j C(n_j^b, 2).
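To make the pair-count bookkeeping above concrete, here is a small illustrative sketch (our own code and naming, not the paper's): it derives n_A, n_B, n_C, n_D from the contingency counts and evaluates the plain Rand index of Eq. (4).

```python
# Illustrative sketch (not the authors' code): pair counts and the plain
# Rand index of Eq. (4), computed from the contingency counts n_ij.
from collections import Counter
from math import comb

def pair_counts(a, b):
    """a, b: equal-length label sequences. Returns (n_A, n_B, n_C, n_D)."""
    nij = Counter(zip(a, b))                                   # joint counts n_ij
    n_A = sum(comb(c, 2) for c in nij.values())                # joined in both
    n_B = sum(comb(c, 2) for c in Counter(a).values()) - n_A   # joined only in a
    n_C = sum(comb(c, 2) for c in Counter(b).values()) - n_A   # joined only in b
    n_D = comb(len(a), 2) - n_A - n_B - n_C                    # separated in both
    return n_A, n_B, n_C, n_D

def rand_index(a, b):
    n_A, _, _, n_D = pair_counts(a, b)
    return (n_A + n_D) / comb(len(a), 2)
```

Note that the index is invariant to relabeling: `rand_index([0, 0, 1, 1], [1, 1, 0, 0])` is 1.0, since both labelings describe the same partition.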
A problem with the Rand index is that the expected value of the Rand index of two random partitions does not take a constant value. The corrected Rand index proposed by (Hubert and Arabie 1985) assumes the generalized hypergeometric distribution as the model of randomness, i.e., the two partitions λ_a and λ_b are picked at random such that the numbers of objects in the clusters are fixed. Under this model, the corrected Rand index can be given as:

    CR(λ_a, λ_b) = [n_A − h_a h_b / C(n, 2)] / [(h_a + h_b)/2 − h_a h_b / C(n, 2)],    (5)

where h_a = Σ_i C(n_i^a, 2) and h_b = Σ_j C(n_j^b, 2). In the following, we actually use this version of the Rand index for combining multiple clusterings.

Jaccard Index

In the Rand index, the pairs simultaneously joined or separated are counted in the same way. However, partitions are often interpreted as classes of joined objects, the separations being the consequences of this clustering. We then use the Jaccard index (Denoeud and Guénoche 2006), noted J, which does not consider the n_D simultaneous separations:

    J(λ_a, λ_b) = n_A / n_D̄ = n_A / (h_a + h_b − n_A),    (6)

where n_D̄ = n_A + n_B + n_C = h_a + h_b − n_A.

Wallace Index

This index is very natural, and it is the number of joined pairs common to the two partitions λ_a and λ_b divided by the number of possible pairs (Wallace 1983):

    W(λ_a, λ_b) = n_A / √(h_a h_b).    (7)

The asymmetric form n_A / h_a depends on the partition of reference; if we do not want to favor either λ_a or λ_b, the geometric average of h_a and h_b is used, as in (7).

The Proposed Framework

The above measures of agreement between partitions for comparing clusterings are further used as objective functions of clustering ensemble. In this section, we first give details about the clustering ensemble problem, and then present a fast simulated annealing framework for combining multiple clusterings that operates through single label changes to optimize these measure-based objective functions.
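The three indices (5)-(7) can all be computed from the same three pair totals n_A, h_a, h_b. The following is our own illustrative sketch of this, not the paper's code; the helper names are ours.

```python
# Illustrative sketch: corrected Rand (5), Jaccard (6), and Wallace (7)
# indices from the pair totals n_A, h_a, h_b defined in the text.
from collections import Counter
from math import comb, sqrt

def _h(labels):                      # h = sum over clusters of C(size, 2)
    return sum(comb(c, 2) for c in Counter(labels).values())

def _joined_both(a, b):              # n_A = sum over the contingency table
    return sum(comb(c, 2) for c in Counter(zip(a, b)).values())

def corrected_rand(a, b):
    n_A, h_a, h_b = _joined_both(a, b), _h(a), _h(b)
    expected = h_a * h_b / comb(len(a), 2)       # chance-level agreement
    return (n_A - expected) / ((h_a + h_b) / 2 - expected)

def jaccard(a, b):
    n_A, h_a, h_b = _joined_both(a, b), _h(a), _h(b)
    return n_A / (h_a + h_b - n_A)

def wallace(a, b):
    n_A, h_a, h_b = _joined_both(a, b), _h(a), _h(b)
    return n_A / sqrt(h_a * h_b)
```

All three indices reach 1 exactly when the two labelings describe the same partition, which is the property the consensus objective below relies on.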
The Clustering Ensemble Problem

Given a set of r partitions Λ = {λ_q | q = 1, ..., r}, with the q-th partition λ_q having k_q clusters, the consensus function Γ for combining multiple clusterings can be defined just as in (Strehl and Ghosh 2002):

    Γ : Λ → λ,    N^{n×r} → N^n,    (8)

which maps a set of clusterings to an integrated clustering. If there is no prior information about the relative importance of the individual groupings, then a reasonable goal for the consensus answer is to seek a clustering that shares the most information with the original clusterings. More precisely, based on the measure of agreement (i.e. shared information) between partitions, we can now define a measure between a set of r partitions Λ and a single partition λ as the average shared information:

    S(λ, Λ) = (1/r) Σ_{q=1}^r S(λ, λ_q).    (9)

Hence, the problem of clustering ensemble is just to find a consensus partition λ* of the data set X that maximizes the objective function S(λ, Λ) over the gathered partitions Λ:

    λ* = argmax_λ (1/r) Σ_{q=1}^r S(λ, λ_q).    (10)

The desired number of clusters k in the consensus clustering λ* deserves a separate discussion that is beyond the scope of this paper. Here, we simply assume that the target number of clusters is predetermined for the consensus clustering. More details about this model selection problem can be found in (Figueiredo and Jain 2002).

To update the objective function of clustering ensemble incrementally, we have to consider those measures which take the form of (1). Though many measures for comparing clusterings can be represented as (1), we will focus in the following on one special type of measures based on the relationship (joined or separated) of pairs of objects. Actually, only three measures, i.e. the Rand index, Jaccard index, and Wallace index, are used as the objective functions of clustering ensemble. Moreover, to resolve the local convergence problem of the greedy optimization strategy, we further take into account the simulated annealing scheme.
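In code, the ensemble objective (9) is just an average of pairwise agreements. A minimal sketch (our own naming; `measure` stands for any agreement index such as the corrected Rand index):

```python
# Minimal sketch of Eq. (9): the average agreement between a candidate
# labeling `lam` and the r labelings of the ensemble. `measure` is any
# pairwise index; the equality stand-in used below is for illustration only.
def consensus_objective(lam, ensemble, measure):
    return sum(measure(lam, lam_q) for lam_q in ensemble) / len(ensemble)
```

With `measure` set to a Rand-type index, maximizing this quantity over all labelings `lam` is exactly problem (10).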
Note that our clustering ensemble algorithms developed in the following can be modified slightly when other types of measures specified as (1) are used as objective functions. Hence, we have actually presented a simulated annealing framework for combining multiple clusterings.

Clustering Ensemble via Simulated Annealing

Given a set of r partitions Λ = {λ_q | q = 1, ..., r}, the objective function of clustering ensemble can just be set as the measure between a single partition λ and Λ in (9). The measure S(λ, λ_q) between λ and λ_q can be the Rand index, Jaccard index, or Wallace index. According to (5)-(7), we can set S(λ, λ_q) as any of the following three measures:

    S(λ, λ_q) = [h_0^q − h h_q / C(n, 2)] / [(h + h_q)/2 − h h_q / C(n, 2)],    (11)

    S(λ, λ_q) = h_0^q / (h + h_q − h_0^q),    (12)

    S(λ, λ_q) = h_0^q / √(h h_q),    (13)

where h_0^q = Σ_{i,j} C(n_ij^q, 2), h = Σ_i C(n_i, 2), and h_q = Σ_j C(n_j^q, 2). Here, the frequency counts are denoted a little differently
from (1): n_i is the number of objects in cluster C_i according to λ, n_j^q is the number of objects in cluster C_j according to λ_q, and n_ij^q is the number of objects that are in cluster C_i according to λ and in cluster C_j according to λ_q. The corresponding algorithms based on these three measures, which follow the simulated annealing optimization scheme, are denoted as SA-RI, SA-JI, and SA-WI, respectively.

To find the consensus partition from the multiple clusterings Λ, we can maximize the objective function S(λ, Λ) by single label changes. That is, we randomly select an object x_t from the data set X = {x_t}_{t=1}^n, and then change its label λ(x_t) = i to another randomly selected label i′ according to λ, i.e., move it from the current cluster C_i to another cluster C_i′. Such a single label change only leads to the following updates:

    n̂_i = n_i − 1,        n̂_i′ = n_i′ + 1,        (14)
    n̂_ij^q = n_ij^q − 1,  n̂_i′j^q = n_i′j^q + 1,  (15)

where j = λ_q(x_t) (q = 1, ..., r). For each λ_q ∈ Λ, to update S(λ, λ_q), we can first calculate h and h_0^q incrementally:

    ĥ = h + n_i′ − n_i + 1,              (16)
    ĥ_0^q = h_0^q + n_i′j^q − n_ij^q + 1.  (17)

Note that h_q stays fixed for each label change. Hence, we can obtain the new Ŝ(λ, λ_q) according to (11)-(13), and the new objective function Ŝ(λ, Λ) is just the mean of {Ŝ(λ, λ_q)}_{q=1}^r. Here, it is worth pointing out that the update of the objective function has only linear time complexity O(r) for a single label change, which makes sure that the simulated annealing scheme is computationally feasible for the maximization of S(λ, Λ).

We further take into account a simplified simulated annealing scheme to determine whether to select the single label change λ(x_t): i → i′. At a temperature T, the probability of selecting the single label change λ(x_t): i → i′ can be calculated as follows:

    P(λ(x_t): i → i′) = 1 if ΔS > 0,  e^{ΔS/T} otherwise,    (18)

where ΔS = Ŝ(λ, Λ) − S(λ, Λ). We actually select the single label change if P(λ(x_t): i → i′) is higher than a threshold P_0 (0 < P_0 < 1); otherwise, we discard it and try the next single label change. The complete description of our simulated annealing framework for clustering ensemble is finally summarized in Table 1.
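The incremental update (16) and the selection rule (18) can be sketched as follows. This is our own illustrative code (the helper names and the brute-force check in the comments are ours, not the paper's).

```python
# Illustrative sketch: the incremental update (16) and the simplified
# annealing selection rule (18).
import math
from collections import Counter
from math import comb

def h_of(labels):                         # h = sum_i C(n_i, 2)
    return sum(comb(c, 2) for c in Counter(labels).values())

def h_after_move(h, n_i, n_ip):
    """New h when one object leaves a cluster of size n_i and joins a
    cluster of size n_ip: h_hat = h + n_ip - n_i + 1 (Eq. 16)."""
    return h + n_ip - n_i + 1

def select_probability(delta_S, T):
    """Eq. (18): always 1 for an improvement, exp(delta_S / T) otherwise."""
    return 1.0 if delta_S > 0 else math.exp(delta_S / T)

def selected(delta_S, T, P0=0.85):
    """The label change is kept when its probability exceeds threshold P0."""
    return select_probability(delta_S, T) > P0
```

The O(1)-per-partition update can be verified against recomputation from scratch: for labels `[0, 0, 0, 1]`, moving one object from cluster 0 (size 3) to cluster 1 (size 1) gives `h_after_move(3, 3, 1) == 2`, which matches `h_of([0, 0, 1, 1])`.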
The time complexity is O(nkr).

Table 1: Clustering Ensemble via Simulated Annealing

Input:
1. A set of r partitions Λ = {λ_q | q = 1, ..., r}
2. The desired number of clusters k
3. The threshold for selecting label change P_0
4. The cooling ratio c (0 < c < 1)

Output: The consensus clustering λ*

Process:
1. Select a candidate clustering λ by some combination method, and set the temperature T = T_0.
2. Start a loop with all objects set unvisited (v(t) = 0, t = 1, ..., n). Randomly select an unvisited object x_t from X, and try changing the label λ(x_t) to each of the other k − 1 labels. If a label change is selected according to (18), we immediately set v(t) = 1 and try a new unvisited object. If there is no label change for x_t, we also set v(t) = 1 and go to a new object. The loop stops when all objects are visited.
3. Set T = c·T, and go to step 2. If there is no label change during two successive loops, stop the algorithm and output λ* = λ.

Experimental Results

The experiments are conducted with artificial and real-life data sets, where true natural clusters are known, to validate both accuracy and robustness of consensus via our simulated annealing framework. We also explore the data sets using seven different consensus functions.

Data Sets

The details of the four data sets used in the experiments are summarized in Table 2. Two artificial data sets, 2-spirals and half-rings, are shown in Figure 1; both are difficult for any centroid based clustering algorithm. We also use two real-life data sets, iris and wine, from the UCI benchmark repository. Since the last feature of the wine data is far larger than the others, we first regularize it into an interval of [0, 10]. The other three data sets are kept unchanged.

Table 2: Details of the four data sets. The average clustering error is obtained by the k-means algorithm.

Data set     #features   k   n     Avg. error (%)
2-spirals    2           2   190   4.5
half-rings   2           2   500   6.4
iris         4           3   150   0.7
wine         13          3   178   8.4

The average clustering errors by the k-means algorithm over 20 independent runs on the four data sets are listed in Table 2, and are considered as baselines for the consensus functions.
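To make the procedure of Table 1 concrete, here is a toy rendering in Python. This is our own simplification, not the authors' Matlab code: for readability it recomputes the Jaccard-based objective from scratch instead of using the incremental updates (16)-(17), it initializes randomly rather than by k-modes, its initial temperature is our own choice, and it stops after one sweep with no accepted change (skipping zero-gain ties so termination is guaranteed) rather than two sweeps.

```python
# Toy sketch of the Table-1 procedure (a simplification, not the paper's code).
import math
import random
from collections import Counter
from math import comb

def _h(labels):
    return sum(comb(c, 2) for c in Counter(labels).values())

def _jaccard(a, b):                              # Eq. (12) agreement index
    n_A = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    return n_A / (_h(a) + _h(b) - n_A)

def sa_consensus(ensemble, k, P0=0.85, c=0.99, seed=0):
    rng = random.Random(seed)
    n = len(ensemble[0])
    lam = [rng.randrange(k) for _ in range(n)]   # stand-in for the k-modes init
    def objective(L):                            # Eq. (9), recomputed naively
        return sum(_jaccard(L, q) for q in ensemble) / len(ensemble)
    S = objective(lam)
    T = max(0.2 * S, 1e-6)                       # initial temperature (our choice)
    changed = True
    while changed:
        changed = False
        for t in rng.sample(range(n), n):        # visit objects in random order
            for label in range(k):
                if label == lam[t]:
                    continue
                old, lam[t] = lam[t], label
                dS = objective(lam) - S
                # rule (18): keep improvements, or degradations whose
                # probability exp(dS/T) exceeds P0; exact ties are skipped
                # here so the sketch is guaranteed to terminate
                if dS > 0 or (dS < 0 and math.exp(dS / T) > P0):
                    S, changed = S + dS, True
                    break                        # accepted: next object
                lam[t] = old                     # rejected: undo the change
        T *= c                                   # cool down
    return lam
```

A real implementation would replace `objective` with the O(r) incremental update, which is exactly what makes the annealing loop affordable.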
As for the regularization of the wine data, the average error by the k-means algorithm is decreased from 36.3% to 8.4% over 20 independent runs. Here, we evaluate the performance of a clustering algorithm by matching the detected and the known partitions of the data sets, just as in (Topchy, Jain, and Punch 2005). The best possible matching of clusters provides a measure of performance expressed as the misassignment rate.

Figure 1: Two artificial data sets difficult for any centroid based clustering algorithm: (a) 2-spirals; (b) half-rings.

Table 3: Average error rate (%) on the 2-spirals data set. The k-means algorithm randomly selects k ∈ [4, 7] to generate r partitions for the different combination methods.

r    SA-RI  SA-JI  SA-WI  k-modes  EM    QMI   MCLA
10   37.5   39.6   38.7   45.      45.   46.8  39.3
20   35.9   37.8   37.3   43.8     44.4  47.8  37.6
30   36.0   37.0   39.3   4.       43.6  47.3  40.
40   37.6   39.7   37.6   40.8     4.    46.9  38.4
50   36.    39.    36.    4.8      43.9  44.4  36.4

To determine the clustering error, one needs to solve the correspondence problem between the labels of the known and derived clusters. The optimal correspondence can be obtained using the Hungarian method for the minimal weight bipartite matching problem, with O(k³) complexity for k clusters.

Selection of Parameters and Algorithms

To implement our simulated annealing framework for clustering ensemble, we have to select two important parameters, i.e., the threshold P_0 for selecting a label change and the cooling ratio c (0 < c < 1). When the cooling ratio c takes a larger value, we may obtain a better solution but the algorithm may converge more slowly. Meanwhile, when the threshold P_0 is larger, the algorithm may converge faster but the local optima are avoided with a lower probability. To achieve a tradeoff between clustering accuracy and speed, we simply set P_0 = 0.85 and c = 0.99 in all the experiments. Moreover, the temperature is initialized by T_0 = 0.2 S_0, where S_0 is the initial value of the objective function.

Our three simulated annealing methods (i.e. SA-RI, SA-JI, and SA-WI) for clustering combination are also compared to four other consensus functions:
1. the k-modes algorithm for consensus clustering in this paper, which was originally developed for categorical clustering (Huang 1998).
2. the EM algorithm for consensus clustering via the mixture model (Topchy, Jain, and Punch 2005).
3.
the QMI approach described in (Topchy, Jain, and Punch 2003), which is actually implemented by the k-means algorithm in the space of specially transformed cluster labels of the given ensemble.
4. MCLA, which is a hypergraph method introduced in (Strehl and Ghosh 2002). (The code is available at http://www.strehl.com.)

Note that our methods are initialized by k-modes just because this algorithm runs very fast; other consensus functions can be used as initializations similarly. Since the co-association methods have O(n²) complexity and may lead to severe computational limitations, our methods are not compared to these algorithms. The performance of the co-association methods has already been analyzed in (Topchy, Jain, and Punch 2003).

Table 4: Average error rate (%) on the half-rings data set. The k-means algorithm randomly selects k ∈ [3, 5] to generate r partitions for the different combination methods.

r    SA-RI  SA-JI  SA-WI  k-modes  EM   QMI  MCLA
10   0.4    .4     0.3    6.9      6.4  5.7  4.6
20   8.5    .5     3.5    7.7      4.4  5.3  9.9
30   8.     0.4    9.0    5.       6.9  4.6  4.9
40   7.6    7.7    9.     8.5      7.5  5.9  3.5
50   8.3    9.4    0.0    9.3      8.5  6.6  .7

The k-means algorithm is used as the method of generating the partitions for the combination. Diversity of the partitions is ensured by: (1) initializing the algorithm randomly; (2) selecting the number of clusters k randomly. In the experiments, we actually give k a random value around the number of true natural clusters k*. We have found that this method of generating partitions leads to better results than random initialization alone. Moreover, we vary the number of combined clusterings r in the range [10, 50].

Comparison with Other Consensus Functions

Only the main results for each of the four data sets are presented in Tables 3-6 due to space limitations. Actually, we have also initialized our simulated annealing methods by consensus functions other than k-modes, and similar results can be obtained. The tables report the average error rate (%) of clustering combination over 20 independent runs. The first observation is that our simulated annealing methods (especially SA-RI) perform generally better than the other consensus functions.
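The partition-generation and error-matching protocol above can be sketched as follows. This is our own toy rendering, not the paper's Matlab code: a minimal Lloyd's k-means stands in for the library implementation, and the best relabeling is found by brute force over permutations (the Hungarian method gives the same optimum in O(k³) for larger k).

```python
# Toy sketch of the experimental protocol: generate r diverse k-means
# partitions, then score a labeling by the misassignment rate under the
# best one-to-one relabeling.
import random
from itertools import permutations

def kmeans_labels(X, k, rng, iters=20):
    """Minimal Lloyd's k-means on a list of coordinate tuples."""
    centers = rng.sample(X, k)                   # random initialization
    for _ in range(iters):
        labels = [min(range(k),
                      key=lambda j: sum((a - b) ** 2
                                        for a, b in zip(x, centers[j])))
                  for x in X]
        for j in range(k):
            members = [x for x, l in zip(X, labels) if l == j]
            if members:                          # keep old center if cluster empties
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels

def generate_ensemble(X, k_true, r, seed=0):
    """r diverse partitions: random init plus a random k near k_true."""
    rng = random.Random(seed)
    return [kmeans_labels(X, rng.randint(k_true, 2 * k_true), rng)
            for _ in range(r)]

def error_rate(pred, truth, k):
    """Misassignment rate under the best one-to-one relabeling of pred."""
    best = 0
    for perm in permutations(range(k)):
        hits = sum(1 for p, t in zip(pred, truth) if perm[p] == t)
        best = max(best, hits)
    return 1 - best / len(pred)
```

For example, `error_rate([1, 1, 0, 0], [0, 0, 1, 1], 2)` is 0.0, because swapping the two labels matches the ground truth exactly.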
Since our methods lead to slightly higher clustering errors in only a few cases as compared with MCLA, they can be considered preferable by overall evaluation. Among our three methods, SA-RI generally performs the best. All co-association methods are usually unreliable with r < 50, and this is exactly the regime where our methods are positioned. The k-modes, EM, and QMI consensus functions all have the local convergence problem. Since our methods are initialized by k-modes, we can see that local optima are successfully avoided due to the simulated annealing optimization scheme. Figure 2 further shows the ascent of the corrected Rand index on the two real-life data sets (only SA-RI with r = 30 considered) during optimization.

Moreover, it is also interesting to note that, as expected, the average error of consensus clustering by our simulated annealing methods is lower than the average error of the k-means clusterings in the ensemble (Table 2) when k is chosen to be equal to the true number of clusters k*. Finally, the average time taken by our three methods (Matlab code) is less than 30 seconds per run on a 2 GHz PC in all cases. As reported in (Strehl and Ghosh 2002), experiments with n = 400, k = 10, r = 8 average one hour using the greedy algorithm based on normalized MI (similar to our methods). However, our methods only take about 10 seconds in this case, i.e., our methods are computationally feasible in spite of the costly annealing procedure.

Table 5: Average error rate (%) on the iris data set. The k-means algorithm randomly selects k ∈ [3, 5] to generate r partitions for the different combination methods.

r    SA-RI  SA-JI  SA-WI  k-modes  EM   QMI  MCLA
10   0.7    0.7    0.6    3.4      .3   4.3  0.4
20   0.6    0.9    0.8    .9       7.5  4.8  0.6
30   0.7    0.7    0.9    3.       8.   .3   0.5
40   0.7    .8     0.7    .6       6.6  3.9  0.7
50   0.7    0.7    0.7    9.9      6.9  .6   0.7

Table 6: Average error rate (%) on the wine data set. The k-means algorithm randomly selects k ∈ [4, 6] to generate r partitions for the different combination methods.

r    SA-RI  SA-JI  SA-WI  k-modes  EM   QMI  MCLA
10   6.5    6.7    6.5    .3       7.   8.8  7.6
20   6.5    6.5    6.3    .4       7.9  0.4  8.5
30   6.4    6.3    6.3    .4       3.   7.5  7.4
40   6.3    6.3    6.     0.       7.   7.4  7.5
50   6.3    6.     6.     8.       .    7.3  7.8

Figure 2: The ascent of the corrected Rand index on two real-life data sets (only SA-RI considered): (a) iris; (b) wine.

Conclusions

We have proposed a fast simulated annealing framework for combining multiple clusterings based on some measures for comparing clusterings.
When the objective functions of clustering ensemble are specified as those measures based on the relationship of pairs of objects in the data set, we can update them incrementally for each single label change, which makes sure that the proposed simulated annealing optimization scheme is computationally feasible. The simulation and real-life experiments then demonstrate that the proposed framework can achieve superior results. Since clustering ensemble is actually equivalent to categorical clustering, our methods will further be evaluated in this application in future work.

Acknowledgements

This work was fully supported by the National Natural Science Foundation of China under Grant No. 6050306, the Beijing Natural Science Foundation of China under Grant No. 40805, and the Program for New Century Excellent Talents in University under Grant No. NCET-06-0009.

References

Denoeud, L., and Guénoche, A. 2006. Comparison of distance indices between partitions. In Proceedings of IFCS 2006: Data Science and Classification, 21-28.

Dudoit, S., and Fridlyand, J. 2003. Bagging to improve the accuracy of a clustering procedure. Bioinformatics 19(9):1090-1099.

Figueiredo, M. A. T., and Jain, A. K. 2002. Unsupervised learning of finite mixture models. IEEE Trans. on Pattern Analysis and Machine Intelligence 24(3):381-396.

Fischer, B., and Buhmann, J. M. 2003. Path-based clustering for grouping of smooth curves and texture segmentation. IEEE Trans. on Pattern Analysis and Machine Intelligence 25(4):513-518.

Fred, A. L. N., and Jain, A. K. 2005. Combining multiple clusterings using evidence accumulation. IEEE Trans. on Pattern Analysis and Machine Intelligence 27(6):835-850.

Huang, Z. 1998. Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Mining and Knowledge Discovery 2:283-304.

Hubert, L., and Arabie, P. 1985. Comparing partitions. Journal of Classification 2:193-218.

Rand, W. M. 1971. Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association 66:846-850.

Strehl, A., and Ghosh, J. 2002.
Cluster ensembles - a knowledge reuse framework for combining partitionings. In Proceedings of the Conference on Artificial Intelligence (AAAI), 93-99.

Topchy, A.; Jain, A. K.; and Punch, W. 2003. Combining multiple weak clusterings. In Proceedings of the IEEE International Conference on Data Mining, 331-338.

Topchy, A.; Jain, A. K.; and Punch, W. 2005. Clustering ensembles: models of consensus and weak partitions. IEEE Trans. on Pattern Analysis and Machine Intelligence 27(12):1866-1881.

Unnikrishnan, R.; Pantofaru, C.; and Hebert, M. 2007. Toward objective evaluation of image segmentation algorithms. IEEE Trans. on Pattern Analysis and Machine Intelligence 29(6):929-944.

Wallace, D. L. 1983. Comment on a method for comparing two hierarchical clusterings. Journal of the American Statistical Association 78:569-576.