Cluster Ensemble and Its Applications in Gene Expression Analysis

Xiaohua Hu, Illhoi Yoo
College of Information Science and Technology
Drexel University, Philadelphia, PA 19104
thu@cs.drexel.edu

Abstract

A huge amount of gene expression data has been generated as a result of the human genomic project. Clustering has been used extensively in mining these gene expression data to find important genetic and biological information. Obtaining high-quality clustering results is very challenging because of the inconsistency among the results of different clustering algorithms and the noise in gene expression data. Many clustering algorithms are available, and different algorithms may generate different clustering results due to their biases and assumptions. It is a challenging and daunting task for genomic researchers to choose the best clustering algorithm and generate the best clustering results for their data sets. In this paper, we present a cluster ensemble framework for gene expression analysis that generates high-quality and robust clustering results. In our framework, the clustering result of each individual clustering algorithm is converted into a distance matrix; these distance matrices are combined, and a weighted graph is constructed according to the combined matrix. A graph partitioning approach is then used to cluster the graph and generate the final clusters. The experimental results indicate that the cluster ensemble approach yields better clustering results than the single best clustering algorithm on both synthetic data sets and the yeast gene expression data set.

Keywords: cluster ensemble, gene expression analysis, graph partition

1 Introduction

Clustering groups analogous elements in a data set according to their similarity. A good clustering therefore places similar elements in the same cluster and dissimilar elements in different clusters. Unlike classification, clustering does not require class label information about the data set because it is inherently a data-driven approach; that is why clustering plays a very important role as the initial step in exploratory data analysis.
Copyright 2004, Australian Computer Society, Inc. This paper appeared at the 2nd Asia-Pacific Bioinformatics Conference (APBC2004), Dunedin, New Zealand. Conferences in Research and Practice in Information Technology, Vol. 29. Yi-Ping Phoebe Chen, Ed. Reproduction for academic, not-for profit purposes permitted provided this text is included.

In gene expression analysis, normally not much prior knowledge has been accumulated, so genomic researchers tend to apply clustering algorithms to gene expression data sets to gain a better understanding and insightful genetic and biological information. Clustering is one of the most widely and frequently used data mining technologies in gene expression analysis (Azuaje 2002, Bellaachia, Portnoy, Chen and Elkahloun 2002, Ben-Dor and Yakhini 1999, Berrar, Dubitzky and Granzow 2002, Bloch and Arce 2002, Xing and Karp 2001, Zeng, Tang, Garcia-Frias and Gao 2002, Zhao and Karypis 2003). Applying clustering algorithms to gene expression data can answer some challenging biological and genetic questions, such as identifying the functionality of genes, finding out which genes are co-regulated, and distinguishing the important genes between abnormal and normal tissues (Zhao and Karypis 2003).

Multiple clustering techniques can be used to analyse gene expression data. Their advantages and limitations may depend on factors such as the data distribution, pre-processing procedures and the number of genes. Choosing the best algorithm for a particular problem can be a challenging task. Moreover, it is not uncommon to observe inconsistent results when different clustering methods are tested on a particular data set. The results of K-means, Self-Organizing Map (SOM), hierarchical approaches, Fuzzy C-means and others can be very different in some cases (Jain, Murty and Flynn 1999). This is because each clustering method has its own bias and criterion function. For example, the popular K-means algorithm performs poorly in situations where the data cannot be accurately characterized by a mixture of K Gaussians with identical covariance matrices.
It is well known that no single clustering algorithm performs best across all data sets, and it is very challenging to choose the best clustering algorithm for gene expression analysis. The design, performance evaluation and application of clustering algorithms on gene expression data must take into account the data characteristics and the randomness arising from both biological and experimental variability. Instead of focusing on developing a single clustering algorithm that only works for a narrow range of data sets, in this paper we take a different approach. We present a unified cluster ensemble framework to combine the clustering results from various clustering algorithms. In our approach, the cluster ensemble problem is converted to a graph partitioning problem. A distance matrix is first constructed based on the cluster results from each individual clustering algorithm; these distance matrices are combined to form a master distance matrix. Then a weighted graph is constructed from the master distance matrix, and a graph-based partitioning algorithm is applied to the graph to obtain the final clustering results. The cluster ensemble builds a robust clustering portfolio that can perform reasonably well over a wide range of data sets with little hand-tuning.
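To make the combination step concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a simple co-membership matrix (1 where two points share a cluster) as a stand-in for the per-algorithm matrix, whereas the paper later defines a probability-based similarity. The function names, the three example clusterings and the threshold value are hypothetical, and the final graph partitioning step is omitted.

```python
import numpy as np

def comembership(labels):
    """Stand-in per-algorithm matrix: 1.0 where two points share a cluster."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def combine(label_sets):
    """Add the per-algorithm matrices into one master matrix."""
    return sum(comembership(labels) for labels in label_sets)

def weighted_graph(master, threshold):
    """Keep edge (i, j) only if its combined weight exceeds the threshold.

    The combined value is also used as the edge weight.
    """
    n = master.shape[0]
    return {(i, j): float(master[i, j])
            for i in range(n) for j in range(i + 1, n)
            if master[i, j] > threshold}

# Three hypothetical clusterings of the same five points
runs = [[0, 0, 1, 1, 1], [0, 0, 0, 1, 1], [1, 1, 0, 0, 0]]
master = combine(runs)
edges = weighted_graph(master, threshold=1.0)
```

Pairs that most clusterings agree on keep heavy edges, while pairs grouped together by only one clustering fall below the threshold and are dropped before the graph is partitioned.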

Our experimental results on both synthetic data sets and gene expression data sets indicate that the clustering quality of the ensemble approach significantly outperforms the best individual clustering algorithm. The rest of the paper is organized as follows. In Section 2, we give a brief overview of the various clustering approaches and summarize the related work. We present our cluster ensemble algorithm in detail, together with experimental tests, in Section 3. We conclude with our future plans and a discussion in Section 4.

2 Related Work

Classification ensemble approaches such as bagging and boosting have proved very popular and effective in supervised learning for improving learning accuracy (Dietterich 2000, Hu 2001). Following the same philosophy, the goal of a cluster ensemble is to combine the clustering results of multiple clustering algorithms to obtain better-quality and more robust clustering results. Generating high-quality clustering results is very challenging in gene expression analysis because of the noise in the experimental data and the inconsistency among the different clustering algorithms. Even though many clustering algorithms have been developed (Han and Kamber 2001, Hartigan 1975, Jain, Murty and Flynn 1999), not much work has been done on cluster ensembles in the data mining and machine learning literature compared with classification ensemble methods. Zeng et al. (Zeng, Tang, Garcia-Frias and Gao 2002) proposed an adaptive meta-clustering approach for combining different clustering results. In their research, they converted the individual cluster results into distance matrices, combined the distance matrices, and applied hierarchical clustering to recluster the combined distance matrix. Strehl et al. (Strehl and Ghosh 2002) proposed a hypergraph-partitioning approach to combine different clustering results. Each cluster from an individual clustering algorithm is treated as a hyperedge. This crisp hypergraph loses much useful information, and it is not suitable for ambiguous and noisy environments. It is very hard to find the optimal way of combining clusters.
This is natural, because each object has various characteristics and a group of objects can be partitioned in several ways depending on which characteristics are considered. For example, consider people in a university. We can group them in many ways: by gender, by nationality, or by position (faculty, staff, student). We can also group faculty and students at a finer grain: the faculty group contains full professors, associate professors, assistant professors and so on, and the student group contains graduate and undergraduate students. It is hard to say which clustering result is the best. We observe that, even though various clustering algorithms present different types of knowledge through their clustering criteria, most clustering criteria are complementary rather than competitive in gene expression analysis.

We believe that an effective combination of several clustering algorithms is an important way to improve clustering quality, but a cluster ensemble is different from a classification ensemble. Some of the major issues of cluster ensembles addressed in the proposed research are how to combine different clustering results and how to ensure a symmetrical and unbiased consensus with regard to all the component partitions. The main difficulties are: (1) the quality of a clustering combination algorithm cannot be evaluated as precisely as that of a combined classifier, and (2) various clustering algorithms always produce results with large differences due to their different clustering criteria. Directly combining the clustering results with integration rules such as product, sum and majority vote cannot generate a meaningful result. A new mechanism for combining the different cluster results is needed to obtain better clustering results. We propose a graph-based meta-clustering approach to extract the information from the results of different clustering techniques, so that a better interpretation of the data distribution can be obtained. A distance matrix is constructed to represent the statistical information of each cluster produced by the various clustering techniques.
Our method incorporates multiple cluster-based distance matrices into a weighted graph. A graph-based clustering algorithm is then used to cluster the graph to obtain the final clustering results.

3 Cluster Ensemble Based on Similarity-Graph

The motivation for developing cluster ensembles is to improve the quality and robustness of results. There are two reasons for this: (1) the results of clustering are easily corrupted by the addition of noise, which is very common in gene expression analysis because the experimental measurements may not be very accurate or errors may be introduced by the data transformation; (2) the clustering results of different clustering methods can vary significantly on the same data set, which indicates that there is great potential for improving clustering quality with an ensemble. The purpose of a cluster ensemble is to build a robust clustering portfolio that can perform as well as, if not better than, the single best clustering algorithm across a wide range of data sets.

Different clustering algorithms take different approaches. For example, K-means groups the data set so that the total mean square error to the center of each cluster is minimal, while graph-based partitioning clustering partitions the graph into K parts based on minimum edge-weight cuts. Thus a cluster ensemble can be used to generate many cluster results using various clustering algorithms and then integrate them using a consensus function to yield stable results.

In this section we discuss our novel cluster ensemble approach for combining the clustering results from various clustering algorithms. We present a two-phase clustering combination strategy. In the first step, various clustering algorithms are run against the same data sets to generate clustering results. In the second step, these clustering results are combined by an auto-associative additive system based on the distance matrix of graph clustering. The diagram below summarizes our approach.

Figure 1: Cluster Ensemble Architecture

In our approach, a distance matrix is first constructed based on the cluster results from each individual clustering algorithm; these distance matrices are combined to form a master distance matrix. Then a weighted graph is constructed from the master distance matrix, and a graph-based partitioning algorithm is applied to the graph to obtain the final clustering results. Graph-based clustering uses various kinds of geometric structures or graphs for analyzing data. Different graphs reflect different local structures or inherent visual characteristics of the data set. Clustering divides the graph into connected components by identifying and deleting inconsistent edges, and each subgraph consisting of connected components corresponds to a cluster.

In the subsections below, we explain our cluster ensemble step by step. We first discuss the cluster validation indices, which help answer the tough question of how many clusters there are in the data sets, and briefly describe the clustering methods integrated in our framework. We then explain the cluster ensemble mechanism and the evaluation of clustering results.

3.1 Clustering Methods

Many clustering algorithms have been developed in computer science and other disciplines such as data mining, machine learning, pattern recognition and statistics, to name a few. Clustering algorithms can be roughly classified into hierarchical methods and non-hierarchical methods. Non-hierarchical methods can be further divided into four categories: partitioning methods, density-based methods, grid-based methods and model-based methods (Han and Kamber 2001, Jain, Murty and Flynn 1999). Hierarchical methods proceed successively by either merging smaller clusters into larger clusters or splitting larger clusters. These methods yield a dendrogram, or tree of clusters, representing how the clusters are related. Partitioning methods generate k initial clusters and improve the clusters by iteratively reassigning elements among the k clusters. The number of clusters k and the number of iterations are user inputs.
K-means and K-medoids (Partitioning Around Medoids (PAM) and Clustering LARge Applications (CLARA)) (Jain, Murty and Flynn 1999) belong to this category. The Self-Organizing Map (SOM), a model-based method, was developed for better speech recognition by Teuvo Kohonen in the early 1980s (Kohonen 2000). Fuzzy C-means, one of the fuzzy clustering methods, was developed by Bezdek (Bezdek 1981, Bezdek and Pal 1992) by generalizing Dunn's idea (Dunn 1974). In our experiments we integrated three clustering algorithms (K-means, Self-Organizing Map (SOM) and Fuzzy C-means) as our initial implementation; more complementary clustering algorithms can be added without any changes to the architecture of the ensemble framework.

3.2 Clustering Ensemble Algorithm

A simple strategy for designing a clustering ensemble algorithm is based on multi-objective programming, which seeks a solution that satisfies multiple clustering criteria. Multi-objective programming can be transformed into single-objective programming by a weighting method, which is employed in our algorithm. Our algorithm for building the cluster ensemble is as follows.

Algorithm 1: Cluster Ensemble Based on Similarity-Graph (CESG)

Input: (1) the data set $X = \{x_1, x_2, x_3, \ldots, x_n\}$, (2) the upper bound $k$ on the number of clusters, (3) the edge threshold value $\delta$, (4) a set of clustering algorithms $C^{(q)}$
Output: the final clustering result $C^{(opt)}$
Method:
Step 1: Run each individual clustering algorithm $C^{(q)}$ multiple times on the same data set under different cluster numbers (the number of clusters varies from 2 to $k$).
Step 2: Pick the optimal number of clusters for each data set using three cluster validation indices (Silhouette index, Dunn index and C index). If the numbers are not consistent, use a voting strategy and choose the number with the majority as the number of clusters.
Step 3: Construct a distance matrix $DM^{(q)}$ from the clustering results of each clustering algorithm. ($DM^{(q)}(x_i, x_j)$ represents the similarity of two data points $x_i$ and $x_j$.)
Step 4: Combine the distance matrices by adding them into one master distance matrix (MDM).
Step 5: Construct a weighted graph based on the master distance matrix. (There is an edge between data points $x_i$ and $x_j$ if the value $MDM(x_i, x_j)$ is greater than the threshold value $\delta$; $MDM(x_i, x_j)$ is also the weight of the edge linking $x_i$ and $x_j$.)
Step 6: Cluster the graph into the optimal number of clusters, based on the cluster number chosen at Step 2.
End

In Step 3, there are many ways to construct the distance matrix from the cluster results of an individual clustering algorithm. We propose a solution based on statistical theory. Here we assume that our data set follows a Gaussian distribution, as in (Zhao and Karypis 2003). We define a cluster-based distance matrix $DM^{(q)}$ for the clustering result $C^{(q)}$. $DM^{(q)}$ is a pair-wise distance matrix defined between two data points according to the clustering result. This distance is able to efficiently extract the statistical information from the obtained cluster structure. The matrix size is $n \times n$. Since its size is independent of the clustering approach, it provides a way to align the different clusterings onto the same space, even in situations where the numbers of clusters differ between clustering algorithms.

Assume that the input data set is $X = \{x_1, x_2, x_3, \ldots, x_n\}$ and that the clustering algorithm generates $m$ clusters for the data set $X$. The clustering result is $S = \{s_1, s_2, s_3, \ldots, s_m\}$, where $s_i$ is the $i$-th cluster, consisting of some data points in $X$. For example, $X = \{x_1, x_2, x_3, \ldots, x_9\}$ and $S = \{s_1, s_2, s_3\}$ with $s_1 = \{x_1, x_5\}$, $s_2 = \{x_2, x_3, x_7\}$, $s_3 = \{x_4, x_6, x_8, x_9\}$. We assume that the probability density function of $s_i$ is given by $p(x \mid s_i)$. The posterior probability of cluster $s_i$ given $x$ can then be expressed as

$$P(s_i \mid x) = \frac{p(x \mid s_i)\,P(s_i)}{\sum_{k=1}^{m} p(x \mid s_k)\,P(s_k)}, \qquad p(x \mid s_k) = \frac{1}{(2\pi)^{d/2}\,|\Sigma_k|^{1/2}} \exp\left[-\frac{1}{2}(x - \mu_k)^T \Sigma_k^{-1} (x - \mu_k)\right],$$

where $m$ is the number of clusters, $d$ is the number of attributes, $\Sigma_k$ is the matrix of covariances among the attributes in cluster $s_k$, and $\mu_k$ is the mean vector of the data points in cluster $s_k$.
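The posterior above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions rather than the authors' code: the prior $P(s_k)$ is estimated as the fraction of points in cluster $s_k$, a small `ridge` term is added to each covariance matrix so that it stays invertible, and every cluster is assumed to hold at least two points. The function names and the toy data are hypothetical.

```python
import numpy as np

def gaussian_density(x, mean, cov):
    """Multivariate Gaussian density p(x | s_k) for one cluster."""
    d = len(mean)
    diff = x - mean
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm

def posteriors(X, labels, ridge=1e-6):
    """Return an n x m matrix whose row i is (P(s_1|x_i), ..., P(s_m|x_i))."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    n, d = X.shape
    P = np.zeros((n, len(clusters)))
    for j, c in enumerate(clusters):
        members = X[labels == c]
        prior = len(members) / n  # assumption: prior = cluster fraction
        mean = members.mean(axis=0)
        # assumption: ridge term keeps the covariance matrix invertible
        cov = np.atleast_2d(np.cov(members, rowvar=False)) + ridge * np.eye(d)
        for i in range(n):
            P[i, j] = gaussian_density(X[i], mean, cov) * prior
    # normalise each row by the sum over clusters (the denominator above)
    return P / P.sum(axis=1, keepdims=True)

# Toy data: two well-separated groups of three points each
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]])
labels = np.array([0, 0, 0, 1, 1, 1])
P = posteriors(X, labels)
```

Each row of `P` sums to one, so it can be read as the probability vector of one data point over the clusters of a single clustering result.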
For example, to calculate $P(s_2 \mid x_i)$ given the following elements in the cluster $s_2 = \{x_2, x_3, x_7\}$ (assume each $x$ has three conditions),

        Att_1   Att_2   Att_3
x_2     x_21    x_22    x_23
x_3     x_31    x_32    x_33
x_7     x_71    x_72    x_73

the covariance matrix $\Sigma_2$ is

$$\Sigma_2 = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \sigma_{13} \\ \sigma_{21} & \sigma_{22} & \sigma_{23} \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{pmatrix},$$

where $\sigma_{12}$ is the covariance between Att_1 and Att_2 and $\sigma_{11}$ is the variance of Att_1 in $s_2$.

For each data point $x_i$, we calculate the corresponding probability vector $PX_i = \{P(s_1 \mid x_i), P(s_2 \mid x_i), \ldots, P(s_m \mid x_i)\}$, where $\sum_{j=1}^{m} P(s_j \mid x_i) = 1$. The probability vectors form a probability space of dimension $m$, with each dimension corresponding to one cluster. The probability space contains information from both the input data and the cluster results. We therefore believe that the similarity of any two points $PX_l$ and $PX_m$ in the probability space is a good measure of the distance between the corresponding points $x_l$ and $x_m$ in the original space. For any two points $x_l$ and $x_m$ in the data set, their distance is defined as the distance between $PX_l$ and $PX_m$, namely $DM^{(q)}(x_l, x_m)$. Many different distance measures, such as Euclidean distance, Mahalanobis distance or correlation distance, can be used to calculate $DL(PX_l, PX_m)$. We define the similarity of two points $x_i$ and $x_j$ in the data set as

$$DM^{(q)}(x_i, x_j) = \frac{\sum PX_i\,PX_j - \dfrac{\sum PX_i \sum PX_j}{N}}{\sqrt{\left(\sum PX_i^2 - \dfrac{\left(\sum PX_i\right)^2}{N}\right)\left(\sum PX_j^2 - \dfrac{\left(\sum PX_j\right)^2}{N}\right)}},$$

where the sums run over the $N$ components of the probability vectors.

In Step 6, a graph-based clustering algorithm is applied to the weighted graph to obtain the final clustering result. Many graph-based partitioning algorithms can be used for this purpose, such as METIS and HMETIS (Karypis and Kumar 1998a). Clustering divides the graph into connected components by deleting edges based on some constraint such as minimum cuts, and each subgraph consisting of connected components corresponds to a cluster. In our experiments, we chose the graph-partitioning algorithm METIS (Karypis and Kumar 1998a, Karypis and Kumar 1998b) because of its good performance and scalability.

3.3 Clustering Result Evaluation

Evaluating the quality of a clustering is a non-trivial and often ill-posed task. Generally speaking, there are internal criteria and external criteria. Internal criteria formulate quality as a function of the given data and/or similarities.
For example, the mean squared error criterion (for K-means) and other measures of compactness are popular evaluation criteria. Measures can also be based on isolation, such as the min-cut criterion, which uses the sum of edge weights across clusters (for graph partitioning). When using internal criteria, clustering becomes an optimization problem, and a clusterer can evaluate its own performance and tune its results accordingly. External criteria, on the other hand, impose quality through additional, external information not given to the clusterer, such as category labels. This is sometimes more appropriate since groupings are ultimately evaluated externally by humans. For example, when objects have already been categorized by an external source, i.e., when class labels are available, we can use information-theoretic measures to quantify the match between the categorization and the clustering. External criteria fit very well with the architecture of our cluster ensemble. We use the Minkowski Score (Ben-Hur and Guyon 2003) as our cluster quality indicator. Below is our formula for the clustering quality evaluation.

A clustering solution for a set of $n$ elements can be represented by an $n \times n$ matrix $C$, where $C_{ij} = 1$ iff $x_i$ and $x_j$ are in the same cluster according to the solution and $C_{ij} = 0$ otherwise. The Minkowski Score (MS) between the clustering result $C^{(h)}$ from a particular clustering algorithm $CA_h$ and a reference clustering $T$ (or alternatively, the true clusters if the cluster information in the data set is known in advance) is defined as

$$MS(T, C^{(h)}) = \frac{\lVert T - C^{(h)} \rVert}{\lVert T \rVert}, \qquad \lVert T \rVert = \sqrt{\sum_{i} \sum_{j} T_{ij}^2}.$$

The Minkowski score is the normalized distance between the two matrices. Hence a perfect solution obtains a score of zero, and the smaller the score, the better the solution.

We denote the set of cluster groupings from $r$ different clustering algorithms as $\Psi = \{C^{(q)} \mid q \in \{1, \ldots, r\}\}$. The average MS score of the combined clustering result $C$ with respect to $\Psi$ is defined as

$$MS^{(ANMI)}(C, \Psi) = \frac{1}{r} \sum_{q=1}^{r} MS(C, C^{(q)}).$$

3.4 Experimental Results

We conducted an experimental study on data sets from the UCI machine learning repository and on a yeast gene data set.

3.4.1 UCI Data Sets

The Iris data set, the Pen Digits data set and the Vowel data set were used in our experiments. The following table shows the clustering results of the three clustering algorithms and the cluster ensemble.

[Table: Iris data set (3 clusters), with columns K-means, SOM, C-means and Ensemble. The values did not survive extraction.]

The following table gives the element (or point) composition of the Iris data set in the true clusters and in the best clusters obtained with the cluster ensemble.

[Table: element composition of the Iris data set, giving for Cluster 1, Cluster 2 and Cluster 3 the counts per group (Group 1, Group 2, Group 3), first for the true clusters and then for the best clusters using the cluster ensemble. The counts did not survive extraction.]

The following tables show the clustering results of the three clustering algorithms and the cluster ensemble for the Pen Digits data set and the Vowel data set.
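The scores in these comparisons are Minkowski scores as defined in Section 3.3. As a minimal sketch of that definition (not the authors' code), the following Python computes the score from two label vectors by building the 0/1 co-membership matrices and taking a ratio of Frobenius norms. The function names and example labels are hypothetical, and the diagonal entries are kept in both matrices, which is an assumption the text does not settle.

```python
import numpy as np

def comembership(labels):
    """n x n matrix C with C[i, j] = 1 iff points i and j share a cluster."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def minkowski_score(reference, candidate):
    """MS(T, C) = ||T - C|| / ||T||, using the Frobenius norm."""
    T = comembership(reference)
    C = comembership(candidate)
    return np.linalg.norm(T - C) / np.linalg.norm(T)

def average_score(combined, ensemble):
    """Mean of MS(combined, C_q) over the clusterings C_q in the ensemble."""
    return float(np.mean([minkowski_score(combined, c) for c in ensemble]))

truth = [0, 0, 1, 1]
same_up_to_relabeling = minkowski_score(truth, [1, 1, 0, 0])
fully_mixed = minkowski_score(truth, [0, 1, 0, 1])
```

Because the score is computed on co-membership matrices rather than on raw labels, it does not depend on how the clusters are numbered: a clustering that matches the reference up to relabeling scores zero.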
[Table: Pen Digits data set (10 clusters), with columns K-means, SOM, C-means and Ensemble. The values did not survive extraction.]

[Table: Vowel data set (11 clusters), with columns K-means, SOM, C-means and Ensemble. The values did not survive extraction.]

3.4.2 Yeast gene data set

Not every gene in the data set is classified into a function family. In our experiments we considered the genes in a function family as one cluster and created 6 data sets (with 2, 3, 4, 5, 6 and 7 clusters). Table 1 shows function families of yeast genes and how we constructed the six data sets (C2, C3, C4, C5, C6 and C7) for our cluster ensemble comparison. For example, C3 means that the cluster set has 3 clusters (here ATP synthesis, mitosis and vacuolar protein targeting).

Table 1: Some of the yeast gene function families

Function families: ATP synthesis, mitosis, vacuolar protein targeting, silencing, fatty acid metabolism, meiosis, phospholipid metabolism, TCA cycle, protein processing, DNA repair, protein folding, nuclear protein targeting, signaling, major facilitator superfamily, mRNA splicing, chromatin structure, DNA replication.
[The "# of genes" and "Cluster Sets" columns are only partially legible in the source.]

Table 2 shows the clustering results, including the cluster ensemble, as Minkowski scores (MS) for each cluster set. As clearly indicated by the MS values, the cluster ensemble method significantly improved the quality of the clustering results over the individual clustering algorithms on all six gene data sets. For example, the best individual clustering algorithm for C3 is K-means (MS = 0.890), while the cluster ensemble has MS = [...]. For C5, the best individual clustering algorithm is SOM (MS = .4), and the cluster ensemble reduced it to MS = .059.

Table 2: Clustering results of the yeast gene data sets

[Table: one row per cluster set (C2 to C7), with columns K-means, SOM, C-means and Ensemble. The values did not survive extraction.]

4 Conclusion

In this paper we presented a novel cluster ensemble approach for combining clustering results from multiple clustering algorithms. The experimental results on UCI machine learning data and gene expression data indicate that the cluster ensemble approach can generate better-quality and more robust clusters than the single best clustering algorithm. Cluster ensembles are a new and very promising research area, with many open problems for future research. We plan to expand our ensemble approach to integrate feature selection for clustering very high-dimensional data sets and to add an inference mechanism that automatically infers valid information from the clustering results. We hope to report our findings in the future.

5 References

Azuaje, F. (2002): In Silico Approaches To Microarray-Based Disease Classification and Gene Function Discovery. Annals of Medicine 34(4).

Bellaachia, A., Portnoy, D., Chen, Y. and Elkahloun, A. G. (2002): E-CAST: A Data Mining Algorithm For Gene Expression Data. BIOKDD.

Bloch, K. M. and Arce, G. R. (2002): Nonlinear Correlation For The Analysis Of Gene Expression Data. Proceedings of the 2002 Workshop on Genomic Signal Processing and Statistics, Raleigh, North Carolina.

Boley, D., Gini, M., Gross, R., Han, E., Hastings, K., Karypis, G., Kumar, V., Mobasher, B. and Moore, J. (1999): Partitioning-Based Clustering For Web Document Categorization. Decision Support Systems 27.

Davies, D. and Bouldin, D. (1979): A Cluster Separation Measure. IEEE Transactions on Pattern Analysis and Machine Intelligence 1(2): 224-227.

Hu, X. (2001): Using Rough Sets Theory And Database Operations To Construct A Good Ensemble Of Classifiers For Data Mining Applications. IEEE ICDM.

Jain, A. K., Murty, M. N. and Flynn, P. J. (1999): Data Clustering: A Review. ACM Computing Surveys 31(3).

Richard, M. D. and Lippmann, R. P. (1991): Neural Network Classifiers Estimate Bayesian A Posteriori Probabilities. Neural Computation 3(4).

Xing, E. and Karp, R. (2001): CLIFF: Clustering of High-Dimensional Microarray Data Via Iterative Feature Filtering Using Normalized Cuts. Bioinformatics 17.

Zeng, Y., Tang, J., Garcia-Frias, J. and Gao, G. R. (2002): An Adaptive Meta-Clustering Approach: Combining The Information From Different Clustering Results. CSB2002 IEEE Computer Society Bioinformatics Conference Proceedings.

Dietterich, T. G. (2000): Ensemble Methods in Machine Learning. First International Workshop on Multiple Classifier Systems, Lecture Notes in Computer Science, 1-15.

Ben-Hur, A. and Guyon, I. (2003): Detecting stable clusters using principal component analysis. In Methods in Molecular Biology, M. J. Brownstein and A. Khodursky (eds.). Humana Press.

Zhao, Y. and Karypis, G. (2003): Clustering in Life Sciences. In Functional Genomics: Methods and Protocols, Khodursky, A. and Brownstein, M. (eds). Humana Press.

Berrar, D., Dubitzky, W. and Granzow, M. (2002): A Practical Approach To Microarray Data Analysis. Kluwer Academic Publishers.

Bezdek, J. C. and Pal, S. K. (1992): Fuzzy Models For Pattern Recognition: Methods That Search for Structures in Data. New York, IEEE Press.

Bezdek, J. C. (1981): Pattern Recognition With Fuzzy Objective Function Algorithms. New York, Plenum Press.

Han, J. and Kamber, M. (2001): Data Mining: Concepts And Techniques. San Francisco, Morgan Kaufmann Publishers.

Hartigan, J. A. (1975): Clustering Algorithms. New York, John Wiley & Sons, Inc.

Kohonen, T. (2000): Self-Organizing Maps. New York, Springer.

Ben-Dor, A., Shamir, R. and Yakhini, Z. (1999): Clustering gene expression patterns. Journal of Computational Biology 6.

Dunn, J. (1974): Well Separated Clusters And Optimal Fuzzy Partitions. Journal of Cybernetics 4.

Goodman, L. and Kruskal, W. (1954): Measures of Associations For Cross-Classifications. J. Am. Stat. Assoc. 49.

Hubert, L. and Schultz, J. (1976): Quadratic Assignment As A General Data-Analysis Strategy. British Journal of Mathematical and Statistical Psychology 29.

Karypis, G. and Kumar, V. (1998a): A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs. SIAM Journal on Scientific Computing 20(1).

Karypis, G. and Kumar, V. (1998b): Multilevel k-way Partitioning Scheme for Irregular Graphs. Journal of Parallel and Distributed Computing 48(1).

Strehl, A. and Ghosh, J. (2002): Cluster Ensembles - A Knowledge Reuse Framework For Combining Multiple Partitions. Journal on Machine Learning Research (JMLR) 3.


More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervsed Learnng and Clusterng Supervsed vs. Unsupervsed Learnng Up to now we consdered supervsed learnng scenaro, where we are gven 1. samples 1,, n 2. class labels for all samples 1,, n Ths s also

More information

Outline. Self-Organizing Maps (SOM) US Hebbian Learning, Cntd. The learning rule is Hebbian like:

Outline. Self-Organizing Maps (SOM) US Hebbian Learning, Cntd. The learning rule is Hebbian like: Self-Organzng Maps (SOM) Turgay İBRİKÇİ, PhD. Outlne Introducton Structures of SOM SOM Archtecture Neghborhoods SOM Algorthm Examples Summary 1 2 Unsupervsed Hebban Learnng US Hebban Learnng, Cntd 3 A

More information

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems A Unfed Framework for Semantcs and Feature Based Relevance Feedback n Image Retreval Systems Ye Lu *, Chunhu Hu 2, Xngquan Zhu 3*, HongJang Zhang 2, Qang Yang * School of Computng Scence Smon Fraser Unversty

More information

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1 4/14/011 Outlne Dscrmnatve classfers for mage recognton Wednesday, Aprl 13 Krsten Grauman UT-Austn Last tme: wndow-based generc obect detecton basc ppelne face detecton wth boostng as case study Today:

More information

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints Australan Journal of Basc and Appled Scences, 2(4): 1204-1208, 2008 ISSN 1991-8178 Sum of Lnear and Fractonal Multobjectve Programmng Problem under Fuzzy Rules Constrants 1 2 Sanjay Jan and Kalash Lachhwan

More information

Hybridization of Expectation-Maximization and K-Means Algorithms for Better Clustering Performance

Hybridization of Expectation-Maximization and K-Means Algorithms for Better Clustering Performance BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 16, No 2 Sofa 2016 Prnt ISSN: 1311-9702; Onlne ISSN: 1314-4081 DOI: 10.1515/cat-2016-0017 Hybrdzaton of Expectaton-Maxmzaton

More information

Graph-based Clustering

Graph-based Clustering Graphbased Clusterng Transform the data nto a graph representaton ertces are the data ponts to be clustered Edges are eghted based on smlarty beteen data ponts Graph parttonng Þ Each connected component

More information

A fast algorithm for color image segmentation

A fast algorithm for color image segmentation Unersty of Wollongong Research Onlne Faculty of Informatcs - Papers (Arche) Faculty of Engneerng and Informaton Scences 006 A fast algorthm for color mage segmentaton L. Dong Unersty of Wollongong, lju@uow.edu.au

More information

Clustering algorithms and validity measures

Clustering algorithms and validity measures Clusterng algorthms and valdty measures M. Hald, Y. Batstas, M. Vazrganns Department of Informatcs Athens Unversty of Economcs & Busness Emal: {mhal, yanns, mvazrg}@aueb.gr Abstract Clusterng ams at dscoverng

More information

Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information

Determining Fuzzy Sets for Quantitative Attributes in Data Mining Problems

Determining Fuzzy Sets for Quantitative Attributes in Data Mining Problems Determnng Fuzzy Sets for Quanttatve Attrbutes n Data Mnng Problems ATTILA GYENESEI Turku Centre for Computer Scence (TUCS) Unversty of Turku, Department of Computer Scence Lemmnkäsenkatu 4A, FIN-5 Turku

More information

Fuzzy Filtering Algorithms for Image Processing: Performance Evaluation of Various Approaches

Fuzzy Filtering Algorithms for Image Processing: Performance Evaluation of Various Approaches Proceedngs of the Internatonal Conference on Cognton and Recognton Fuzzy Flterng Algorthms for Image Processng: Performance Evaluaton of Varous Approaches Rajoo Pandey and Umesh Ghanekar Department of

More information

Wishing you all a Total Quality New Year!

Wishing you all a Total Quality New Year! Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma

More information

A Simple Methodology for Database Clustering. Hao Tang 12 Guangdong University of Technology, Guangdong, , China

A Simple Methodology for Database Clustering. Hao Tang 12 Guangdong University of Technology, Guangdong, , China for Database Clusterng Guangdong Unversty of Technology, Guangdong, 0503, Chna E-mal: 6085@qq.com Me Zhang Guangdong Unversty of Technology, Guangdong, 0503, Chna E-mal:64605455@qq.com Database clusterng

More information

Meta-heuristics for Multidimensional Knapsack Problems

Meta-heuristics for Multidimensional Knapsack Problems 2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng

More information

A New Approach For the Ranking of Fuzzy Sets With Different Heights

A New Approach For the Ranking of Fuzzy Sets With Different Heights New pproach For the ankng of Fuzzy Sets Wth Dfferent Heghts Pushpnder Sngh School of Mathematcs Computer pplcatons Thapar Unversty, Patala-7 00 Inda pushpndersnl@gmalcom STCT ankng of fuzzy sets plays

More information

Incremental Learning with Support Vector Machines and Fuzzy Set Theory

Incremental Learning with Support Vector Machines and Fuzzy Set Theory The 25th Workshop on Combnatoral Mathematcs and Computaton Theory Incremental Learnng wth Support Vector Machnes and Fuzzy Set Theory Yu-Mng Chuang 1 and Cha-Hwa Ln 2* 1 Department of Computer Scence and

More information

Edge Detection in Noisy Images Using the Support Vector Machines

Edge Detection in Noisy Images Using the Support Vector Machines Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona

More information

Solving two-person zero-sum game by Matlab

Solving two-person zero-sum game by Matlab Appled Mechancs and Materals Onlne: 2011-02-02 ISSN: 1662-7482, Vols. 50-51, pp 262-265 do:10.4028/www.scentfc.net/amm.50-51.262 2011 Trans Tech Publcatons, Swtzerland Solvng two-person zero-sum game by

More information

Module Management Tool in Software Development Organizations

Module Management Tool in Software Development Organizations Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,

More information

Feature Selection as an Improving Step for Decision Tree Construction

Feature Selection as an Improving Step for Decision Tree Construction 2009 Internatonal Conference on Machne Learnng and Computng IPCSIT vol.3 (2011) (2011) IACSIT Press, Sngapore Feature Selecton as an Improvng Step for Decson Tree Constructon Mahd Esmael 1, Fazekas Gabor

More information

X- Chart Using ANOM Approach

X- Chart Using ANOM Approach ISSN 1684-8403 Journal of Statstcs Volume 17, 010, pp. 3-3 Abstract X- Chart Usng ANOM Approach Gullapall Chakravarth 1 and Chaluvad Venkateswara Rao Control lmts for ndvdual measurements (X) chart are

More information

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and

More information

Clustering Algorithm of Similarity Segmentation based on Point Sorting

Clustering Algorithm of Similarity Segmentation based on Point Sorting Internatonal onference on Logstcs Engneerng, Management and omputer Scence (LEMS 2015) lusterng Algorthm of Smlarty Segmentaton based on Pont Sortng Hanbng L, Yan Wang*, Lan Huang, Mngda L, Yng Sun, Hanyuan

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science EECS 730 Introducton to Bonformatcs Sequence Algnment Luke Huan Electrcal Engneerng and Computer Scence http://people.eecs.ku.edu/~huan/ HMM Π s a set of states Transton Probabltes a kl Pr( l 1 k Probablty

More information

CS 534: Computer Vision Model Fitting

CS 534: Computer Vision Model Fitting CS 534: Computer Vson Model Fttng Sprng 004 Ahmed Elgammal Dept of Computer Scence CS 534 Model Fttng - 1 Outlnes Model fttng s mportant Least-squares fttng Maxmum lkelhood estmaton MAP estmaton Robust

More information

K-means and Hierarchical Clustering

K-means and Hierarchical Clustering Note to other teachers and users of these sldes. Andrew would be delghted f you found ths source materal useful n gvng your own lectures. Feel free to use these sldes verbatm, or to modfy them to ft your

More information

BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION

BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION SHI-LIANG SUN, HONG-LEI SHI Department of Computer Scence and Technology, East Chna Normal Unversty 500 Dongchuan Road, Shangha 200241, P. R. Chna E-MAIL: slsun@cs.ecnu.edu.cn,

More information

Object-Based Techniques for Image Retrieval

Object-Based Techniques for Image Retrieval 54 Zhang, Gao, & Luo Chapter VII Object-Based Technques for Image Retreval Y. J. Zhang, Tsnghua Unversty, Chna Y. Y. Gao, Tsnghua Unversty, Chna Y. Luo, Tsnghua Unversty, Chna ABSTRACT To overcome the

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

Journal of Chemical and Pharmaceutical Research, 2014, 6(6): Research Article. A selective ensemble classification method on microarray data

Journal of Chemical and Pharmaceutical Research, 2014, 6(6): Research Article. A selective ensemble classification method on microarray data Avalable onlne www.ocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(6):2860-2866 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 A selectve ensemble classfcaton method on mcroarray

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

The Codesign Challenge

The Codesign Challenge ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

An Internal Clustering Validation Index for Boolean Data

An Internal Clustering Validation Index for Boolean Data BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 16, No 6 Specal ssue wth selecton of extended papers from 6th Internatonal Conference on Logstc, Informatcs and Servce Scence

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

Learning-Based Top-N Selection Query Evaluation over Relational Databases

Learning-Based Top-N Selection Query Evaluation over Relational Databases Learnng-Based Top-N Selecton Query Evaluaton over Relatonal Databases Lang Zhu *, Wey Meng ** * School of Mathematcs and Computer Scence, Hebe Unversty, Baodng, Hebe 071002, Chna, zhu@mal.hbu.edu.cn **

More information

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

Fusion Performance Model for Distributed Tracking and Classification

Fusion Performance Model for Distributed Tracking and Classification Fuson Performance Model for Dstrbuted rackng and Classfcaton K.C. Chang and Yng Song Dept. of SEOR, School of I&E George Mason Unversty FAIRFAX, VA kchang@gmu.edu Martn Lggns Verdan Systems Dvson, Inc.

More information

Implementation Naïve Bayes Algorithm for Student Classification Based on Graduation Status

Implementation Naïve Bayes Algorithm for Student Classification Based on Graduation Status Internatonal Journal of Appled Busness and Informaton Systems ISSN: 2597-8993 Vol 1, No 2, September 2017, pp. 6-12 6 Implementaton Naïve Bayes Algorthm for Student Classfcaton Based on Graduaton Status

More information

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr) Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute

More information

Relevance Assignment and Fusion of Multiple Learning Methods Applied to Remote Sensing Image Analysis

Relevance Assignment and Fusion of Multiple Learning Methods Applied to Remote Sensing Image Analysis Assgnment and Fuson of Multple Learnng Methods Appled to Remote Sensng Image Analyss Peter Bajcsy, We-Wen Feng and Praveen Kumar Natonal Center for Supercomputng Applcaton (NCSA), Unversty of Illnos at

More information

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between

More information

Support Vector Machines

Support Vector Machines Support Vector Machnes Decson surface s a hyperplane (lne n 2D) n feature space (smlar to the Perceptron) Arguably, the most mportant recent dscovery n machne learnng In a nutshell: map the data to a predetermned

More information

Backpropagation: In Search of Performance Parameters

Backpropagation: In Search of Performance Parameters Bacpropagaton: In Search of Performance Parameters ANIL KUMAR ENUMULAPALLY, LINGGUO BU, and KHOSROW KAIKHAH, Ph.D. Computer Scence Department Texas State Unversty-San Marcos San Marcos, TX-78666 USA ae049@txstate.edu,

More information

Available online at Available online at Advanced in Control Engineering and Information Science

Available online at   Available online at   Advanced in Control Engineering and Information Science Avalable onlne at wwwscencedrectcom Avalable onlne at wwwscencedrectcom Proceda Proceda Engneerng Engneerng 00 (2011) 15000 000 (2011) 1642 1646 Proceda Engneerng wwwelsevercom/locate/proceda Advanced

More information

Generating Fuzzy Term Sets for Software Project Attributes using and Real Coded Genetic Algorithms

Generating Fuzzy Term Sets for Software Project Attributes using and Real Coded Genetic Algorithms Generatng Fuzzy Ter Sets for Software Proect Attrbutes usng Fuzzy C-Means C and Real Coded Genetc Algorths Al Idr, Ph.D., ENSIAS, Rabat Alan Abran, Ph.D., ETS, Montreal Azeddne Zah, FST, Fes Internatonal

More information

UB at GeoCLEF Department of Geography Abstract

UB at GeoCLEF Department of Geography   Abstract UB at GeoCLEF 2006 Mguel E. Ruz (1), Stuart Shapro (2), June Abbas (1), Slva B. Southwck (1) and Davd Mark (3) State Unversty of New York at Buffalo (1) Department of Lbrary and Informaton Studes (2) Department

More information

Intelligent Information Acquisition for Improved Clustering

Intelligent Information Acquisition for Improved Clustering Intellgent Informaton Acquston for Improved Clusterng Duy Vu Unversty of Texas at Austn duyvu@cs.utexas.edu Mkhal Blenko Mcrosoft Research mblenko@mcrosoft.com Prem Melvlle IBM T.J. Watson Research Center

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information

An Efficient Genetic Algorithm with Fuzzy c-means Clustering for Traveling Salesman Problem

An Efficient Genetic Algorithm with Fuzzy c-means Clustering for Traveling Salesman Problem An Effcent Genetc Algorthm wth Fuzzy c-means Clusterng for Travelng Salesman Problem Jong-Won Yoon and Sung-Bae Cho Dept. of Computer Scence Yonse Unversty Seoul, Korea jwyoon@sclab.yonse.ac.r, sbcho@cs.yonse.ac.r

More information

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET 1 BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET TZU-CHENG CHUANG School of Electrcal and Computer Engneerng, Purdue Unversty, West Lafayette, Indana 47907 SAUL B. GELFAND School

More information

KOHONEN'S SELF ORGANIZING NETWORKS WITH "CONSCIENCE"

KOHONEN'S SELF ORGANIZING NETWORKS WITH CONSCIENCE Kohonen's Self Organzng Maps and ther use n Interpretaton, Dr. M. Turhan (Tury) Taner, Rock Sold Images Page: 1 KOHONEN'S SELF ORGANIZING NETWORKS WITH "CONSCIENCE" By: Dr. M. Turhan (Tury) Taner, Rock

More information

Keywords - Wep page classification; bag of words model; topic model; hierarchical classification; Support Vector Machines

Keywords - Wep page classification; bag of words model; topic model; hierarchical classification; Support Vector Machines (IJCSIS) Internatonal Journal of Computer Scence and Informaton Securty, Herarchcal Web Page Classfcaton Based on a Topc Model and Neghborng Pages Integraton Wongkot Srura Phayung Meesad Choochart Haruechayasak

More information

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms Course Introducton Course Topcs Exams, abs, Proects A quc loo at a few algorthms 1 Advanced Data Structures and Algorthms Descrpton: We are gong to dscuss algorthm complexty analyss, algorthm desgn technques

More information

A Deflected Grid-based Algorithm for Clustering Analysis

A Deflected Grid-based Algorithm for Clustering Analysis A Deflected Grd-based Algorthm for Clusterng Analyss NANCY P. LIN, CHUNG-I CHANG, HAO-EN CHUEH, HUNG-JEN CHEN, WEI-HUA HAO Department of Computer Scence and Informaton Engneerng Tamkang Unversty 5 Yng-chuan

More information

Private Information Retrieval (PIR)

Private Information Retrieval (PIR) 2 Levente Buttyán Problem formulaton Alce wants to obtan nformaton from a database, but she does not want the database to learn whch nformaton she wanted e.g., Alce s an nvestor queryng a stock-market

More information

A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION

A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION 1 THE PUBLISHING HOUSE PROCEEDINGS OF THE ROMANIAN ACADEMY, Seres A, OF THE ROMANIAN ACADEMY Volume 4, Number 2/2003, pp.000-000 A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION Tudor BARBU Insttute

More information

Network Intrusion Detection Based on PSO-SVM

Network Intrusion Detection Based on PSO-SVM TELKOMNIKA Indonesan Journal of Electrcal Engneerng Vol.1, No., February 014, pp. 150 ~ 1508 DOI: http://dx.do.org/10.11591/telkomnka.v1.386 150 Network Intruson Detecton Based on PSO-SVM Changsheng Xang*

More information

Kent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming

Kent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming CS 4/560 Desgn and Analyss of Algorthms Kent State Unversty Dept. of Math & Computer Scence LECT-6 Dynamc Programmng 2 Dynamc Programmng Dynamc Programmng, lke the dvde-and-conquer method, solves problems

More information

Maximum Variance Combined with Adaptive Genetic Algorithm for Infrared Image Segmentation

Maximum Variance Combined with Adaptive Genetic Algorithm for Infrared Image Segmentation Internatonal Conference on Logstcs Engneerng, Management and Computer Scence (LEMCS 5) Maxmum Varance Combned wth Adaptve Genetc Algorthm for Infrared Image Segmentaton Huxuan Fu College of Automaton Harbn

More information

An Evolvable Clustering Based Algorithm to Learn Distance Function for Supervised Environment

An Evolvable Clustering Based Algorithm to Learn Distance Function for Supervised Environment IJCSI Internatonal Journal of Computer Scence Issues, Vol. 7, Issue 5, September 2010 ISSN (Onlne): 1694-0814 www.ijcsi.org 374 An Evolvable Clusterng Based Algorthm to Learn Dstance Functon for Supervsed

More information

INTELLECT SENSING OF NEURAL NETWORK THAT TRAINED TO CLASSIFY COMPLEX SIGNALS. Reznik A. Galinskaya A.

INTELLECT SENSING OF NEURAL NETWORK THAT TRAINED TO CLASSIFY COMPLEX SIGNALS. Reznik A. Galinskaya A. Internatonal Journal "Informaton heores & Applcatons" Vol.10 173 INELLEC SENSING OF NEURAL NEWORK HA RAINED O CLASSIFY COMPLEX SIGNALS Reznk A. Galnskaya A. Abstract: An expermental comparson of nformaton

More information

SCALABLE AND VISUALIZATION-ORIENTED CLUSTERING FOR EXPLORATORY SPATIAL ANALYSIS

SCALABLE AND VISUALIZATION-ORIENTED CLUSTERING FOR EXPLORATORY SPATIAL ANALYSIS SCALABLE AND VISUALIZATION-ORIENTED CLUSTERING FOR EXPLORATORY SPATIAL ANALYSIS J.H.Guan, F.B.Zhu, F.L.Ban a School of Computer, Spatal Informaton & Dgtal Engneerng Center, Wuhan Unversty, Wuhan, 430079,

More information

On Supporting Identification in a Hand-Based Biometric Framework

On Supporting Identification in a Hand-Based Biometric Framework On Supportng Identfcaton n a Hand-Based Bometrc Framework Pe-Fang Guo 1, Prabr Bhattacharya 2, and Nawwaf Kharma 1 1 Electrcal & Computer Engneerng, Concorda Unversty, 1455 de Masonneuve Blvd., Montreal,

More information

NIVA: A Robust Cluster Validity

NIVA: A Robust Cluster Validity 2th WSEAS Internatonal Conference on COMMUNICATIONS, Heralon, Greece, July 23-25, 2008 NIVA: A Robust Cluster Valdty ERENDIRA RENDÓN, RENE GARCIA, ITZEL ABUNDEZ, CITLALIH GUTIERREZ, EDUARDO GASCA, FEDERICO

More information

Comparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments

Comparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments Comparson of Heurstcs for Schedulng Independent Tasks on Heterogeneous Dstrbuted Envronments Hesam Izakan¹, Ath Abraham², Senor Member, IEEE, Václav Snášel³ ¹ Islamc Azad Unversty, Ramsar Branch, Ramsar,

More information

K-means Optimization Clustering Algorithm Based on Hybrid PSO/GA Optimization and CS validity index

K-means Optimization Clustering Algorithm Based on Hybrid PSO/GA Optimization and CS validity index Orgnal Artcle Prnt ISSN: 3-6379 Onlne ISSN: 3-595X DOI: 0.7354/jss/07/33 K-means Optmzaton Clusterng Algorthm Based on Hybrd PSO/GA Optmzaton and CS valdty ndex K Jahanbn *, F Rahmanan, H Rezae 3, Y Farhang

More information

Optimal Design of Nonlinear Fuzzy Model by Means of Independent Fuzzy Scatter Partition

Optimal Design of Nonlinear Fuzzy Model by Means of Independent Fuzzy Scatter Partition Optmal Desgn of onlnear Fuzzy Model by Means of Independent Fuzzy Scatter Partton Keon-Jun Park, Hyung-Kl Kang and Yong-Kab Km *, Department of Informaton and Communcaton Engneerng, Wonkwang Unversty,

More information

An Image Fusion Approach Based on Segmentation Region

An Image Fusion Approach Based on Segmentation Region Rong Wang, L-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu An Image Fuson Approach Based On Segmentaton Regon An Image Fuson Approach Based on Segmentaton Regon Rong Wang, L-Qun Gao, Shu Yang 3, Yu-Hua

More information

Fuzzy C-Means Initialized by Fixed Threshold Clustering for Improving Image Retrieval

Fuzzy C-Means Initialized by Fixed Threshold Clustering for Improving Image Retrieval Fuzzy -Means Intalzed by Fxed Threshold lusterng for Improvng Image Retreval NAWARA HANSIRI, SIRIPORN SUPRATID,HOM KIMPAN 3 Faculty of Informaton Technology Rangst Unversty Muang-Ake, Paholyotn Road, Patumtan,

More information

SVM-based Learning for Multiple Model Estimation

SVM-based Learning for Multiple Model Estimation SVM-based Learnng for Multple Model Estmaton Vladmr Cherkassky and Yunqan Ma Department of Electrcal and Computer Engneerng Unversty of Mnnesota Mnneapols, MN 55455 {cherkass,myq}@ece.umn.edu Abstract:

More information

Fuzzy Modeling of the Complexity vs. Accuracy Trade-off in a Sequential Two-Stage Multi-Classifier System

Fuzzy Modeling of the Complexity vs. Accuracy Trade-off in a Sequential Two-Stage Multi-Classifier System Fuzzy Modelng of the Complexty vs. Accuracy Trade-off n a Sequental Two-Stage Mult-Classfer System MARK LAST 1 Department of Informaton Systems Engneerng Ben-Guron Unversty of the Negev Beer-Sheva 84105

More information

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z.

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z. TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS Muradalyev AZ Azerbajan Scentfc-Research and Desgn-Prospectng Insttute of Energetc AZ1012, Ave HZardab-94 E-mal:aydn_murad@yahoocom Importance of

More information

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique //00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy

More information

Review of approximation techniques

Review of approximation techniques CHAPTER 2 Revew of appromaton technques 2. Introducton Optmzaton problems n engneerng desgn are characterzed by the followng assocated features: the objectve functon and constrants are mplct functons evaluated

More information