TOWARDS FUZZY-HARD CLUSTERING MAPPING PROCESSES. MINYAR SASSI National Engineering School of Tunis BP. 37, Le Belvédère, 1002 Tunis, Tunisia


Although the validation step can appear crucial in the case of clustering adopting fuzzy approaches, the problem of the validity of the partition obtained by approaches adopting hard ones has not been tackled. To address this problem, we propose in this paper fuzzy-hard mapping processes for clustering, benefitting from those adopting the fuzzy case. These mapping processes concern: (1) local and global clustering evaluation measures, the first for detecting the worst clusters in order to merge or split them, the second for evaluating the partition obtained at each iteration; (2) merging and splitting processes taking the proposed measures into account; and (3) automatic clustering algorithms implementing these new concepts.

1. Introduction

The classification problem is the process of grouping a set of data into groups so that data within a group have high similarity but are very dissimilar to data in other groups. This problem comes in two variants: supervised [1,2] and unsupervised [3] approaches. In the first, the possible groups are known and some data are already classified, serving as a training set; the problem consists in assigning each datum to the most suitable group with the help of those already labeled. In unsupervised classification, also called clustering, the possible groups (or clusters) are not known in advance and the available data are not classified. The goal is then to place in the same cluster the data considered similar. Since clustering is an unsupervised method, some kind of validation of the quality of the clustering result is needed. This quality is judged in general on the basis of two contradictory criteria [4]. The first requires that the generated clusters be as distinct from each other as possible, and the second requires that each cluster be as homogeneous as possible. The data are grouped into clusters based on a number of different approaches.
Partition-based clustering and hierarchical clustering are two of the main techniques. Hierarchical clustering techniques [5] generate a nested series of partitions based on a criterion which measures the similarity between clusters or the separability of a cluster, for merging or splitting clusters. We can mention the BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies) algorithm [6] and the CURE (Clustering Using REpresentatives) algorithm [5]. Partition-based clustering often starts from an initial partition and optimizes (usually locally) a clustering criterion. A widely accepted clustering scheme subdivides these techniques into two main groups: hard and fuzzy [3]. The difference between them is mainly the degree of membership of each datum in the clusters. During the construction of the clusters, in the hard case each datum belongs to only one cluster, with a unitary degree of membership, whereas in the fuzzy case each datum can belong to several clusters with different degrees of membership. We mention algorithms such as k-means [7] and Fuzzy C-Means (FCM) [8]. In this work we limit ourselves to partition-based clustering methods.

Most basic clustering algorithms assume that the number of clusters is a user-defined parameter, which is difficult to know in advance in real applications; it is therefore difficult to guarantee that the clustering result reflects the natural cluster structure of the dataset. Several works have tackled this problem [9,10,11]. When confronted with the problem of determining the number of clusters, we are led to make assumptions about it. To avoid requiring the user to choose this number, a solution consists in iterating until an optimal number of clusters is obtained. Each iteration tries to minimize (or maximize) an objective function called a validity index [8,12,13,14], which measures the clustering quality in order to choose the optimal partition among all those obtained with the various plausible values of the required number of clusters. In [11,15], clustering approaches adopting the fuzzy concept were presented and proven; they are based on merging and splitting processes. Although these processes can appear crucial in the fuzzy case, the problem of the validity of the partition obtained by automatic hard clustering methods was not tackled.
To address this problem, we propose in this article mapping rules from fuzzy partition-based clustering to hard partition-based clustering, benefitting from the approaches adopting the fuzzy case. These mapping rules concern:
- the definition of local and global measures for clustering evaluation: the first for detecting the worst clusters in order to merge or split them, the second for evaluating the obtained partition at each iteration;
- the binarization of the merging and splitting processes;
- the modeling and implementation of automatic hard clustering algorithms taking the new concepts into account.

The rest of the article is organized as follows. In Section 2, we discuss background related to fuzzy and hard clustering techniques. In Section 3, we present our motivation. In Section 4, we present the fuzzy-hard mapping processes. Section 5 gives the experimentation and, finally, Section 6 concludes the paper and gives some future work.

2. Background

Clustering methods group homogeneous data in clusters by maximizing the similarity of the objects in the same cluster while minimizing it for objects in different clusters. To make it easier for readers to understand the ideas of the clustering techniques, we have tried to unify the notation used for them. The following definitions are assumed: X \in R^{N \times M} denotes a set of data items representing N objects x_i \in R^M; C_j denotes the j-th cluster, and c_opt denotes the optimal number of clusters found. In this section, we present basic concepts related to fuzzy and hard clustering.

2.1. Fuzzy Clustering

Fuzzy clustering methods allow objects to belong to several clusters simultaneously, with different degrees of membership [16]. The dataset X is thus partitioned into c fuzzy subsets [17]. The result is the partition matrix U = [\mu_{ij}] for i = 1, ..., N and j = 1, ..., c. Several studies have been carried out on the automatic determination of the number of clusters and on the quality evaluation [3,5,11] of the obtained partitions. Bezdek [8] introduced a family of algorithms known under the name Fuzzy C-Means (FCM). The objective function J_m minimized by FCM is defined as follows:

J_m(U, V) = \sum_{i=1}^{N} \sum_{j=1}^{c} \mu_{ij}^{m} \|x_i - c_j\|^2

U and V can be calculated as:

\mu_{ij} = 1 / \sum_{k=1}^{c} ( \|x_i - c_j\| / \|x_i - c_k\| )^{2/(m-1)},    c_j = \sum_{i=1}^{N} \mu_{ij}^{m} x_i / \sum_{i=1}^{N} \mu_{ij}^{m}

where \mu_{ij} is the membership value of the i-th example, x_i, in the j-th cluster, and m > 1 is the fuzzifier. The FCM clustering algorithm is as follows:

Algorithm 1: FCM algorithm
Inputs: the dataset X = {x_i : i = 1, ..., N} \subset R^M, the number of clusters c, the fuzzifier parameter m and the Euclidean distance function.
Outputs: the cluster centers c_j (j = 1, ..., c), the membership matrix U, and the elements of each cluster, i.e., all the x_i such that \mu_{ij} > \mu_{ik} for all k \neq j.
Step 1: Input the number of clusters c, the fuzzifier m and the distance function.
Step 2: Initialize the cluster centers c_j^0 (j = 1, ..., c).
Step 3: Calculate \mu_{ij} (i = 1, ..., N; j = 1, ..., c).
Step 4: Calculate c_j (j = 1, ..., c).
Step 5: If max_j \|c_j - c_j^0\| \leq \varepsilon then stop; else let c_j^0 = c_j (j = 1, ..., c) and go to Step 3.

A simple way to prevent bad clustering due to inadequate cluster centers is to modify the basic FCM algorithm. We start with a large (resp. small) number of uniformly distributed seeds in the bounded M-dimensional feature space, and we decrease (resp. increase) this number considerably by merging (resp. splitting) the worst clusters until the quality measure stops increasing (resp. decreasing). Consequently, compactness and separation are two reasonable measures for evaluating the quality of the obtained clusters. The automatic fuzzy clustering algorithm is based on a global measure representing separation-compactness for clustering quality evaluation. Given a cluster scheme C = {c_1, c_2, ..., c_c} for a dataset X = {x_1, x_2, ..., x_N}, let C' = {c_j : c_j \in C and c_j is not a singleton, j = 1, ..., m_1, where m_1 = |C'|}; the global separation-compactness SC of the cluster scheme C is given by:
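The FCM loop above can be sketched as follows. This is a minimal NumPy version under the paper's update equations; initializing the membership matrix rather than the centers, and the helper name `fcm`, are our implementation choices, not the paper's.

```python
import numpy as np

def fcm(X, c, m=2.0, eps=1e-5, max_iter=300, rng=None):
    """Fuzzy C-Means sketch: returns centers V (c x M) and memberships U (N x c)."""
    rng = np.random.default_rng(rng)
    N = len(X)
    # Initialization: random memberships whose rows sum to 1
    U = rng.random((N, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        # Step 4: centers as membership-weighted means of the data
        W = U ** m
        V = (W.T @ X) / W.sum(axis=0)[:, None]
        # Step 3: memberships from distances to centers,
        # mu_ij = d_ij^(-2/(m-1)) / sum_k d_ik^(-2/(m-1))
        d = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                      # guard against division by zero
        U_new = 1.0 / (d ** (2.0 / (m - 1.0)))
        U_new /= U_new.sum(axis=1, keepdims=True)
        if np.abs(U_new - U).max() < eps:          # Step 5: convergence test
            U = U_new
            break
        U = U_new
    return V, U
```

With m = 2 the membership update reduces to normalized inverse squared distances, which is the most common choice of fuzzifier.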

SC = (1 / m_1) \sum_{j=1}^{m_1} [ \min_{k \neq j} \|v_j - v_k\| \cdot \sum_{x \in C_j} \mu_j(x) / \sum_{x \in C_j} \mu_j(x) \|x - v_j\|^2 ]

where \mu_j(x) is the membership value of x in the j-th cluster, v_j is the center of cluster c_j, c is the number of clusters and c \leq N. Based on these measures, the merging (resp. splitting) clustering algorithm is as follows:

Algorithm 2: Iterative clustering process
Inputs: a dataset X = {x_i : i = 1, ..., N} \subset R^M and the Euclidean distance function.
Outputs: C = {c_1, c_2, ..., c_opt}, the optimal cluster centers.
Step 1: Initialize the parameters related to FCM: c = c_max (resp. c = c_min), c_min = 2.
Step 2: Apply the basic FCM algorithm to update the membership matrix U and the cluster scheme.
Step 3: Test for convergence; if not converged, go to Step 2.
Step 4: Calculate the SC measure.
Step 5: Repeat: perform the merging (resp. splitting) process to get candidate cluster centers, decreasing c \leftarrow c - 1 (resp. increasing c \leftarrow c + 1); perform the basic FCM algorithm with parameter c to find the cluster centers; calculate the SC measure for the new clusters, denoted SC'; if SC' > SC then SC = SC' and c_opt = c. Until c = c_min (resp. c = c_max).

2.1.1. Fuzzy Merging Process

The fuzzy merging process generally used in earlier studies involves some similarity or compatibility measure to choose the most similar or compatible pair of clusters to merge into one. In our merging process, we instead choose the worst cluster and delete it. Each element included in this cluster is then placed into its nearest remaining cluster, and the centers of all clusters are adjusted. Our merging process may therefore affect multiple clusters, which we consider more practical. How do we choose the worst cluster? We again use the measures of separation and compactness to evaluate individual clusters (except singletons). Given a cluster scheme C = {c_1, c_2, ..., c_c} for a dataset X = {x_1, x_2, ..., x_N}, for each c_j \in C, if c_j is not a singleton, the local separation-compactness of c_j, denoted sc_j, is given by:

sc_j = \min_{k \neq j} \|c_j - c_k\| \cdot \sum_{x \in C_j} u_j(x) / \sum_{x \in C_j} u_j(x) \|x - c_j\|^2

where u_j(x) is the membership value of x in the j-th cluster, c_j is the center of cluster c_j, c is the number of clusters and c \leq N. A small value of sc_j indicates the worst cluster, i.e., the one to be merged.

2.1.2. Fuzzy Splitting Process

In this process, we operate by splitting the worst cluster at each stage while testing the number of clusters c from c_min to c_max. The global separation-compactness measure is used. The general strategy adopted for the new algorithm is as follows: at each step, we identify the worst cluster and split it into two clusters while keeping the other c - 1 clusters, thus increasing the value of c by one. Our major contribution lies in the definition of the criterion for identifying the worst cluster. To identify it, a score function S(j) is associated with each cluster j, as follows:

S(j) = \sum_{i=1}^{N} \mu_{ij} / (number of data vectors in cluster j)

In general, when S(j) is small, cluster j tends to contain a large number of data vectors with low membership values. The lower the membership value, the farther the object is from its cluster center. Therefore, a small S(j) means that cluster j is large in volume and sparse in distribution. This is why the cluster corresponding to the minimum of S(j) is chosen as the candidate to split when the value of c is increased. Conversely, a larger S(j) tends to mean that cluster j has a smaller number of elements and exerts a strong attraction on them.
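The two worst-cluster criteria just described can be sketched as follows. This is our reconstruction of the local sc_j measure and of the split score S(j); the helper names and the one-hot test of non-singleton clusters are ours, not the paper's.

```python
import numpy as np

def local_sc(X, V, U, labels, j):
    """Local separation-compactness sc_j: nearest-center distance times the
    ratio of summed memberships to membership-weighted scatter of cluster j."""
    sep = min(np.linalg.norm(V[j] - V[k]) for k in range(len(V)) if k != j)
    members = labels == j
    u = U[members, j]
    scatter = np.sum(u * np.linalg.norm(X[members] - V[j], axis=1) ** 2)
    return sep * u.sum() / scatter

def split_score(U, labels, j):
    """Score S(j): mean membership of the data vectors assigned to cluster j.
    The cluster minimizing S(j) is the candidate for splitting."""
    return U[labels == j, j].sum() / np.count_nonzero(labels == j)

def worst_cluster_to_merge(X, V, U, labels):
    """Cluster with the smallest sc_j (singletons excluded) is merged away."""
    candidates = [j for j in range(len(V)) if np.count_nonzero(labels == j) > 1]
    return min(candidates, key=lambda j: local_sc(X, V, U, labels, j))
```

A cluster that is both spread out (large scatter) and close to a neighboring center gets a low sc_j, which matches the text's reading of the measure.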

2.2. Hard Clustering

In hard clustering, the data can be gathered in a table with N rows and M columns. If the data belong to a set of clusters, it is possible to associate with this data table a membership table whose values, 1 or 0, respectively indicate membership or non-membership in cluster C_j, with j = 1, 2, ..., c. One of the best-known hard clustering algorithms is k-means. Its goal is to minimize the distance from each datum to the center of the cluster to which it belongs. The algorithm corresponds to the search for the centers c_j minimizing the following criterion [19]:

E = \sum_{j=1}^{c} \sum_{i=1}^{N} \delta_{ij} \|x_i - c_j\|^2,  where \delta_{ij} = 1 if x_i \in C_j and 0 otherwise.

The k-means algorithm consists in choosing initial centers and improving the obtained partition iteratively in three steps.

Algorithm 3: k-means algorithm
Inputs: a dataset X = {x_i : i = 1, ..., N} \subset R^M and the Euclidean distance function.
Output: the cluster scheme C = {c_j \in R^M : j = 1, ..., c}.
Step 1: Initialize the c centers with random values.
Step 2: Assign each datum to the nearest cluster: x_i \in C_j if \|x_i - c_j\| \leq \|x_i - c_l\| for i = 1, ..., N and all l \neq j.
Step 3: Recalculate the position of each new center: c_j^* = (1/N_j) \sum_{x_i \in C_j} x_i, where N_j is the cardinality of cluster C_j, j = 1, ..., c.

Steps 2 and 3 are repeated until convergence, i.e., until the centers no longer change. The result of this algorithm does not depend on the input order of the data. It has linear complexity [19], adapts to large datasets, and requires fixing the number of clusters,

which influences the output. The result is sensitive to the starting situation, both to the number of clusters and to the initial center positions. To avoid requiring the user to choose this number, a solution again consists in iterating until an optimal number of clusters is obtained, each iteration optimizing a validity index [8,12,13,14] that measures the clustering quality. In this subsection, we first present some validity indices; we then give new definitions of separation and compactness quality measures applicable in the context of hard partition-based clustering.

The Mean Squared Error (MSE) [20] of an estimator is one of many ways to quantify the amount by which an estimator differs from the true value of the quantity being estimated. The MSE measures the average of the square of the error, the error being the amount by which the estimator differs from the quantity to be estimated; the difference occurs because of randomness or because the estimator does not account for information that could produce a more accurate estimate. In clustering, this measure corresponds to compactness:

MSE = (1/N) \sum_{j=1}^{c} \sum_{i=1}^{N} \delta_{ij} \|x_i - c_j\|^2

In the case of hard partition-based clustering, the Dunn index [21] takes into account both the compactness and the separability of the clusters: the value of this index is all the higher as the clusters are compact and well separated. Note that the complexity of the Dunn index becomes prohibitive for large datasets; it is consequently seldom used.

I_Dunn = \min_{j \neq k} D_min(c_j, c_k) / \max_{j} D_max(c_j)

where D_min(c_j, c_k) is the minimal distance separating a datum of cluster c_j from a datum of cluster c_k, and D_max(c_j) is the maximal distance separating two data of cluster c_j:

D_min(c_j, c_k) = \min \{ \|x - y\| : x \in c_j and y \in c_k \},    D_max(c_j) = \max \{ \|x - y\| : (x, y) \in c_j \times c_j \}
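The two indices just defined can be sketched in NumPy as follows (a minimal sketch; the function names are ours):

```python
import numpy as np

def mse(X, centers, labels):
    """Mean Squared Error: average squared distance of each datum to the
    center of the cluster it is assigned to (a compactness measure)."""
    return np.mean(np.linalg.norm(X - centers[labels], axis=1) ** 2)

def dunn(X, labels):
    """Dunn index: smallest between-cluster distance divided by the largest
    cluster diameter. It needs all pairwise distances, hence the O(N^2)
    cost noted in the text; higher values mean compact, well-separated clusters."""
    ids = np.unique(labels)
    d_min = min(np.linalg.norm(X[labels == a][:, None] - X[labels == b][None, :],
                               axis=2).min()
                for a in ids for b in ids if a < b)
    d_max = max(np.linalg.norm(X[labels == a][:, None] - X[labels == a][None, :],
                               axis=2).max()
                for a in ids)
    return d_min / d_max
```

For two unit-height clusters placed 10 apart, for example, the Dunn index is 10/1 = 10, while the MSE only reflects the within-cluster spread.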

The Davies-Bouldin index [22] compares centroid diameters between clusters. In its computation, the centroid linkage is used as the inter-cluster distance; the centroid inter-cluster and intra-cluster measures are selected for compatibility with the k-means clustering algorithm (which essentially computes the centroids of the clusters at each iteration). This index takes into account the compactness and the separability of the clusters: its value is all the lower as the clusters are compact and well separated. It is defined by the following expression:

I_DB = (1/c) \sum_{j=1}^{c} \max_{k \neq j} [ (D_c(c_j) + D_c(c_k)) / D_ce(c_j, c_k) ]

where D_c(c_j) is the average distance between the data of cluster c_j and its center, and D_ce(c_j, c_k) is the distance separating the centers c_j and c_k:

D_c(c_j) = (1/N_j) \sum_{x_i \in C_j} \|x_i - c_j\|,    D_ce(c_j, c_k) = \|c_j - c_k\|

3. Motivations

When confronted with the problem of determining the number of clusters, we are led to make assumptions about it. To avoid requiring the user to choose this number, a solution consists in iterating until an optimal number of clusters is obtained, each iteration optimizing a validity index which measures the clustering quality in order to choose the optimal partition among all those obtained with the various plausible values of the required number of clusters. The previously presented indices are applicable in the case of hard partition-based clustering but show their limits in the determination of the optimal number of clusters. Moreover, all these measures are global, i.e., they evaluate the partition as a whole. In automatic clustering, however, where the number of clusters is unknown to the user, we need to iterate a clustering algorithm until the optimal number of clusters is obtained. We therefore resort to local evaluation for the detection of the worst clusters to be merged or split.
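The Davies-Bouldin formula above translates directly into code (a sketch; the function name is ours):

```python
import numpy as np

def davies_bouldin(X, centers, labels):
    """Davies-Bouldin index I_DB: for each cluster, take the worst ratio of
    summed intra-cluster scatters to the centroid distance, then average.
    Lower values mean more compact, better separated clusters."""
    c = len(centers)
    # D_c(c_j): average distance of cluster j's data to its center
    s = np.array([np.linalg.norm(X[labels == j] - centers[j], axis=1).mean()
                  for j in range(c)])
    # max over k != j of (D_c(c_j) + D_c(c_k)) / D_ce(c_j, c_k), averaged over j
    return np.mean([max((s[j] + s[k]) / np.linalg.norm(centers[j] - centers[k])
                        for k in range(c) if k != j)
                    for j in range(c)])
```

On the same two-cluster toy layout used above (scatters of 0.5, centers 10 apart) the index evaluates to (0.5 + 0.5)/10 = 0.1.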

Based on these principles, we propose to evaluate the obtained partition globally and locally in order to reach the optimum. The global evaluation makes it possible to judge the quality of the generated partition, whereas the local evaluation makes it possible to detect the worst clusters.

4. Mapping Processes

As mentioned in Section 2, clustering algorithms partition data into clusters by maximizing the similarity of the data in the same cluster while minimizing it for data in different clusters. Consequently, compactness and separation are the two reasonable criteria for evaluating cluster quality, and the proposed quality evaluation measures are based on them. The global evaluation judges the quality of the generated partition. We propose here new formulations which define hard global compactness and hard global separation within the partition.

Hard Global Compactness. Given a cluster scheme C = {C_1, C_2, ..., C_c} for a dataset X = {x_1, x_2, ..., x_N}, let C' = {C_j : C_j \in C and C_j is not a singleton, j = 1, 2, ..., k, where k = |C'|}; the hard global compactness Cmp of the cluster scheme C is given by:

Cmp = (1/k) \sum_{j=1}^{k} Var_j

where Var_j is the variance of the j-th cluster, given by the following expression:

Var_j = (1 / card(C_j)) \sum_{x_i \in C_j} \|x_i - c_j\|^2

where c_j is the center of cluster C_j.

Hard Global Separation. The hard global separation Sep of a cluster scheme C = {C_1, C_2, ..., C_c} for a dataset X = {x_1, x_2, ..., x_N} is given by:

Sep = \min_{j \neq k} \|c_j - c_k\| / c
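The two global measures can be computed as follows. This is a sketch of our reading of the definitions; in particular, the 1/c factor in Sep is our reconstruction of a garbled formula, and the function names are ours.

```python
import numpy as np

def variance(Xj, center):
    """Var_j: mean squared distance of cluster j's data to its center."""
    return np.mean(np.linalg.norm(Xj - center, axis=1) ** 2)

def hard_global_measures(X, centers, labels):
    """Cmp: mean variance over the non-singleton clusters.
    Sep: minimum pairwise center distance divided by the cluster count c."""
    c = len(centers)
    non_singleton = [j for j in range(c) if np.count_nonzero(labels == j) > 1]
    cmp_ = np.mean([variance(X[labels == j], centers[j]) for j in non_singleton])
    sep = min(np.linalg.norm(centers[j] - centers[k])
              for j in range(c) for k in range(j + 1, c)) / c
    return sep, cmp_
```

A good partition should score high on Sep and low on Cmp; the ratio of the two is the SepCmp measure introduced next in the text.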

Hard Global Separation-Compactness. Given a scheme C = {C_1, C_2, ..., C_c} for a dataset X = {x_1, x_2, ..., x_N}, let C' = {C_j : C_j \in C and C_j is not a singleton, j = 1, 2, ..., k, where k = |C'|}; the hard global separation-compactness SepCmp of the cluster scheme C is given by:

SepCmp = Sep / Cmp

Consequently, the best partition is the one maximizing the SepCmp measure. For the determination of the optimal number of clusters, we propose two approaches. The first is based on the merging principle; we call it EMk-means (Enhanced Merging k-means). We start with a maximum number of clusters c_max, which decreases during the iterations by identifying the worst cluster and merging it. The second is based on the splitting principle; we call it ESk-means (Enhanced Splitting k-means). We start with a minimum number of clusters c_min, which increases during the iterations; this increase is done by dividing the cluster having the maximum value of variance. The principle of the two approaches is summarized in the following algorithm:

Algorithm 4: Iterative clustering approach
Step 1: Initialize c_max and c_min; initialize the cluster centers.
Step 2: Apply the k-means algorithm.
Step 3: Calculate SepCmp.
Step 4: For c from c_max down to c_min (respectively from c_min up to c_max): merge (respectively split) the clusters.
Step 5: Calculate SepCmp.
Step 6: Determine the optimal number of clusters c_opt, the one whose SepCmp value is maximal.

4.1. Merging Process Mapping

The local evaluation is used to identify the worst cluster to be merged with the others. Each datum belonging to this cluster is assigned to the nearest remaining cluster; the centers of all clusters are then adjusted. To identify the worst cluster at each iteration, separation and compactness measures are used for local evaluation. Based on the hard local separation-compactness measures, we present in this subsection the mapping rules from fuzzy to hard.

Hard Local Compactness. Given a cluster scheme C = {C_1, C_2, ..., C_c} for a dataset X = {x_1, x_2, ..., x_N}, for each C_j \in C, if C_j is not a singleton, the hard local compactness of C_j, denoted cmp_j, is given by:

cmp_j = Var_j = (1 / card(C_j)) \sum_{x_i \in C_j} \|x_i - c_j\|^2

where Var_j is the variance of the j-th cluster.

Hard Local Separation. Given a cluster scheme C = {C_1, C_2, ..., C_c} for a dataset X = {x_1, x_2, ..., x_N}, the hard local separation of C_j, denoted sep_j, is given by:

sep_j = \min_{l \neq j} \|c_j - c_l\|

where c_j and c_l are the centers of clusters C_j and C_l respectively.

Hard Local Separation-Compactness. Given a cluster scheme C = {C_1, C_2, ..., C_c} for a dataset X = {x_1, x_2, ..., x_N}, for each C_j \in C, if C_j is not a singleton, the hard local separation-compactness of C_j, denoted sepcmp_j, is given by:

sepcmp_j = sep_j / cmp_j

Thus, the worst cluster is the one with the smallest sepcmp value. By combining these measures with the k-means algorithm, the number of clusters is obtained automatically. The proposed algorithm, called EMk-means, is based on a merge strategy. We start with a maximum number of clusters c_max, which decreases during the iterations by identifying and merging the worst cluster. For c_max, we adopted the following suggestion of Bezdek [8]: c_max = \sqrt{N} (N is the size of the dataset). At each iteration, the algorithm seeks to maximize the hard global separation-compactness measure SepCmp obtained for the various plausible values of the required number of clusters. The worst cluster is identified and merged with the remaining clusters. The EMk-means algorithm is described below.

Algorithm 5: EMk-means algorithm
Inputs: a dataset X = {x_1, x_2, ..., x_N} and the Euclidean distance function.
Output: the optimal cluster scheme C = {C_1, C_2, ..., C_c}.
Step 1: Initialize the parameters related to the k-means algorithm: c = c_max, c_min = 2.
Step 2: Apply the k-means algorithm.

Step 3: If convergence, go to Step 4; else go to Step 2.
Step 4: Calculate the SepCmp measure.
Step 5: Repeat: apply the merge-based procedure to obtain the candidate cluster scheme; decrease the cluster number, c \leftarrow c - 1; calculate the SepCmp measure for the new clusters. Until c = c_min.
Step 6: Retain the cluster scheme whose SepCmp value is maximal.

The hard merge-based procedure is presented as follows:

Procedure 1: Hard merge-based procedure
Input: a cluster scheme C = {C_1, C_2, ..., C_c}.
Output: a candidate cluster scheme C* = {C_1, C_2, ..., C_{c-1}}.
Step 1: Calculate the sepcmp measure for each cluster belonging to C.
Step 2: Delete the worst cluster, the one with the minimal value of sepcmp.
Step 3: Assign the data of this cluster to the various remaining clusters.
Step 4: Calculate the values of the new cluster centers according to the median formula.
Step 5: Apply the k-means algorithm to the new clusters.

4.2. Splitting Process Mapping

The proposed algorithm is based on a construction strategy: it is initialized with a minimum number of clusters c_min, which is incremented during the iterations according to a splitting process until the maximum cluster number c_max = \sqrt{N} is reached (N is the size of the dataset). At each iteration, the SepCmp measure is calculated for the determination of the optimal number of clusters c_opt. In this strategy, we select the worst cluster and divide it into two new clusters. Each datum belonging to this cluster is then assigned to a new cluster whose center must be recalculated. The splitting-based process is carried out by calculating the variance of each cluster, given by the following equation:

Var(C_j) = (1 / N_{C_j}) \sum_{x_i \in C_j} D(x_i, c_j)^2

where N_{C_j} is the number of data belonging to cluster C_j, and D(x_i, c_j) is the Euclidean distance between the datum x_i belonging to cluster C_j and its center c_j. The ESk-means algorithm aims at minimizing the average intra-cluster distance; this is why we choose the cluster corresponding to the maximum value of Var(C_j) as the candidate for the splitting-based process. When the value of Var(C_j) is high, the data belonging to cluster C_j are dispersed, i.e., they tend to move away from the center c_j of cluster C_j.

The splitting-based procedure consists in identifying the worst cluster, the one having the maximum value of variance; the set of its data is denoted E. We then calculate, for each of its data, the total distance to the centers of the non-selected clusters. The two data having the maximum total distance to the centers are selected as initial centers for starting the k-means algorithm, in order to partition E into two new sets E_1 and E_2. The splitting procedure is detailed as follows:

Procedure 2: Hard splitting-based procedure
Input: a cluster scheme C = {C_1, C_2, ..., C_c}.
Output: a candidate cluster scheme C* = {C_1, C_2, ..., C_c, C_{c+1}}.
Step 1: Identify the cluster to be divided by calculating its variance. Denote by E the set of data belonging to this cluster and by c_0 its center.
Step 2: Seek in E the two data vectors whose total distances to all the remaining data of E are maximal. Denote these two vectors c_1 and c_2.
Step 3: Apply the k-means algorithm with the new centers c_1 and c_2 in order to obtain two new partitions E_1 and E_2.

The aim of the algorithm is to select the worst cluster, remove it and replace it with two new clusters by applying the splitting-based process. At the end of each iteration, the SepCmp measure is calculated and the number of clusters is incremented until c_max clusters are obtained. The number retained is the one whose SepCmp value is maximal.

The proposed algorithm for the determination of the center values and of the c_opt clusters is described below:

Algorithm 6: ESk-means algorithm
Inputs: a dataset X = {x_1, x_2, ..., x_N} and the Euclidean distance function.
Output: the optimal cluster scheme C = {C_1, C_2, ..., C_c}.
Step 1: Initialize the parameters related to the k-means algorithm: c = c_min = 2, c_max = \sqrt{N}.
Step 2: Apply the k-means algorithm.
Step 3: If convergence, go to Step 4; else go to Step 2.
Step 4: Calculate the SepCmp measure.
Step 5: Repeat: apply the splitting-based procedure to obtain the candidate cluster scheme; increase the cluster number, c \leftarrow c + 1; calculate the SepCmp measure for the new clusters. Until c = c_max.
Step 6: Retain the cluster scheme whose SepCmp value is maximal.

5. Experimentation

5.1. Data of Experimentation

To validate the proposed approaches for the determination of the number of clusters and the evaluation of clustering quality, three datasets are used among the various data files made available to the machine learning community by the University of California at Irvine (UCI) [23], as well as an artificial data file coming from the benchmark Concentric:

Iris: this dataset contains 3 clusters. Each cluster refers to a type of iris flower: Setosa, Versicolor or Virginica. Each cluster contains 50 patterns, each with 4 components. The 1st cluster (Setosa) is linearly separable from the others; the two other clusters (Versicolor and Virginica) overlap.

Wine: this dataset records the results of a chemical analysis of various wines produced in the same area of Italy from various types of vines. The concentration of 13 components is given for each of the 178 analyzed wines, which are distributed as follows: 59 in the 1st cluster, 71 in the 2nd cluster and 48 in the 3rd cluster.

Diabetes: this dataset records the results of an analysis concerning diabetes, carried out on donors in order to diagnose the disease. The size of this dataset is 768 patterns distributed in two clusters: 500 for the 1st cluster and 268 for the 2nd cluster. Patterns have 8 dimensions.

Concentric: this dataset is artificial and rather complex. It contains 2500 patterns, 579 for the 1st cluster and 1921 for the 2nd cluster. The 1st cluster is inside the second. Patterns have 2 dimensions.

5.2. Comparative Study

In this subsection, we evaluate the proposed quality measures and clustering algorithms.

5.2.1. Clustering Quality Evaluation

We present a comparative study of the proposed clustering algorithms via the quality evaluation measures. We report only some iterations for each dataset.

Table 1. Results of the evaluation of the EMk-means and ESk-means algorithms on the Iris dataset (columns: iteration, algorithm, MSE, I_Dunn, I_DB, SepCmp; the numeric entries could not be recovered from the source).

As shown in Table 1, only the I_DB and SepCmp measures determined the optimal number of clusters, with respective values (0.) and (0.69). This is explained by the fact that the maximal value of SepCmp (resp. the minimal value of I_DB) is associated with the optimal number of clusters. For the two proposed algorithms (EMk-means and ESk-means), the MSE and I_Dunn measures could not determine this number.

Table 2. Results of the evaluation of the EMk-means and ESk-means algorithms on the Wine dataset (columns: iteration, algorithm, MSE, I_Dunn, I_DB, SepCmp; the numeric entries could not be recovered from the source).

As shown in Table 2, only the SepCmp measure could determine the optimal number of clusters with both proposed clustering algorithms. With the I_DB measure, the optimal number of clusters is obtained only with the EMk-means algorithm, with the value (0.5). The MSE and I_Dunn measures did not determine this number with either algorithm.

Table 3. Results of the evaluation of the EMk-means and ESk-means algorithms on the Diabetes dataset (columns: iteration, algorithm, MSE, I_Dunn, I_DB, SepCmp; the numeric entries could not be recovered from the source).

As shown in Table 3, with the EMk-means algorithm all measures (MSE, I_Dunn, I_DB and SepCmp) determine the optimal number of clusters, with the respective values (0.0), (0.), (0.8) and (.63). With the ESk-means algorithm, only the I_Dunn and SepCmp measures determine this number, with the respective values (0.7) and (.45).

Table 4. Results of the evaluation of the EMk-means and ESk-means algorithms on the Concentric dataset (columns: iteration, algorithm, MSE, I_Dunn, I_DB, SepCmp; the numeric entries could not be recovered from the source).

As shown in Table 4, with the EMk-means algorithm all measures (MSE, I_Dunn, I_DB and SepCmp) determine the optimal number of clusters, with the respective values (0.), (0.9), (0.) and (3.09). With the ESk-means algorithm, none of the used measures could determine this number; however, the values obtained by the SepCmp measure via ESk-means for 2 and 3 clusters, respectively (.83) and (.8), are very close. Based on these results, we can say that the proposed clustering algorithms give the most consistent results across all of the tested datasets.
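As an end-to-end illustration of the merge-driven loop evaluated above, the following is a compact sketch of EMk-means under the definitions of Section 4; the helper names, the seeding of the initial centers on data points, and the empty-cluster handling are our choices, not the paper's.

```python
import numpy as np

def _kmeans(X, centers, iters=50):
    """Plain k-means refinement of the given centers."""
    for _ in range(iters):
        labels = np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(1)
        new = np.array([X[labels == j].mean(0) if np.any(labels == j) else centers[j]
                        for j in range(len(centers))])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels

def _sepcmp(X, centers, labels):
    """Global SepCmp = Sep / Cmp over non-singleton clusters."""
    c = len(centers)
    var = [np.mean(np.linalg.norm(X[labels == j] - centers[j], axis=1) ** 2)
           for j in range(c) if np.count_nonzero(labels == j) > 1]
    sep = min(np.linalg.norm(centers[j] - centers[k])
              for j in range(c) for k in range(j + 1, c)) / c
    return sep / np.mean(var)

def emk_means(X, rng=None):
    """EMk-means sketch: start at c_max = sqrt(N), repeatedly drop the cluster
    with the lowest local sep/cmp ratio, keep the partition maximizing SepCmp."""
    rng = np.random.default_rng(rng)
    c = max(2, int(np.sqrt(len(X))))
    centers, labels = _kmeans(X, X[rng.choice(len(X), c, replace=False)].copy())
    best = (_sepcmp(X, centers, labels), centers, labels)
    while c > 2:
        # local sep/cmp of each non-singleton cluster; merge away the smallest
        scores = {}
        for j in range(c):
            members = labels == j
            if np.count_nonzero(members) < 2:
                continue
            sep = min(np.linalg.norm(centers[j] - centers[k])
                      for k in range(c) if k != j)
            cmp_ = np.mean(np.linalg.norm(X[members] - centers[j], axis=1) ** 2)
            scores[j] = sep / cmp_
        worst = min(scores, key=scores.get)
        centers = np.delete(centers, worst, axis=0)
        c -= 1
        centers, labels = _kmeans(X, centers)   # reassign data, re-center
        score = _sepcmp(X, centers, labels)
        if score > best[0]:
            best = (score, centers, labels)
    return best[1], best[2]
```

On well-separated synthetic blobs, this loop visits every candidate c from c_max down to 2 and retains the partition whose SepCmp is maximal, as Algorithm 5 prescribes.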

5.2.2. Distribution Uniformity

In this subsection, we evaluate the two proposed clustering algorithms in terms of the distribution of the data over the various clusters.

Table 5. Results of the evaluation of the data distributions via EMk-means and ESk-means (columns: dataset, optimal number of clusters, real distribution, obtained distribution, deviation; several numeric entries could not be recovered from the source).

As shown in Table 5, in the case of the Iris dataset, the deviation between the real and the obtained data distributions with the EMk-means (resp. ESk-means) algorithm is equal to .6% (resp. 4.0%). In the case of the Wine dataset, the deviation with EMk-means (resp. ESk-means) is equal to .4% (resp. 3.0%). In the case of the Diabetes dataset, the deviation with EMk-means (resp. ESk-means) is equal to 0.8% (resp. .4%). In the case of the Concentric dataset, the deviation with EMk-means (resp. ESk-means) is equal to 0.69% (resp. .03%). Compared to the size of the used datasets, we can conclude that these percentages are very satisfactory when the number of clusters is unknown to the user. We also note that the results obtained with the merge-based strategy are better than those given by the splitting-based strategy.

5.2.3. About Complexity

The theoretical complexity of the SepCmp measure is based on the complexity of its two terms, the Sep and Cmp measures.

For a data set X = {x_i : i = 1, ..., N} ⊂ R^M, where c is the initial number of clusters, the theoretical complexity of the Sep measure is about O(NMc), while the complexity of the Cmp measure is about O(NMc). The complexity of SepCmp is therefore about O(NMc). Usually M ≪ N, so the complexity of this measure for a specific cluster scheme is about O(Nc).

We present in Table 6 a study of the theoretical complexity of the proposed clustering algorithms.

Table 6. Study of the complexity of the proposed clustering algorithms.

Algorithm     Theoretical complexity
k-means       O(Nc)
EMk-means     O(Nc²)
ESk-means     O(Nc²)

As shown in Table 6, the complexity of the proposed algorithms is proportional to the size of the data and to the square of the maximum number of clusters.

6. Conclusion

The majority of clustering algorithms run up against the problem of determining the optimal number of clusters to generate, and therefore of evaluating the quality of the obtained clusters. A solution consists in iterating until a satisfactory number of clusters is obtained, each iteration trying to minimize a quality measure called a validity index. This quality is judged in general on the basis of two contradictory criteria: the first supposes that the generated clusters must be as distinct as possible from each other with respect to certain characteristics, while the second requires that each cluster be as homogeneous as possible with respect to these characteristics. Drawing on published algorithms, we have proposed mapping rules for generalizing them. Our major contributions are as follows:

- Definition of global quality measures of a partition generated by a clustering algorithm, based on compactness and separation measures,
- Definition of local quality measures to identify the worst clusters to be merged or split,
- Binarization of the merging and splitting processes.
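The O(NMc) term comes from one pass over all N points against all c centroids in M dimensions. The paper's exact Sep and Cmp formulas are not reproduced above, so the score below is only an illustrative stand-in with the same cost structure:

```python
import numpy as np

def sep_cmp_score(X, centroids):
    # Compactness: mean distance of each of the N points (M features) to
    # its nearest of the c centroids. Building the N x c distance table
    # dominates the cost: O(N*M*c).
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=-1)
    cmp_term = d.min(axis=1).mean()
    # Separation: mean pairwise distance between centroids, O(c^2 * M),
    # negligible next to the O(N*M*c) term since c << N.
    c = len(centroids)
    sep_term = np.mean([np.linalg.norm(centroids[i] - centroids[j])
                        for i in range(c) for j in range(i + 1, c)])
    # Larger is better: well-separated centroids and tight clusters.
    return sep_term / cmp_term
```

Combining separation and compactness into a single ratio is one common design; the point of the sketch is only that evaluating such a measure once costs O(NMc), which repeated over up to c candidate partitions gives the O(Nc²) figures of Table 6 (for fixed M).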

- Modeling and implementation of two clustering approaches implementing the newly proposed measures and processes.

The first proposed approach is based on the principle of merging: it starts with a maximum number of clusters and decreases it over the iterations. The second proposed approach is based on the principle of splitting: it starts with a minimum number of clusters, which is incremented during execution. For both approaches, the basic idea consists in determining the optimal number of clusters through global and local evaluations of the obtained partition. The proposed measures, processes and approaches have been exploited successfully on various data sets. Future work primarily concerns testing these approaches on large data sets; we intend to simplify this operation by data sampling while allowing a better evaluation of the results.

References

[1] M. B. Dale, P. E. R. Dale and P. Tan, Supervised clustering using decision trees and decision graphs: An ecological comparison, Eco. Model. 204(1-2) (2007).
[2] M. Grimaldi, K. Demal, R. Redon, J. L. Jamet and B. Rossetto, Reconnaissance de formes et classification supervisée appliquée au comptage automatique de phytoplankton, 40 (2003).
[3] B. Everitt, Cluster Analysis, 3rd ed., Edward Arnold, 1993.
[4] E. Aitnouri, F. Dubeau, S. Wang and D. Ziou, Controlling mixture component overlap for clustering algorithms evaluation, J. Pattern Recog. Image Anal. (4) (2002).
[5] S. Guha, R. Rastogi and K. Shim, CURE: an efficient clustering algorithm for large databases, ACM SIGMOD Int. Conf. Management of Data, 1998.
[6] T. Zhang, R. Ramakrishnan and M. Livny, BIRCH: an efficient data clustering method for very large databases, ACM SIGMOD Int. Conf. Management of Data, Montreal, Canada, 1996.
[7] J. MacQueen, Some methods for classification and analysis of multivariate observations, The Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1967.
[8] J. C. Bezdek, Chapter F6: Pattern Recognition, in Handbook of Fuzzy Computation, IOP Publishing Ltd., 1998.
[9] R. Agrawal, J. Gehrke, D. Gunopulos and P. Raghavan, Automatic subspace clustering of high dimensional data for data mining applications, ACM SIGMOD Int. Conf. Management of Data (SIGMOD 98), Seattle, WA, 1998.
[10] J. Costa and M. Netto, Estimating the number of clusters in multivariate data by self-organizing maps, Int. J. Neural Systems 9(3) (1999).
[11] H. Sun, S. Wang and Q. Jiang, FCM-based model selection algorithms for determining the number of clusters, Pattern Recog. 37(10) (2004).
[12] D. W. Kim, K. H. Lee and D. Lee, On cluster validity index for estimation of the optimal number of fuzzy clusters, Pattern Recog. 37(10) (2004).
[13] M. Sassi, A. Grissa Touzi and H. Ounelli, Two levels of extensions of validity function based fuzzy clustering, 4th International Multiconference on Computer Science and Information Technology, Amman, Jordan, 2006.
[14] X. Xie and G. Beni, A validity measure for fuzzy clustering, IEEE Trans. Pattern Anal. Mach. Intell. 13(8) (1991).

[15] M. Sassi, A. Grissa Touzi and H. Ounelli, Using Gaussian functions to determine representative clustering prototypes, 17th IEEE International Conference on Database and Expert Systems Applications, Poland, 2006.
[16] M. Sato, Y. Sato and L. C. Jain, Fuzzy Clustering Models and Applications, Physica-Verlag, Heidelberg, New York, 1997.
[17] L. A. Zadeh, Fuzzy sets, Inf. Control 8 (1965).
[18] J. Bezdek, R. Hathaway, M. Sabin and W. Tucker, Convergence theory for fuzzy c-means: counterexamples and repairs, IEEE Trans. Systems, Man and Cybernetics 17(5) (1987).
[19] J. MacQueen, Some methods for classification and analysis of multivariate observations, The Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1967.
[20] G. Casella and E. L. Lehmann, Theory of Point Estimation, Springer, 1999.
[21] J. Dunn, Well separated clusters and optimal fuzzy partitions, J. Cybernetics 4 (1974).
[22] D. L. Davies and D. W. Bouldin, A cluster separation measure, IEEE Trans. Pattern Anal. Mach. Intell. 1(2) (1979), 224-227.


More information

An Efficient Genetic Algorithm with Fuzzy c-means Clustering for Traveling Salesman Problem

An Efficient Genetic Algorithm with Fuzzy c-means Clustering for Traveling Salesman Problem An Effcent Genetc Algorthm wth Fuzzy c-means Clusterng for Travelng Salesman Problem Jong-Won Yoon and Sung-Bae Cho Dept. of Computer Scence Yonse Unversty Seoul, Korea jwyoon@sclab.yonse.ac.r, sbcho@cs.yonse.ac.r

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

Classifying Acoustic Transient Signals Using Artificial Intelligence

Classifying Acoustic Transient Signals Using Artificial Intelligence Classfyng Acoustc Transent Sgnals Usng Artfcal Intellgence Steve Sutton, Unversty of North Carolna At Wlmngton (suttons@charter.net) Greg Huff, Unversty of North Carolna At Wlmngton (jgh7476@uncwl.edu)

More information

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Proceedngs of the Wnter Smulaton Conference M E Kuhl, N M Steger, F B Armstrong, and J A Jones, eds A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Mark W Brantley Chun-Hung

More information

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices Internatonal Mathematcal Forum, Vol 7, 2012, no 52, 2549-2554 An Applcaton of the Dulmage-Mendelsohn Decomposton to Sparse Null Space Bases of Full Row Rank Matrces Mostafa Khorramzadeh Department of Mathematcal

More information

Load-Balanced Anycast Routing

Load-Balanced Anycast Routing Load-Balanced Anycast Routng Chng-Yu Ln, Jung-Hua Lo, and Sy-Yen Kuo Department of Electrcal Engneerng atonal Tawan Unversty, Tape, Tawan sykuo@cc.ee.ntu.edu.tw Abstract For fault-tolerance and load-balance

More information

Fuzzy Modeling of the Complexity vs. Accuracy Trade-off in a Sequential Two-Stage Multi-Classifier System

Fuzzy Modeling of the Complexity vs. Accuracy Trade-off in a Sequential Two-Stage Multi-Classifier System Fuzzy Modelng of the Complexty vs. Accuracy Trade-off n a Sequental Two-Stage Mult-Classfer System MARK LAST 1 Department of Informaton Systems Engneerng Ben-Guron Unversty of the Negev Beer-Sheva 84105

More information

A NOTE ON FUZZY CLOSURE OF A FUZZY SET

A NOTE ON FUZZY CLOSURE OF A FUZZY SET (JPMNT) Journal of Process Management New Technologes, Internatonal A NOTE ON FUZZY CLOSURE OF A FUZZY SET Bhmraj Basumatary Department of Mathematcal Scences, Bodoland Unversty, Kokrajhar, Assam, Inda,

More information

A Clustering Algorithm for Chinese Adjectives and Nouns 1

A Clustering Algorithm for Chinese Adjectives and Nouns 1 Clusterng lgorthm for Chnese dectves and ouns Yang Wen, Chunfa Yuan, Changnng Huang 2 State Key aboratory of Intellgent Technology and System Deptartment of Computer Scence & Technology, Tsnghua Unversty,

More information

CHAPTER 2 DECOMPOSITION OF GRAPHS

CHAPTER 2 DECOMPOSITION OF GRAPHS CHAPTER DECOMPOSITION OF GRAPHS. INTRODUCTION A graph H s called a Supersubdvson of a graph G f H s obtaned from G by replacng every edge uv of G by a bpartte graph,m (m may vary for each edge by dentfyng

More information

A Robust LS-SVM Regression

A Robust LS-SVM Regression PROCEEDIGS OF WORLD ACADEMY OF SCIECE, EGIEERIG AD ECHOLOGY VOLUME 7 AUGUS 5 ISS 37- A Robust LS-SVM Regresson József Valyon, and Gábor Horváth Abstract In comparson to the orgnal SVM, whch nvolves a quadratc

More information

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1 4/14/011 Outlne Dscrmnatve classfers for mage recognton Wednesday, Aprl 13 Krsten Grauman UT-Austn Last tme: wndow-based generc obect detecton basc ppelne face detecton wth boostng as case study Today:

More information

Decision Strategies for Rating Objects in Knowledge-Shared Research Networks

Decision Strategies for Rating Objects in Knowledge-Shared Research Networks Decson Strateges for Ratng Objects n Knowledge-Shared Research etwors ALEXADRA GRACHAROVA *, HAS-JOACHM ER **, HASSA OUR ELD ** OM SUUROE ***, HARR ARAKSE *** * nsttute of Control and System Research,

More information

Intra-Parametric Analysis of a Fuzzy MOLP

Intra-Parametric Analysis of a Fuzzy MOLP Intra-Parametrc Analyss of a Fuzzy MOLP a MIAO-LING WANG a Department of Industral Engneerng and Management a Mnghsn Insttute of Technology and Hsnchu Tawan, ROC b HSIAO-FAN WANG b Insttute of Industral

More information

Insertion Sort. Divide and Conquer Sorting. Divide and Conquer. Mergesort. Mergesort Example. Auxiliary Array

Insertion Sort. Divide and Conquer Sorting. Divide and Conquer. Mergesort. Mergesort Example. Auxiliary Array Inserton Sort Dvde and Conquer Sortng CSE 6 Data Structures Lecture 18 What f frst k elements of array are already sorted? 4, 7, 1, 5, 1, 16 We can shft the tal of the sorted elements lst down and then

More information

A Simple Methodology for Database Clustering. Hao Tang 12 Guangdong University of Technology, Guangdong, , China

A Simple Methodology for Database Clustering. Hao Tang 12 Guangdong University of Technology, Guangdong, , China for Database Clusterng Guangdong Unversty of Technology, Guangdong, 0503, Chna E-mal: 6085@qq.com Me Zhang Guangdong Unversty of Technology, Guangdong, 0503, Chna E-mal:64605455@qq.com Database clusterng

More information

BIN XIA et al: AN IMPROVED K-MEANS ALGORITHM BASED ON CLOUD PLATFORM FOR DATA MINING

BIN XIA et al: AN IMPROVED K-MEANS ALGORITHM BASED ON CLOUD PLATFORM FOR DATA MINING An Improved K-means Algorthm based on Cloud Platform for Data Mnng Bn Xa *, Yan Lu 2. School of nformaton and management scence, Henan Agrcultural Unversty, Zhengzhou, Henan 450002, P.R. Chna 2. College

More information

From Comparing Clusterings to Combining Clusterings

From Comparing Clusterings to Combining Clusterings Proceedngs of the Twenty-Thrd AAAI Conference on Artfcal Intellgence (008 From Comparng Clusterngs to Combnng Clusterngs Zhwu Lu and Yuxn Peng and Janguo Xao Insttute of Computer Scence and Technology,

More information

SHAPE RECOGNITION METHOD BASED ON THE k-nearest NEIGHBOR RULE

SHAPE RECOGNITION METHOD BASED ON THE k-nearest NEIGHBOR RULE SHAPE RECOGNITION METHOD BASED ON THE k-nearest NEIGHBOR RULE Dorna Purcaru Faculty of Automaton, Computers and Electroncs Unersty of Craoa 13 Al. I. Cuza Street, Craoa RO-1100 ROMANIA E-mal: dpurcaru@electroncs.uc.ro

More information

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics Introducton G10 NAG Fortran Lbrary Chapter Introducton G10 Smoothng n Statstcs Contents 1 Scope of the Chapter... 2 2 Background to the Problems... 2 2.1 Smoothng Methods... 2 2.2 Smoothng Splnes and Regresson

More information

Fuzzy Filtering Algorithms for Image Processing: Performance Evaluation of Various Approaches

Fuzzy Filtering Algorithms for Image Processing: Performance Evaluation of Various Approaches Proceedngs of the Internatonal Conference on Cognton and Recognton Fuzzy Flterng Algorthms for Image Processng: Performance Evaluaton of Varous Approaches Rajoo Pandey and Umesh Ghanekar Department of

More information