An algorithm for correcting mislabeled data


Intelligent Data Analysis 5 (2001) 491-502
IOS Press

Xinchuan Zeng and Tony R. Martinez
Computer Science Department, Brigham Young University, Provo, UT 84602, USA
E-mail: {zengx,martinez}@cs.byu.edu

Received 12 June 2001; Revised 22 July 2001; Accepted 25 August 2001

Abstract. Reliable evaluation of the performance of classifiers depends on the quality of the data sets on which they are tested. During the collection and recording of a data set, however, some noise may be introduced into the data, especially in various real-world environments, which can degrade the quality of the data set. In this paper, we present a novel approach, called ADE (automatic data enhancement), for correcting mislabeled data in a data set. In addition to using multi-layer neural networks trained by backpropagation as the basic framework, ADE assigns each training pattern a class probability vector as its class label, in which each component represents the probability of the corresponding class. During training, ADE constantly updates the probability vector based on its difference from the output of the network. With this updating rule, the probability of a mislabeled class gradually becomes smaller while that of the correct class becomes larger, which eventually corrects mislabeled data after a number of training epochs. We have tested ADE on a number of data sets drawn from the UCI data repository for nearest neighbor classifiers. The results show that for most data sets, when mislabeled data is present, a classifier constructed using a training set corrected by ADE achieves significantly higher accuracy than one constructed without using ADE.

Keywords: Neural networks, backpropagation, probability labeling, mislabeled data, data correction

1. Introduction

In the fields of machine learning, neural networks and pattern recognition, a typical approach to evaluating the performance of classifiers is to test them on real-world data sets (such as those from the UCI machine learning data repository). Clearly, the quality of the data sets affects the reliability of the evaluations. In the process of collecting and recording data in the real world, however, some noise may be introduced into data sets due to various sources of error. The inclusion of noise in data sets consequently degrades the quality of the evaluation of the classifiers being tested.

This issue has been addressed previously using various approaches in several research areas, especially in instance-based learning, whose performance is particularly sensitive to noise in the training data. To eliminate noise in a training set, Wilson used a 3-NN (nearest neighbor) classifier as a filter (or preprocessor) to eliminate the instances misclassified by the 3-NN, and then applied 1-NN on the filtered data as the classifier [7]. Several versions of edited nearest neighbor algorithms [5-7,9] save only selected instances for generalization in order to reduce storage while still maintaining similar accuracy. The algorithm proposed by Aha et al. [1,2] removes noise and reduces storage by retaining only those instances that have good classification records when applied as nearest neighbors. Wilson and Martinez [18,19] proposed several instance-pruning techniques which are capable of removing noise and reducing storage requirements.

The idea of using selected instances of the training data has also been applied to other classifiers. In an approach proposed by John [12], the training data is first filtered by removing those instances pruned by a C4.5 tree [15], and a new tree is then constructed using the filtered data. Gamberger et al. [8] proposed a noise detection and elimination method based on compression measures and the Minimum Description Length principle. Brodley and Friedl [4] applied an ensemble of classifiers as a filter to identify and eliminate mislabeled training data. Teng [16,17] applied a procedure to identify and correct noise in classes and attributes, based on the predictions of C4.5 decision trees.

In this work we present a novel approach, called ADE (automatic data enhancement), to correct mislabeled instances in a data set. This approach is based on the mechanisms of neural networks trained by backpropagation. However, a distinct feature of this approach, in contrast to standard backpropagation, is that each pattern in the data set is assigned a class probability vector which is constantly updated (instead of being fixed) during training. The class label for each pattern is determined by its class probability vector and thus is also updated during training. Using this new mechanism, an initially mislabeled class can be corrected through gradual changes to its class probability vector.

The class probability vector is updated in such a way that it becomes closer to the output of the network. The output of the network is, however, determined by the architecture and weight settings of the network, which is the result of previous training using all patterns in the whole data set. If the initial mislabeled percentage is reasonably small, the network will be predominantly determined by the correctly labeled patterns. During training, the outputs of the network become more consistent with the class probability vectors of correctly labeled patterns and less consistent with those of incorrectly labeled patterns. The updating rule modifies the class probability vectors of mislabeled patterns by a larger amount, due to their higher inconsistency with the outputs compared to correctly labeled patterns. For a mislabeled pattern, the probability component of the mislabeled class becomes smaller while that of the correct class becomes larger. After a number of training epochs, the component of the correct class gradually increases to a level larger than that of the mislabeled class (which is initially the largest). At that point, the mislabeled class is modified to the correct class.

We have tested the performance of ADE on 24 data sets drawn from the UCI data repository using a nearest neighbor classifier. For each data set, we first mislabel a fraction of the training set, and then apply ADE to correct the mislabeled training data. We compare the test-set accuracies of two nearest neighbor classifiers, one using the training set with mislabeling and the other using the training set corrected by ADE. The stratified 10-fold cross-validation method is applied for estimating the accuracies. We conducted 20 stratified 10-fold cross-validations for each data set in order to achieve a reliable estimation. The results show that for most data sets, the classifiers using ADE for correction achieve significantly higher accuracies than those without ADE. Even when there is no mislabeled data, a classifier using ADE can achieve a higher accuracy for some data sets, showing the general utility of ADE as a data correcting algorithm.

2. Related work

Some approaches have been previously proposed to handle mislabeled data. Most of them focus on identifying mislabeled instances and then applying a filtering mechanism to remove them from the training set. Several early works on nearest neighbor classifiers [5-7,9] applied various methods to remove noise or combine instances, forming an edited or condensed data set. The edited set was then used to build classifiers for generalization. The main benefit of these approaches is a reduced storage requirement, while still maintaining accuracies similar to or only slightly lower than those obtained with the original data sets.

The algorithm IB3, one version of IBL (instance-based learning) proposed by Aha et al. [1,2], keeps track of the classification accuracy record of each instance in the original data set, and then retains only those instances whose record is better than a certain threshold. The retained data is then used to construct classifiers. They showed that IBL has the capability of removing noise and reducing storage as well. Wilson and Martinez [18,19] proposed several instance-pruning techniques which are noise-tolerant and capable of reducing the number of instances retained in memory while maintaining (and sometimes improving) generalization accuracy.

Gamberger et al. [8] presented a noise detection and elimination method for inductive learning. This method is based on compression measures and the Minimum Description Length principle, and eliminates noisy instances by finding a minimal example set. The method was applied to the CN2 rule induction algorithm on a disease diagnosis domain. The results showed increased accuracy after applying the noise elimination algorithm.

Brodley and Friedl [4] applied an ensemble of classifiers as a filter to identify and eliminate mislabeled training data. An ensemble of three classifiers (a 1-NN, a linear machine and a univariate decision tree) is applied to classify each data instance. An instance is identified as misclassified and removed if all three classifiers output the same class and that class is different from the original labeling. They evaluated their algorithm in an empirical study using a real-world data set consisting of land-cover maps of the Earth's surface, into which certain controlled fractions of mislabeling were introduced. The results demonstrated that classifiers constructed using the filtered data achieve a higher accuracy than those using the original set.

Teng [16,17] introduced a procedure, called polishing, to identify and correct noise both in classes and in attributes. In the first phase (prediction phase), 10-fold cross-validation is applied to partition the data into training and test sets. For each test set, a C4.5 decision tree classifier [15] is constructed from the training set. It is then applied to classify each instance in the test set, and its output is considered a predicted value and used as a reference for correction. To deal with noise in an attribute, the class is treated as an input and the attribute as the output. In the second phase (adjustment phase), the attributes of each misclassified instance are adjusted (replaced with the predicted values from the first phase) so that the instance can be correctly classified. If no combination of attribute value replacements is capable of correcting the classification, its class value is replaced with the predicted one (from the first phase). The procedure was tested on 12 data sets from the UCI machine learning data repository, showing the capability of identifying and correcting noise, and of improving the accuracy of classifiers through using corrected training data.

The ADE procedure presented in this work has the following distinct features compared to previous approaches for similar tasks. (i) In previous work [4,16,17], a data set was first divided into two disjoint sets: a training set and a test set. The noise in the test set was identified through predictions made by a classifier or an ensemble of classifiers constructed from the training set. However, the training set itself contains the same percentage of noise as the test set. A classifier constructed in such a way may not have good quality (especially when a high level of noise exists in the data set) and thus may not be able to make accurate predictions about the noise. In contrast, ADE includes all instances in the process and allows every instance to change its class, without relying on a pre-constructed classifier. (ii) By using a class probability instead of a binary class label, ADE allows a large number of hypotheses about class labelings to interact and compete with each other simultaneously, and lets them smoothly and incrementally converge to an optimal or near-optimal solution. This type of strategy has been shown to be efficient in searching a large solution space for NP-class optimization problems using relaxation-type neural networks [10]. (iii) Compared to other types of classifiers, multi-layer feed-forward networks offer a high capacity for fitting the target function; it has been shown that a network with one hidden layer has the capacity to approximate any function [11]. (iv) Both nominal and numerical attributes can be easily handled by the ADE procedure, in contrast to the restrictions on attribute types imposed by other procedures (for example, each attribute needs to be nominal when using the polishing procedure). (v) Compared to the strategy of removing noise [4,8], correcting mislabeled data is particularly useful for small data sets (for which data is sparse or data collection is costly) because every instance can be used as training data. In comparison, removing part of the data from an already sparse data set could significantly reduce the performance of a classifier trained on it.

3. Algorithm

Let S be an input data set in which some instances have been mislabeled. Our task is to find a procedure that corrects those mislabeled instances and outputs a corrected data set Ŝ. There are various domains in which pattern recognition techniques can be applied. Most of the domains in which we are interested possess the following two properties: (i) a data set contains some degree of regularity (instead of being totally random), which can be discovered and used to build a classifier capable of making predictions better than random guessing; (ii) when a reasonably small fraction of the data set is mislabeled, those regularities will still be maintained to a certain degree, although they may be weakened by the mislabeling.

Let α be the non-mislabeled fraction and β (= 1 - α) the mislabeled fraction of the input data set S. Let S^(c) be the correctly labeled set and S^(m) the mislabeled set (S^(c) + S^(m) = S). The instances in S^(c) have a tendency to strengthen the regularities possessed by S, while those in S^(m) have a tendency to weaken those regularities, due to the random nature of mislabeling. However, if the mislabeled fraction is small (i.e., β << α), the trend of maintaining the regularities due to S^(c) will be dominant. The strategy of ADE is to apply the regularities discovered in S^(c) to correct the mislabeled instances in S^(m).

We use a multi-layer perceptron as the underlying classifier to capture the regularities contained in S. The reason for this choice is that neural networks have demonstrated the capability of detecting and representing regularities (features) in a data set; neural networks with one hidden layer have the capability of approximating any function [11]. We use backpropagation as the training procedure for the network. The format and procedure adopted in our approach are the same as those of standard backpropagation networks except for the following differences. In the standard procedure, each instance v in S has the format:

v = (x, y)    (1)

where x = (x_1, x_2, ..., x_f) is the feature vector of v, f is the number of features, and y is the class (category) label of v.

In our approach, however, we attach a class probability vector to each instance v in S:

v = (x, y, p)    (2)

where p = (p_1, p_2, ..., p_c) is the class probability vector of v, and c is the number of class labels. For each class i, its probability p_i is proportional to the output V_i of the corresponding output node in the network, which is determined by the input U_i of that node through the sigmoid activation function:

V_i = (1/2)(1 + tanh(U_i / u_0))    (3)

where u_0 = 0.02 is the amplification parameter that reflects the steepness of the activation function. The sigmoid function has range [0.0, 1.0].

In addition to updating the weights in the network using the standard backpropagation procedure, we also update the class probability vector p during training. The updating of p depends on the difference between the current p value and the values of the output nodes in the network. For each class i, we first update its input U_i, and then map the updated input to its output V_i (which is proportional to p_i) through the sigmoid function. After each update, p_i gets closer to the output node value. Because the regularities remain in S for a small fraction of mislabeling, the network will gradually become capable of reflecting the regularities after sufficient training. During this process, the correctly labeled set S^(c) plays a more important role than the mislabeled set S^(m) in shaping the weight configuration of the network. The reason is that S^(c) contains more instances than S^(m) and thus, according to the updating rule of backpropagation, it has more opportunities to update the weights. In this way, the weight configuration is gradually changed to reflect the regularities of the data set. Thus if v is a correctly labeled instance, its output vector O = (O_1, O_2, ..., O_c) (where O_i is the value of output node i) will become more consistent with the class probability vector p = (p_1, p_2, ..., p_c) after a certain amount of training. In contrast, if v is a mislabeled instance, O will be less consistent with p, since mislabeled instances do not follow the regularities. However, by updating p so that it becomes closer to O, ADE causes a mislabeled instance to gradually change its class probability vector p and eventually correct the mislabeled class.

The following explains the basic steps of ADE. The weights of the network are initially set randomly with uniform distribution in the range [-0.05, 0.05]. For each instance v = (x, y, p) (where y is the initial class label), its output vector V (proportional to its probability vector p) is initially set as follows: V_y (the output probability for class y) is set to a large fraction D (0.5 < D < 1.0), and (1 - D) is divided equally among the other (C - 1) output components. We tested different D values in the experiment; the results are similar when D is in the range (0.8, 1.0) and drop slowly as D decreases from 0.8 to 0.5. In our experiment we chose D = 0.95. If C = 3 and y = 1, for example, then V_1 = 0.95 and V_2 = V_3 = 0.025. The input U_i is then determined from the corresponding output using the inverse sigmoid function. The initial number of hidden nodes is set to 1.
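As a concrete illustration of Eq. (3) and of this initialization, the sketch below (a hedged reconstruction in Python, not the authors' code) maps between node inputs U and outputs V and builds the initial vectors for an instance with label y; the values u_0 = 0.02 and D = 0.95 are those given in the text, and everything else is an assumption for illustration.

```python
import numpy as np

U0 = 0.02  # amplification parameter u_0 of Eq. (3)

def sigmoid(U):
    """Eq. (3): V = (1/2)(1 + tanh(U / u_0)), with range [0.0, 1.0]."""
    return 0.5 * (1.0 + np.tanh(U / U0))

def inverse_sigmoid(V, eps=1e-9):
    """Inverse of Eq. (3); used to recover the initial inputs U from V."""
    V = np.clip(V, eps, 1.0 - eps)        # keep arctanh finite
    return U0 * np.arctanh(2.0 * V - 1.0)

def init_instance_outputs(y, C, D=0.95):
    """Initial outputs: V_y = D, and the remaining (1 - D) is split
    equally over the other C - 1 components; returns (V, U)."""
    V = np.full(C, (1.0 - D) / (C - 1))
    V[y] = D
    return V, inverse_sigmoid(V)

# The paper's example: C = 3, initial class 1 (0-based index 0 here),
# giving V = [0.95, 0.025, 0.025].
V, U = init_instance_outputs(0, 3)
```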

For each training instance v, the weights in the network and the probability vector p of v are updated using the following procedure:

(i) Update the network weights using the standard backpropagation algorithm, with learning rate L_n and momentum M_n for the net.

(ii) For each class i, update the output class probability using the formula

U_i = U_i + L_p (O_i - V_i)    (4)

where O_i is the value of output node i for instance v, and L_p (different from L_n) is the learning rate for the probability vector. We treat O_i as the target for updating the probability vector. The output V_i is then obtained from the updated input U_i through the sigmoid function (Eq. (3)). After each training epoch, the values V_i for all i (i = 1, 2, ..., C) are normalized so that their sum is 1, and the probability vector p is set equal to V (i.e., p_i = V_i for i = 1, 2, ..., C).

(iii) The class label y of instance v (= (x, y, p)) is also updated, using the formula

y = argmax_i {p_i, i = 1, 2, ..., C}    (5)

that is, y is relabeled to the class with the maximum probability. If v is a mislabeled instance, for example, its class label can be corrected by this mechanism after a certain number of training epochs, which gradually update the class probabilities.

After every N_e epochs, the sum of squared errors (SSE) over all instances in the data set is calculated to monitor the progress of the training. If N_e is too small, more computation is needed; if N_e is too large, the training progress cannot be monitored accurately. We tried different values of N_e and found good performance over a range of values above 5. We chose N_e = 20 in our experiment because it was slightly better than the other choices.

Instead of using the SSE directly, we use an adjusted version, SSE^(adj), calculated using the formula

SSE^(adj) = SSE^(std) + SSE^(hn) + SSE^(dist)    (6)

where SSE^(std) is the standard SSE. SSE^(hn) is an additional term that takes into account the effect of the number of hidden nodes. More hidden nodes can usually lead to a smaller SSE^(std), but with a higher possibility of overfitting. To reduce this effect, we add an error term SSE^(hn) that increases with the number of hidden nodes. We adopt the following empirical formula in ADE:

SSE^(hn) = A_1 (H - 1) N (C - 1)/C                     (H <= I)
SSE^(hn) = (A_1 (I - 1) + A_2 (H - I)) N (C - 1)/C     (H > I)    (7)

where H is the number of hidden nodes, I is the number of input nodes, N is the number of instances in the data set, and C is the number of classes. A_1 and A_2 are two empirical parameters with the constraint A_2 > A_1. We tried different values for A_1 and A_2 and found that performance is good (and similar) when 0.01 < A_1 < 0.1 and 0.1 < A_2 < 0.5. In our experiment, we chose A_1 = 0.05 and A_2 = 0.2. From the formula we see that when H <= I, the error SSE^(hn) is relatively small (A_1 is small); but when H > I, SSE^(hn) increases more rapidly (A_2 >> A_1).

SSE^(dist) is another additional term, which takes into account the deviation of the current class distribution from the initial (original) one. We assume that mislabeling is random in nature: each instance has an equal chance of being mislabeled. Based on this assumption, we can infer that the class distribution of a data set with mislabeling should reflect that of the data set without mislabeling. Thus, if a procedure accurately corrects mislabeled data, the class distribution should be about the same before and after the correcting procedure, and the difference should be very small. That is why we introduce an error term SSE^(dist) that increases with this difference. The class distribution vector q is defined as

q = (q_1, q_2, ..., q_C) = (N_1/N, N_2/N, ..., N_C/N)    (8)

where N is the total number of instances in the data set, and N_i is the number of instances labeled with class i (i = 1, 2, ..., C). Note that an instance u is labeled with class i when i is the class with the maximum magnitude among the C components of its current probability vector p. Let q^(init) = (q_1^(init), q_2^(init), ..., q_C^(init)) and q^(curr) = (q_1^(curr), q_2^(curr), ..., q_C^(curr)) be the initial and current class distribution vectors, respectively. Then SSE^(dist) is calculated using the formula

SSE^(dist) = (N (C - 1)/C) * sum_{i=1}^{C} B_i D_i max(q_i^(curr), q_i^(init))    (9)

where D_i = |q_i^(curr) - q_i^(init)| / q_i^(init) is the difference fraction between q_i^(curr) and q_i^(init), and B_i is an empirical parameter which varies with the range of D_i. We experimented with various settings for B_i, and performance is good through a wide range 0.05 < B_i < 1.5. In our experiment, we set B_i in the following way (since it performed slightly better than other settings): B_i = 0.1 when D_i < 0.05, and B_i = 1.0 when D_i >= 0.05. We can see from Eq. (9) that the error SSE^(dist) increases as the difference between q^(curr) and q^(init) grows; it increases slowly while D_i is small, and more rapidly once D_i surpasses the threshold (0.05).

For a fixed number of hidden nodes H (starting from H = 1 in our experiment), the calculated error SSE^(adj) is compared with the stored best (minimum) previous SSE^(adj) after every N_e (= 20) epochs. If it is smaller, it replaces the previous one as the new best SSE^(adj) and is stored for future comparison and retrieval, along with the current network configuration (weight settings and number of hidden nodes) and the class probability vectors. If no better SSE^(adj) is found after N_m such intervals (equivalent to N_e * N_m = 20 N_m epochs), we assume that the best configuration for the fixed number H of hidden nodes has been found, and we then begin training with H + 1 hidden nodes in an effort to discover an optimal configuration. Different N_m values were evaluated in our experiment. If N_m is too small (< 5), performance drops because the optimal configuration cannot be found; if N_m is too large, computation increases greatly without any performance gain. Performance is about the same as long as N_m >= 10, so we chose N_m = 10 to save computation cost while maintaining performance. If two consecutive additions of hidden nodes do not yield a better result, we assume that the best configuration has been found for the data set (for example, if 4 and 5 hidden nodes do not yield a better result than 3 hidden nodes, we use 3 hidden nodes as the optimal choice). Using the optimal setting, we relabel the data set using the corresponding class probability vectors.
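To make steps (ii)-(iii) and the adjusted error of Eqs. (6)-(9) concrete, here is a minimal sketch in Python. It is an illustrative reconstruction, not the authors' implementation: the backpropagation step (i), the computation of SSE^(std), and the value of the probability learning rate L_p (which the text distinguishes from L_n but does not fix) are assumed to be supplied by the caller.

```python
import numpy as np

U0 = 0.02  # amplification parameter u_0 of Eq. (3)

def sigmoid(U):
    """Eq. (3): V = (1/2)(1 + tanh(U / u_0))."""
    return 0.5 * (1.0 + np.tanh(U / U0))

def update_probability_inputs(U, O, L_p):
    """Step (ii), Eq. (4): U_i <- U_i + L_p (O_i - V_i), treating the
    network outputs O as the target for the probability vector."""
    return U + L_p * (O - sigmoid(U))

def normalize_and_relabel(U):
    """End of epoch: normalize the outputs to sum to 1, set p = V, and
    apply step (iii), Eq. (5): y = argmax_i p_i."""
    V = sigmoid(U)
    p = V / V.sum()
    return p, int(np.argmax(p))

def sse_hn(H, I, N, C, A1=0.05, A2=0.2):
    """Eq. (7): error term penalizing extra hidden nodes H
    (I input nodes, N instances, C classes)."""
    if H <= I:
        return A1 * (H - 1) * N * (C - 1) / C
    return (A1 * (I - 1) + A2 * (H - I)) * N * (C - 1) / C

def sse_dist(q_curr, q_init, N, C):
    """Eq. (9): error term penalizing drift of the current class
    distribution q_curr away from the initial one q_init."""
    D = np.abs(q_curr - q_init) / q_init      # difference fractions D_i
    B = np.where(D < 0.05, 0.1, 1.0)          # empirical B_i schedule
    return N * (C - 1) / C * float(np.sum(B * D * np.maximum(q_curr, q_init)))

def sse_adjusted(sse_std, H, I, q_curr, q_init, N, C):
    """Eq. (6): SSE_adj = SSE_std + SSE_hn + SSE_dist."""
    return sse_std + sse_hn(H, I, N, C) + sse_dist(q_curr, q_init, N, C)
```

In the full procedure, sse_adjusted would be evaluated every N_e = 20 epochs; training with a given H stops after N_m = 10 evaluations without improvement over the stored best value, and the hidden-node search stops when two consecutive increments of H bring no improvement.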
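A sketch of this distribution-preserving mislabeling procedure follows. It is an assumed implementation for illustration (integer labels 0..C-1, and the rounding of βN_i are choices not fixed by the text):

```python
import numpy as np

def mislabel(labels, beta, C, seed=0):
    """Randomly relabel a beta fraction of each class i to the other
    classes, choosing destinations in proportion to the class
    populations q_j of Eq. (8), so the class distribution is kept."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).copy()
    q = np.array([(labels == i).mean() for i in range(C)])
    for i in range(C):
        members = np.flatnonzero(labels == i)
        n_flip = int(round(beta * len(members)))
        flip = rng.choice(members, size=n_flip, replace=False)
        dest = q.copy()
        dest[i] = 0.0                 # never "relabel" to the same class
        dest /= dest.sum()
        labels[flip] = rng.choice(C, size=n_flip, p=dest)
    return labels
```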

We then run ADE on S_m and output a corrected training set S_c. The performance of ADE is evaluated by comparing the test-set accuracies of two classifiers based on the nearest neighbor rule: NNR_c, built from the corrected training set S_c, and NNR_m, built from the mislabeled set S_m without correction (both using 1-nearest neighbor). Both NNR_c and NNR_m use T as the testing set in each iteration.

The nearest neighbor rule (NNR) [7] works as follows. To classify an input instance v, NNR compares v with all instances in the training set, finds the most similar instance u, and then assigns v the same class as u (this is also called 1-NN; a minimal code sketch is given after Table 1). One variation is to classify v based on the top k most similar instances (k-NN) in the training set using a voting mechanism. The accuracy for one stratified 10-fold cross-validation is the total number of correctly classified instances over all 10 iterations divided by the total number of instances in the data set (|S| + |T|). For each data set, we conduct 20 such stratified 10-fold cross-validations and then average them.

Table 1 shows the size and other properties of the data sets: size is the number of instances; #attr is the number of attributes (not including the class); #num is the number of continuous attributes; #symb is the number of nominal attributes; #class is the number of classes.

Table 1. Description of the 24 UCI data sets: australian, balance, crx, echoc, ecoli, hayes, heartc, hearth, horse, iono, iris, led7, led24, lense, lymph, monk1, monk2, monk3, pima, postop, voting, wave21, wave40 and zoo.
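For completeness, a minimal sketch of the 1-NN rule used in the comparison; Euclidean distance over numeric features is an assumption here, as the paper does not restate its similarity measure:

```python
import numpy as np

def nn_classify(x, train_X, train_y):
    """1-NN: assign the query x the class of its most similar training
    instance (squared Euclidean distance assumed)."""
    d = np.sum((np.asarray(train_X) - np.asarray(x)) ** 2, axis=1)
    return train_y[int(np.argmin(d))]
```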

Figures 1 and 2 show simulation results on the 24 tested data sets. In each graph, the two curves display the test-set accuracies of two nearest neighbor classifiers, one without ADE and the other using ADE to correct mislabeled training data, as a function of the mislabeling level β. Each data point represents the accuracy averaged over 20 stratified 10-fold cross-validations, along with the corresponding error bar at a 95% confidence level. The results show that for most of these data sets, the classifier using ADE performs significantly better than the one without ADE, as long as the mislabeled level stays below a moderate threshold. In this range, the correctly labeled data is dominant and is capable of controlling the formation of the network architecture. During this process, the formed network is able to gradually correct the class probability vectors of the mislabeled data.

Fig. 1. Simulation results on real-world domains (panels: Australian, Balance, Crx, Echoc, Ecoli, Hayes, Heart(C), Heart(H), Horse, Iono, Iris, Led7), comparing the test-set accuracies of nearest neighbor classifiers without ADE and with ADE to correct mislabeled training data.

Fig. 2. Simulation results on real-world domains (panels: Led24, Lense, Lymph, Monk1, Monk2, Monk3, Pima, Postop, Voting, Wave21, Wave40, Zoo), comparing the test-set accuracies of nearest neighbor classifiers without ADE and with ADE to correct mislabeled training data.

One observation is that as the mislabeled level increases, the performance of ADE starts to degrade. The reason is that the dominance of the correctly labeled data becomes weaker as the mislabeled level rises. As the level approaches about 50%, there is no longer any obvious dominance by either the correctly or the incorrectly labeled data. This explains why the performance drops dramatically at that point (though the accuracies using ADE are still higher than those without ADE in some cases). The performance of ADE varies with the data set: it is significantly better than not using ADE for most tested data sets, and still slightly better than or similar to it for the others.

Another observation is that even when the mislabeled level is 0 (i.e., without adding any artificially mislabeled data), the accuracy using ADE is still significantly higher than that without ADE for some data sets (australian, crx, echoc, ecoli, hearth, led7, lense, pima, postop, and wave21). This indicates that these data sets may already contain some noise or mislabeling, and using ADE to correct them allows nearest neighbor classifiers to achieve higher test-set accuracies.

5. Summary

In summary, we have presented an approach, ADE, for correcting mislabeled data. In this approach, a class probability vector is attached to each instance, and its value evolves as training continues. ADE combines the backpropagation network with a relaxation mechanism for training. A learning algorithm is proposed to update the class probability vector based on the difference between its current value and the network output value. The architecture, weight settings and output values of the network are determined predominantly by the correctly labeled instances when the mislabeled percentage in a data set is reasonably small. This mechanism enables class label correction by allowing gradual changes to the class probability vectors of mislabeled instances during training. We have tested the performance of ADE on 24 data sets drawn from the UCI data repository by comparing the accuracies of two versions of nearest neighbor classifiers, one using the training set corrected by ADE and the other using the training set without correction. The results show that for most data sets the classifiers based on the training set corrected by ADE perform significantly better than those without ADE.

References

[1] D.W. Aha and D. Kibler, Noise-tolerant instance-based learning algorithms, in: Proceedings of the Eleventh International Joint Conference on Artificial Intelligence, Morgan Kaufmann, Detroit, MI, 1989.
[2] D.W. Aha, D. Kibler and M.K. Albert, Instance-based learning algorithms, Machine Learning 6 (1991), 37-66.
[3] L. Breiman, J.H. Friedman, R.A. Olshen and C.J. Stone, Classification and Regression Trees, Wadsworth International Group, 1984.
[4] C.E. Brodley and M.A. Friedl, Identifying and eliminating mislabeled training instances, in: Proceedings of the Thirteenth National Conference on Artificial Intelligence, 1996.
[5] G.W. Gates, The reduced nearest neighbor rule, IEEE Transactions on Information Theory 18 (1972), 431-433.
[6] B.V. Dasarathy, Nosing around the neighborhood: A new system structure and classification rule for recognition in partially exposed environments, IEEE Transactions on Pattern Analysis and Machine Intelligence 2 (1980), 67-71.
[7] B.V. Dasarathy, Nearest Neighbor (NN) Norms: NN Pattern Classification Techniques, IEEE Computer Society Press, Los Alamitos, CA, 1991.
[8] D. Gamberger, N. Lavrac and S. Dzeroski, Noise elimination in inductive concept learning, in: Proceedings of the 7th International Workshop on Algorithmic Learning Theory, 1996.
[9] P.E. Hart, The condensed nearest neighbor rule, IEEE Transactions on Information Theory 14 (1968), 515-516.
[10] J.J. Hopfield and D.W. Tank, Neural computation of decisions in optimization problems, Biological Cybernetics 52 (1985), 141-152.
[11] K. Hornik, M. Stinchcombe and H. White, Multilayer feedforward networks are universal approximators, Neural Networks 2 (1989), 359-366.
[12] G.H. John, Robust decision trees: Removing outliers from databases, in: Proceedings of the First International Conference on Knowledge Discovery and Data Mining, AAAI Press, Montreal, Quebec, 1995.
[13] R. Kohavi, A study of cross-validation and bootstrap for accuracy estimation and model selection, in: Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 1995.
[14] C.J. Merz and P.M. Murphy, UCI repository of machine learning databases, http://www.ics.uci.edu/~mlearn/MLRepository.html.
[15] J.R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, Los Altos, CA, 1993.
[16] C.M. Teng, Correcting noisy data, in: Proceedings of the 16th International Conference on Machine Learning, 1999.
[17] C.M. Teng, Evaluating noise correction, in: Proceedings of the 6th Pacific Rim International Conference on Artificial Intelligence, Lecture Notes in AI, Springer-Verlag, 2000.
[18] D.R. Wilson and T.R. Martinez, Instance pruning techniques, in: Machine Learning: Proceedings of the Fourteenth International Conference (ICML'97), Morgan Kaufmann, San Francisco, CA, 1997.
[19] D.R. Wilson and T.R. Martinez, Reduction techniques for exemplar-based learning algorithms, Machine Learning 38(3) (2000), 257-286.


More information

Learning-based License Plate Detection on Edge Features

Learning-based License Plate Detection on Edge Features Learnng-based Lcense Plate Detecton on Edge Features Wng Teng Ho, Woo Hen Yap, Yong Haur Tay Computer Vson and Intellgent Systems (CVIS) Group Unverst Tunku Abdul Rahman, Malaysa wngteng_h@yahoo.com, woohen@yahoo.com,

More information

Network Intrusion Detection Based on PSO-SVM

Network Intrusion Detection Based on PSO-SVM TELKOMNIKA Indonesan Journal of Electrcal Engneerng Vol.1, No., February 014, pp. 150 ~ 1508 DOI: http://dx.do.org/10.11591/telkomnka.v1.386 150 Network Intruson Detecton Based on PSO-SVM Changsheng Xang*

More information

Announcements. Supervised Learning

Announcements. Supervised Learning Announcements See Chapter 5 of Duda, Hart, and Stork. Tutoral by Burge lnked to on web page. Supervsed Learnng Classfcaton wth labeled eamples. Images vectors n hgh-d space. Supervsed Learnng Labeled eamples

More information

A User Selection Method in Advertising System

A User Selection Method in Advertising System Int. J. Communcatons, etwork and System Scences, 2010, 3, 54-58 do:10.4236/jcns.2010.31007 Publshed Onlne January 2010 (http://www.scrp.org/journal/jcns/). A User Selecton Method n Advertsng System Shy

More information

Online Detection and Classification of Moving Objects Using Progressively Improving Detectors

Online Detection and Classification of Moving Objects Using Progressively Improving Detectors Onlne Detecton and Classfcaton of Movng Objects Usng Progressvely Improvng Detectors Omar Javed Saad Al Mubarak Shah Computer Vson Lab School of Computer Scence Unversty of Central Florda Orlando, FL 32816

More information

Fuzzy Modeling of the Complexity vs. Accuracy Trade-off in a Sequential Two-Stage Multi-Classifier System

Fuzzy Modeling of the Complexity vs. Accuracy Trade-off in a Sequential Two-Stage Multi-Classifier System Fuzzy Modelng of the Complexty vs. Accuracy Trade-off n a Sequental Two-Stage Mult-Classfer System MARK LAST 1 Department of Informaton Systems Engneerng Ben-Guron Unversty of the Negev Beer-Sheva 84105

More information

Fuzzy Filtering Algorithms for Image Processing: Performance Evaluation of Various Approaches

Fuzzy Filtering Algorithms for Image Processing: Performance Evaluation of Various Approaches Proceedngs of the Internatonal Conference on Cognton and Recognton Fuzzy Flterng Algorthms for Image Processng: Performance Evaluaton of Varous Approaches Rajoo Pandey and Umesh Ghanekar Department of

More information

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster

More information

Biostatistics 615/815

Biostatistics 615/815 The E-M Algorthm Bostatstcs 615/815 Lecture 17 Last Lecture: The Smplex Method General method for optmzaton Makes few assumptons about functon Crawls towards mnmum Some recommendatons Multple startng ponts

More information

Automated Selection of Training Data and Base Models for Data Stream Mining Using Naïve Bayes Ensemble Classification

Automated Selection of Training Data and Base Models for Data Stream Mining Using Naïve Bayes Ensemble Classification Proceedngs of the World Congress on Engneerng 2017 Vol II, July 5-7, 2017, London, U.K. Automated Selecton of Tranng Data and Base Models for Data Stream Mnng Usng Naïve Bayes Ensemble Classfcaton Patrca

More information

Bootstrapping Color Constancy

Bootstrapping Color Constancy Bootstrappng Color Constancy Bran Funt and Vlad C. Carde * Smon Fraser Unversty Vancouver, Canada ABSTRACT Bootstrappng provdes a novel approach to tranng a neural network to estmate the chromatcty of

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

Complex System Reliability Evaluation using Support Vector Machine for Incomplete Data-set

Complex System Reliability Evaluation using Support Vector Machine for Incomplete Data-set Internatonal Journal of Performablty Engneerng, Vol. 7, No. 1, January 2010, pp.32-42. RAMS Consultants Prnted n Inda Complex System Relablty Evaluaton usng Support Vector Machne for Incomplete Data-set

More information

A New Approach For the Ranking of Fuzzy Sets With Different Heights

A New Approach For the Ranking of Fuzzy Sets With Different Heights New pproach For the ankng of Fuzzy Sets Wth Dfferent Heghts Pushpnder Sngh School of Mathematcs Computer pplcatons Thapar Unversty, Patala-7 00 Inda pushpndersnl@gmalcom STCT ankng of fuzzy sets plays

More information

A Weighted Method to Improve the Centroid-based Classifier

A Weighted Method to Improve the Centroid-based Classifier 016 Internatonal onference on Electrcal Engneerng and utomaton (IEE 016) ISN: 978-1-60595-407-3 Weghted ethod to Improve the entrod-based lassfer huan LIU, Wen-yong WNG *, Guang-hu TU, Nan-nan LIU and

More information

An Improved Image Segmentation Algorithm Based on the Otsu Method

An Improved Image Segmentation Algorithm Based on the Otsu Method 3th ACIS Internatonal Conference on Software Engneerng, Artfcal Intellgence, Networkng arallel/dstrbuted Computng An Improved Image Segmentaton Algorthm Based on the Otsu Method Mengxng Huang, enjao Yu,

More information

Biological Sequence Mining Using Plausible Neural Network and its Application to Exon/intron Boundaries Prediction

Biological Sequence Mining Using Plausible Neural Network and its Application to Exon/intron Boundaries Prediction Bologcal Sequence Mnng Usng Plausble Neural Networ and ts Applcaton to Exon/ntron Boundares Predcton Kuochen L, Dar-en Chang, and Erc Roucha CECS, Unversty of Lousvlle, Lousvlle, KY 40292, USA Yuan Yan

More information

Some Advanced SPC Tools 1. Cumulative Sum Control (Cusum) Chart For the data shown in Table 9-1, the x chart can be generated.

Some Advanced SPC Tools 1. Cumulative Sum Control (Cusum) Chart For the data shown in Table 9-1, the x chart can be generated. Some Advanced SP Tools 1. umulatve Sum ontrol (usum) hart For the data shown n Table 9-1, the x chart can be generated. However, the shft taken place at sample #21 s not apparent. 92 For ths set samples,

More information

Feature Reduction and Selection

Feature Reduction and Selection Feature Reducton and Selecton Dr. Shuang LIANG School of Software Engneerng TongJ Unversty Fall, 2012 Today s Topcs Introducton Problems of Dmensonalty Feature Reducton Statstc methods Prncpal Components

More information

Face Recognition Based on SVM and 2DPCA

Face Recognition Based on SVM and 2DPCA Vol. 4, o. 3, September, 2011 Face Recognton Based on SVM and 2DPCA Tha Hoang Le, Len Bu Faculty of Informaton Technology, HCMC Unversty of Scence Faculty of Informaton Scences and Engneerng, Unversty

More information

A MODIFIED K-NEAREST NEIGHBOR CLASSIFIER TO DEAL WITH UNBALANCED CLASSES

A MODIFIED K-NEAREST NEIGHBOR CLASSIFIER TO DEAL WITH UNBALANCED CLASSES A MODIFIED K-NEAREST NEIGHBOR CLASSIFIER TO DEAL WITH UNBALANCED CLASSES Aram AlSuer, Ahmed Al-An and Amr Atya 2 Faculty of Engneerng and Informaton Technology, Unversty of Technology, Sydney, Australa

More information

An Evolvable Clustering Based Algorithm to Learn Distance Function for Supervised Environment

An Evolvable Clustering Based Algorithm to Learn Distance Function for Supervised Environment IJCSI Internatonal Journal of Computer Scence Issues, Vol. 7, Issue 5, September 2010 ISSN (Onlne): 1694-0814 www.ijcsi.org 374 An Evolvable Clusterng Based Algorthm to Learn Dstance Functon for Supervsed

More information

A Simple and Efficient Goal Programming Model for Computing of Fuzzy Linear Regression Parameters with Considering Outliers

A Simple and Efficient Goal Programming Model for Computing of Fuzzy Linear Regression Parameters with Considering Outliers 62626262621 Journal of Uncertan Systems Vol.5, No.1, pp.62-71, 211 Onlne at: www.us.org.u A Smple and Effcent Goal Programmng Model for Computng of Fuzzy Lnear Regresson Parameters wth Consderng Outlers

More information

CSCI 5417 Information Retrieval Systems Jim Martin!

CSCI 5417 Information Retrieval Systems Jim Martin! CSCI 5417 Informaton Retreval Systems Jm Martn! Lecture 11 9/29/2011 Today 9/29 Classfcaton Naïve Bayes classfcaton Ungram LM 1 Where we are... Bascs of ad hoc retreval Indexng Term weghtng/scorng Cosne

More information

Image Representation & Visualization Basic Imaging Algorithms Shape Representation and Analysis. outline

Image Representation & Visualization Basic Imaging Algorithms Shape Representation and Analysis. outline mage Vsualzaton mage Vsualzaton mage Representaton & Vsualzaton Basc magng Algorthms Shape Representaton and Analyss outlne mage Representaton & Vsualzaton Basc magng Algorthms Shape Representaton and

More information

Optimal Workload-based Weighted Wavelet Synopses

Optimal Workload-based Weighted Wavelet Synopses Optmal Workload-based Weghted Wavelet Synopses Yoss Matas School of Computer Scence Tel Avv Unversty Tel Avv 69978, Israel matas@tau.ac.l Danel Urel School of Computer Scence Tel Avv Unversty Tel Avv 69978,

More information