A Balanced Ensemble Approach to Weighting Classifiers for Text Classification


Gabriel Pui Cheong Fung 1, Jeffrey Xu Yu 1, Haixun Wang 2, David W. Cheung 3, Huan Liu 4

1 The Chinese University of Hong Kong, Hong Kong, China, {pcfung,yu}@se.cuhk.edu.hk
2 IBM T. J. Watson Research Center, New York, USA, haixun@us.ibm.com
3 The University of Hong Kong, Hong Kong, China, dcheung@cs.hku.hk
4 Arizona State University, Arizona, USA, hliu@asu.edu

Abstract

This paper studies the problem of constructing an effective heterogeneous ensemble classifier for text classification. One major challenge of this problem is to formulate a good combination function, which combines the decisions of the individual classifiers in the ensemble. We show that classification performance is affected by three weight components, and that all three should be included in deriving an effective combination function: (1) Global effectiveness, which measures the effectiveness of a member classifier in classifying a set of unseen documents; (2) Local effectiveness, which measures the effectiveness of a member classifier in classifying the particular domain of an unseen document; and (3) Decision confidence, which describes how confident a classifier is when making a decision on a specific unseen document. We propose a new balanced combination function, called Dynamic Classifier Weighting (DCW), that incorporates the aforementioned three components. The empirical study demonstrates that the new combination function is highly effective for text classification.

1 Introduction

Let U be a set of unseen documents and C be a set of predefined categories. Automated text classification is the process of labeling U with C, such that every d ∈ U is assigned to some of the categories in C. Note that d may be assigned to none of the categories in C. If the number of categories in C is more than two (|C| > 2), it is a multi-label text classification problem. Since every multi-label text classification problem can be transformed into a binary-label text classification problem, we focus on the binary problem in this paper (|C| = 2). Let c ∈ C.
Binary-label text classification is to construct a binary classifier, denoted by Φ(·), for c such that:

    Φ(d) = { 1   if f(d) > 0,
           { -1  otherwise,                                  (1)

where Φ(d) = 1 indicates that d belongs to c and Φ(d) = -1 indicates that d does not belong to it. f(·) ∈ R is a decision function. Every classifier Φ has its own decision function f(·); if there are m different classifiers, there are m different decision functions. The goal of constructing a binary classifier Φ(·) is to approximate the unknown true target function Φ̃(·), so that Φ(·) and Φ̃(·) coincide as much as possible [17].

In order to improve effectiveness, ensemble classifiers (a.k.a. classifier committees) were proposed [1, 3, 5, 6, 7, 8, 9, 15, 16, 17, 18, 19]. An ensemble classifier is constructed by grouping a number of member classifiers. If the decisions of the member classifiers are combined properly, the ensemble is robust and effective. There are two kinds of ensemble classifiers: homogeneous and heterogeneous. A homogeneous ensemble classifier contains binary classifiers that are all constructed by the same learning algorithm; bagging and boosting [19] are two common techniques [1, 15, 16, 18]. A heterogeneous ensemble classifier contains binary classifiers constructed by different learning algorithms (e.g., one SVM classifier and one kNN classifier grouped together) [19]. The individual decisions of the classifiers in the ensemble are combined (e.g., through stacking [19]):

    Θ(d) = { 1   if g(Φ_1(d), Φ_2(d), ..., Φ_m(d)) > 0,
           { -1  otherwise,                                  (2)

where Θ(·) is an ensemble classifier and g(·) is a combination function that combines the outputs of all Φ_i(·). The effectiveness of the ensemble classifier Θ(·) depends on the effectiveness of g(·). In this paper, we concentrate on analyzing heterogeneous ensemble classifiers; our problem is thus to examine how to formulate a good g(·). Four widely used g(·) are: (1) Majority voting (MV) [8, 9]; (2) Weighted linear combination (WLC) [7]; (3) Dynamic classifier selection (DCS) [3, 8, 6, 5]; and (4) Adaptive classifier combination (ACC) [8, 9].
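To make the structure of Eqs. (1) and (2) concrete, the following is a minimal sketch in Python. The decision functions here are made-up toy stand-ins (not the classifiers used in this paper), and the combination function shown is plain majority voting:

```python
# Sketch of Eq. (1) and Eq. (2): member classifiers Phi_i are built from
# decision functions f_i, and an ensemble Theta combines their votes via g.
# The decision functions below are illustrative toys, not real learned models.

def make_classifier(f):
    """Wrap a decision function f into a binary classifier Phi (Eq. 1)."""
    return lambda d: 1 if f(d) > 0 else -1

def make_ensemble(classifiers, g):
    """Build Theta from member classifiers and a combination function g (Eq. 2)."""
    return lambda d: 1 if g([phi(d) for phi in classifiers]) > 0 else -1

def majority_vote(decisions):
    """Majority voting (MV): an unweighted sum of the +1/-1 votes."""
    return sum(decisions)

# Toy decision functions standing in for, e.g., SVM, kNN and Rocchio scores.
f1 = lambda d: d["score"] - 0.5
f2 = lambda d: 2.0 * d["score"] - 0.8
f3 = lambda d: -d["score"] + 0.2

theta = make_ensemble([make_classifier(f) for f in (f1, f2, f3)], majority_vote)
print(theta({"score": 0.7}))  # prints 1: two of the three members vote +1
```

Swapping `majority_vote` for a weighted g(·) is exactly the design space the four combination functions above explore.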
Except for MV, the other three functions assign different weights to the classifiers in the ensemble; the bigger the weight, the more effective that classifier is taken to be.

Figure 1. Illustration of local effectiveness and decision confidence.

In MV, all classifiers in the ensemble are equally weighted. It can end up with a wrong decision if the minority votes are significant. WLC assigns static weights to the classifiers based on their performance on validation data. However, a generally well-performing classifier can perform poorly in some specific domains. For instance, on the benchmark Reuters21578 the micro-F1 score of SVM is higher than that of Naive Bayes (NB); in this sense, SVM excels NB. Yet, for the categories Potato and Retail in Reuters21578, the F1 scores for NB are both 0.667, but are both 0.0 for SVM. DCS and ACC weight the classifiers by partitioning the validation data (domain specific); they do not combine the classifiers' decisions, but select one of the classifiers from the ensemble and rely on it solely. We will show in the experiments that this leads to inferior results.

In this paper, we propose a new combination function called Dynamic Classifier Weighting (DCW). We consider three components when combining classifiers: (1) Global effectiveness, the effectiveness of a classifier in an ensemble when it classifies a set of unseen documents; (2) Local effectiveness, the effectiveness of a classifier in an ensemble when it classifies the particular domain of the unseen document; and (3) Decision confidence, the confidence of a classifier when it makes a decision for a specific unseen document.

2 Motivations

Let Φ_1(·), Φ_2(·), ..., Φ_m(·) be m different binary classifiers and f_1(·), f_2(·), ..., f_m(·) be their corresponding decision functions. Conceptually, Φ_i(·) divides the entire domain into two parts according to f_i(·). Figure 1 illustrates this idea. The dashed lines are the decision boundaries. If the unseen document d falls into the upper (lower) triangle, it is labeled as positive (negative). Usually, the further d is from the decision boundary, the more confident the decision of Φ_i(d). Every classifier has different effectiveness.
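The geometry just described can be sketched with a linear decision function, where the magnitude |f(d)| grows with the distance of the document from the boundary. The weight vector and points below are illustrative, not taken from the paper:

```python
# Sketch of the geometry behind Figure 1: for a linear decision function
# f(d) = w . x + b, the magnitude |f(d)| is proportional to the distance of
# the feature vector x from the decision boundary, so documents farther from
# the dashed line receive more confident decisions.

def f(x, w=(1.0, -1.0), b=0.0):
    """A linear decision function; its sign is the class, |f| the confidence."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

near = (0.75, 0.5)  # just above the boundary x1 = x2
far  = (2.0, -1.0)  # well inside the positive region

print(f(near))  # 0.25 -> positive, but barely
print(f(far))   # 3.0  -> positive, with high confidence
```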
For instance, Support Vector Machines (SVM) are regarded as more accurate (effective) than Naive Bayes (NB) [20]. Although this does not imply that every decision made by SVM must be superior to that of NB, it does imply that we should value the judgment of SVM more highly than that of NB in general. In this paper, we term this kind of effectiveness the global effectiveness of a classifier, denoted by α (e.g., α_SVM > α_NB). α gives us good insight into how to weight the classifiers in an ensemble. Intuitively, if we construct an ensemble classifier by grouping Φ_a(·) and Φ_b(·) together, where α_a > α_b, then we should value Φ_a(·) more highly than Φ_b(·). Yet, a globally effective classifier may sometimes perform poorly on some specific dataset (domain). As an example, consider two classifiers, SVM and NB. According to the benchmark Reuters21578, the micro-F1 score of SVM is higher than that of NB. Unfortunately, the F1 score of SVM when classifying Retail (Retail ∈ Reuters21578) is 0.0, whereas it is 0.667 for NB. As a result, an effective classifier may not always perform well in all domains (e.g., SVM performs poorly on Retail). This can be further illustrated in Figure 1. The two ovals, A and B, represent two different domains. Oval A covers the decision boundary, whereas Oval B resides in the lower triangle. All of the documents within the domain of Oval A are aligned near the decision boundary; an unseen document belonging to this domain may easily be classified wrongly. On the other hand, the documents within the domain of Oval B are well separated by the decision boundary; an unseen document belonging to this domain will most likely be classified correctly. So, the effectiveness of the classifier also relies on the domain of the unseen data. We term this kind of effectiveness the local effectiveness of the classifier, denoted by β. β helps us adjust the weights of the classifiers in the ensemble. If the α of Φ_i is very high but Φ_i is not effective in classifying the domain of the unseen document, we should reconsider its effectiveness.
For every decision a classifier makes, one may ask how confident the classifier is about that decision. Consider two unseen documents, document 1 and document 2, in the same domain (Oval B) in Figure 1. While both reside near the border of their domain, document 2 is located closer to the decision boundary (the dashed line) whereas document 1 is located far away from it. Since both documents belong to the same domain, the local effectiveness of the classifier upon them is the same. Yet, the confidence in making a correct decision for document 1 should be higher than that for document 2, as document 1 is further away from the decision boundary (d_1 > d_2). In this paper, we term this the decision confidence. It is estimated according to the distance between the unseen document and the decision boundary.

We summarize the need for the above components as follows. If we ignore α, over-fitting may result, as we neglect the combined influence of all domains. If we ignore β, over-generalization may ensue, as we neglect the domain where the unseen document appears. Since neither α nor β measures the classifier's decision confidence, γ is proposed; it indicates how much confidence a classifier has when it classifies the unseen documents.

3 Dynamic Classifier Weighting (DCW)

In the previous section, we explained why the three weight components (α, β and γ) are helpful in constructing an effective combination function g(·). We now describe how they are estimated and how they are combined in an ensemble classifier.

α_i is the effectiveness of classifier Φ_i when we use it to classify a set of unseen documents. During the training phase, although we do not have a set of labeled unseen documents, we can estimate α_i from the training data D by 10-fold cross-validation. While our experience suggests that estimating the effectiveness of a classifier by cross-validation always yields a more optimistic result than evaluating it on unseen data, this is not a problem in our situation: we are not targeting the real global effectiveness of the classifiers, but aiming at obtaining their relative global effectiveness. We normalize α_i such that 0 < α_i < 1 and Σ_{i=1}^{m} α_i = 1.

β_i is the effectiveness of classifier Φ_i when we use it to classify the domain of the unseen document d. For an unseen document, we never know the true domain of d. As above, we can only estimate its domain from the training data D. Let D' be a subset of documents in the training data, i.e., D' ⊆ D. We find the domain of the unseen document d by using D' to collect the documents in D that are similar to d. Accordingly, the extraction of D' is based on a nearest-neighbor strategy: we extract the top n documents most similar to d from D. The value n can readily be obtained through a validation dataset. The similarities among these n documents are measured by the cosine coefficient [13]. Since D' is a subset of the training data (D' ⊆ D), we know precisely the labels of the documents in D'. We estimate β_i by evaluating Φ_i on D' using the F1 score.
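As an illustrative sketch (documents represented as {term: weight} dicts, helper names our own rather than the paper's implementation), the nearest-neighbor domain extraction and the F1 evaluation behind β might look like:

```python
import math

# Sketch of the local-effectiveness estimate: find the domain D' of an unseen
# document d as its n nearest training neighbors under the cosine coefficient,
# then score a classifier on D' with F1. The {term: weight} representation and
# the helper names are illustrative assumptions.

def cosine(u, v):
    """Cosine coefficient between two sparse term-weight vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def domain_of(d, train_docs, n):
    """D': the top-n training documents most similar to d."""
    return sorted(train_docs, key=lambda x: cosine(d, x["vec"]), reverse=True)[:n]

def f1_on_domain(phi, domain):
    """F1 of classifier phi over D' (training labels are known exactly)."""
    tp = sum(1 for x in domain if phi(x["vec"]) == 1 and x["label"] == 1)
    fp = sum(1 for x in domain if phi(x["vec"]) == 1 and x["label"] == -1)
    fn = sum(1 for x in domain if phi(x["vec"]) == -1 and x["label"] == 1)
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

train = [
    {"vec": {"a": 1.0}, "label": 1},
    {"vec": {"b": 1.0}, "label": -1},
    {"vec": {"a": 0.5, "b": 0.5}, "label": 1},
]
domain = domain_of({"a": 1.0}, train, n=2)
print(len(domain))  # 2
```

Because D' is drawn from the training data, the labels needed for the F1 score are known exactly.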
β_i is normalized such that 0 < β_i < 1 and Σ_{i=1}^{m} β_i = 1.

γ_i is a measure of how confident classifier Φ_i is when it makes a decision on d. From Eq. (1), the classification decision of the classifier Φ_i(·) is based on its decision function f_i(·). In most cases, if not all, the higher the magnitude of f_i(·), the more confident its decision. Consequently, we can compute γ_i by using the decision function f_i(·). Unfortunately, the range of f_i(·) varies among different algorithms. For example, Φ_i(·) may have f_i(·) in the range of [-1, 1], whereas Φ_j(·) may have another f_j(·) in the range of (-∞, +∞). Since different decision functions have different ranges, a direct comparison among them is inappropriate. We solve the problem as follows. Let D' be the domain of the unseen document, obtained by the technique described previously. We compute γ_i as:

    γ_i = |f_i(d)| / µ_i,                                    (3)

    µ_i = (1 / |D'|) Σ_{d' ∈ D'} |f_i(d')|,                  (4)

where µ_i is the average confidence of the decisions made by f_i(·) among the documents in D'. Since D' ⊆ D, we can presume that µ_i is non-zero. When γ_i > 1, f_i(d) has more than average confidence to make a correct classification of d, where d is far away from the decision boundary (e.g., document 1 in Figure 1). When γ_i < 1, the decision function f_i(d) has less than average confidence to make a correct classification of d, where d is closer to the decision boundary (e.g., document 2 in Figure 1). We normalize γ_i such that 0 < γ_i < 1 and Σ_{i=1}^{m} γ_i = 1.

We now present how α_i, β_i and γ_i are combined. Assume that there are m classifiers in the ensemble. In its simplest form, the combination function g(·) is:

    g(·) = Σ_{i=1}^{m} decision_i,                           (5)

where decision_i = Φ_i(d) ∈ {1, -1} (Eq. (1)). Here, all classifiers in the ensemble are equally weighted (i.e., MV). In DCW, since a confidence γ_i is associated with each decision, we instead take:

    g(·) = Σ_{i=1}^{m} decision_i × γ_i.                     (6)

Yet, even for a confident decision, we need to review whether the classifier that makes this decision is effective in the ensemble. Consequently:

    g(·) = Σ_{i=1}^{m} decision_i × γ_i × effectiveness_i.   (7)
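Under the assumptions that decision scores are plain floats and that the per-classifier effectiveness weights are precomputed, Eqs. (3)-(7) can be sketched as:

```python
# Sketch of Eqs. (3)-(7): gamma compares the magnitude of f_i on the unseen
# document with the average magnitude of f_i over its domain D', and g(.)
# sums the +1/-1 decisions weighted by confidence and effectiveness. The
# effectiveness weights are assumed precomputed; all numbers are made up.

def gamma_raw(f_d, f_scores_on_domain):
    """|f_i(d)| / mu_i, where mu_i averages |f_i| over D' (Eqs. 3 and 4)."""
    mu = sum(abs(s) for s in f_scores_on_domain) / len(f_scores_on_domain)
    return abs(f_d) / mu

def normalize(weights):
    """Scale positive weights so they sum to one, as done for alpha/beta/gamma."""
    total = sum(weights)
    return [w / total for w in weights]

def g_weighted(decisions, gammas, effectiveness):
    """Eq. (7): sum_i decision_i * gamma_i * effectiveness_i."""
    return sum(d * c * e for d, c, e in zip(decisions, gammas, effectiveness))

# A score twice the domain average gives gamma > 1; half the average, gamma < 1.
print(gamma_raw(2.0, [1.0, -1.0, 1.0]))  # 2.0
print(gamma_raw(0.5, [1.0, -1.0, 1.0]))  # 0.5

# Two confident, effective members voting +1 outweigh one dissenting member.
print(g_weighted([-1, 1, 1], [0.2, 0.4, 0.4], [0.1, 0.5, 0.4]))
```

Setting effectiveness_i to the product of the global and local effectiveness gives the full DCW combination function.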
Since there are two kinds of effectiveness for each classifier (α_i and β_i), we have:

    g(·) = Σ_{i=1}^{m} Φ_i(d) × α_i × β_i × γ_i.             (8)

4 Experimental Study

The purpose of the experiments is twofold. (1) We want to examine how effective Dynamic Classifier Weighting (DCW) is when compared with the other kinds of heterogeneous ensemble classifiers. As such, we implemented four existing ensemble classifiers for comparison:

Majority voting (MV) [8, 9], Weighted linear combination (WLC) [7], Dynamic classifier selection (DCS) [3, 8, 6, 5], and Adaptive classifier combination (ACC) [8, 9]. We report the results in Section 4.1. (2) We want to understand how significant the results are whenever one of the ensemble classifiers outperforms the others. As such, we performed a pairwise significance test in Section 4.2.

Table 1. The micro-F1 results of the different ensemble classifiers (MV, WLC, DCS, ACC, DCW) on Reuters21578 and Newsgroup20 for eleven combinations of member classifiers: (1) S+N; (2) S+R; (3) S+K; (4) R+N; (5) K+N; (6) K+R; (7) S+K+R; (8) S+K+N; (9) S+R+N; (10) K+R+N; (11) S+K+R+N.

In the experiments, two benchmarks are used: Reuters21578 and Newsgroup20. For Reuters21578, we separate the dataset into training data and testing data using the ModApte split [2]. For Newsgroup20, for each of the categories, we randomly select 80% of the postings as training data and the remainder as testing data. For data preprocessing, punctuation, numbers, web page addresses, and email addresses are removed. All features are stemmed, converted to lower case, and weighted using the standard tf-idf schema [14]. Features that appear in only one document are ignored. All features are ranked based on the NGL coefficient [12], and the top X features are selected. This X is tuned for different classifiers and for different benchmarks.

For creating the ensemble classifiers, different combinations of four kinds of classifiers are used: (1) Support Vector Machines (SVM); (2) k-Nearest Neighbor (kNN); (3) Rocchio (ROC); (4) Naive Bayes (NB). Their default settings are as follows. For SVM, we use a linear kernel with C = 1.0; no feature selection is required [4]. For kNN, we set k = 50 and select 2,750 and 4,900 features for Reuters21578 and Newsgroup20, respectively. For ROC, we implement the version in [11] and select 2,750 and 7,500 features for Reuters21578 and Newsgroup20, respectively.
For NB, we implement the multinomial version [10] and select 2,750 and 9,500 features for Reuters21578 and Newsgroup20, respectively.

4.1 Effectiveness Analysis

Table 1 shows the micro-F1 scores of all ensemble classifiers (MV, WLC, DCS, ACC and DCW) when they are created using different combinations of the binary classifiers on both benchmarks. The leftmost column denotes which binary classifiers are used for creating the corresponding ensemble classifier. We use S, K, R and N to denote SVM, kNN, Rocchio and Naive Bayes, respectively. For example, S+K+R represents an ensemble classifier comprised of SVM, kNN and Rocchio. Note that MV cannot be created if the number of binary classifiers in the ensemble is even, hence the missing entries in Table 1.

At first glance, the results are promising. DCW, the proposed approach, dominates all the other approaches when they are created using the same set of binary classifiers. Similar results are obtained when we use the macro-F1 score. The only case where DCW performs worse is case 6, where DCW is created from kNN and Rocchio (K+R) and evaluated on Reuters21578; its micro-F1 is 0.831, which is lower than that of DCS. Nevertheless, such a difference is negligible.

Concerning DCW, the best combination of binary classifiers in the ensemble for Reuters21578 is SVM and Rocchio (case 2); it is also the best result obtained among all of the ensemble classifiers that we have evaluated. For Newsgroup20, the best result is obtained by combining SVM, kNN and Rocchio (case 8); it is also the best result obtained among all approaches.

The philosophy of MV is to take the majority agreement among the binary classifiers in the ensemble. Hence, the number of binary classifiers must be odd, so we can only create MV using three different binary classifiers. Interestingly, all combinations perform similarly.

Concerning WLC, for its best combination for Reuters21578 (case 2), its micro-F1 score is 0.883, which is higher than that of all the other ensemble classifiers (except DCW).
For Newsgroup20, similar observations are made, where its best combination is case 8. Although the idea of WLC is very simple (it assigns static weights to the classifiers in the ensemble according to their global effectiveness and combines them linearly), it performs surprisingly well. Another interesting finding is that when SVM is included in the ensemble, the effectiveness of WLC increases dramatically. This suggests that the choice of the classifiers in WLC is particularly important.

Concerning DCS, its best micro-F1 score for Reuters21578 (case 2) lags far behind all the other approaches, and for Newsgroup20 none of its F1 scores is competitive. We believe DCS performs poorly because: (1) it does not combine the classifiers' decisions, but selects one classifier from the ensemble and relies on it completely; and (2) it pays attention neither to the global effectiveness of the classifiers nor to the decision confidence.

ACC performs slightly better than DCS. This may be because the decision strategy of ACC is more sophisticated than that of DCS. The best ensembles for Reuters21578 and Newsgroup20 are both case 7. However, these results are all inferior to both WLC and our DCW.

4.2 Significance Test

In this section, we conduct a pairwise comparison among the ensemble classifiers using the significance test [20]. Given two classifiers, Φ_A(·) and Φ_B(·), the significance test determines whether Φ_A(·) performs better than Φ_B(·) based on the errors that Φ_A(·) and Φ_B(·) make. Let N be the total number of unseen documents, and let a_i ∈ {0, 1} (b_i ∈ {0, 1}) indicate whether Φ_A(·) (Φ_B(·)) makes a correct classification on the i-th unseen document: a_i = 0 means Φ_A(·) classifies it incorrectly, whereas a_i = 1 means Φ_A(·) classifies it correctly; the same definition applies to b_i. Let d_a be the number of times that Φ_A(·) performs better than Φ_B(·) (a_i = 1, b_i = 0), and d_b the number of times that Φ_B(·) performs better than Φ_A(·) (a_i = 0, b_i = 1). In this test, the null hypothesis is that both classifiers perform the same (H0: d_a = d_b); the alternative is that Φ_A(·) and Φ_B(·) perform differently (H1: d_a ≠ d_b).

Table 2 shows the results of comparing the performance of DCW with the other ensemble classifiers. A ≫ B means A performs significantly better than B (P-value ≤ 0.01). A > B means A performs slightly better than B.
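One standard way to test H0: d_a = d_b is an exact two-sided sign test over the documents on which the two classifiers disagree in correctness. The paper does not spell out its exact statistic, so the following is an assumption-laden sketch of the d_a/d_b counting and one such test:

```python
import math

# Sketch of the pairwise significance test as a two-sided sign test: only
# documents where exactly one of the two classifiers is correct count, and
# under H0 each such document is a fair coin flip. The exact binomial p-value
# below is one standard realization; the paper's precise statistic may differ.

def sign_test(a, b):
    """a, b: per-document correctness (1 correct, 0 incorrect) for A and B."""
    d_a = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)  # A right, B wrong
    d_b = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)  # B right, A wrong
    n, k = d_a + d_b, min(d_a, d_b)
    if n == 0:
        return d_a, d_b, 1.0
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    p = min(1.0, 2 * tail)  # two-sided p-value
    return d_a, d_b, p

print(sign_test([1] * 9 + [0], [0] * 9 + [1]))  # (9, 1, 0.021484375)
```

With d_a = 9 and d_b = 1, for instance, this test rejects H0 at the 0.05 level but not at 0.01.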
A ~ B means there is no evidence that A and B differ in terms of the errors they make. A summary is given below:

Reuters21578: {DCW, WLC} > {MV, ACC} ≫ DCS
Newsgroup20: DCW > WLC > ACC ~ MV ≫ DCS

Table 2. Results of the significance test (pairwise comparisons of DCW with MV, WLC, DCS and ACC on Reuters21578 and Newsgroup20).

5 Conclusions

In order to formulate an effective combination function for a heterogeneous ensemble classifier, three weight components are necessary: global effectiveness, local effectiveness, and decision confidence. We compared DCW with four other kinds of heterogeneous ensemble classifiers using two benchmarks. The results indicate that DCW can effectively balance the contributions of the three components and outperforms the existing approaches.

References

[1] W. W. Cohen and Y. Singer. Context-sensitive learning methods for text categorization. ACM Transactions on Information Systems (TOIS), 17(2).
[2] F. Debole and F. Sebastiani. An analysis of the relative hardness of Reuters-21578 subsets. Journal of the American Society for Information Science and Technology, 56(6).
[3] G. Giacinto and F. Roli. Adaptive selection of image classifiers. In Proceedings of the 9th International Conference on Image Analysis and Processing (ICIAP 97), pages 38-45, Florence, Italy.
[4] T. Joachims. Text categorization with support vector machines: Learning with many relevant features. In Proceedings of the 10th European Conference on Machine Learning (ECML 98), Chemnitz, Germany.
[5] K. Woods, W. P. Kegelmeyer Jr., and K. Bowyer. Combination of multiple classifiers using local accuracy estimates. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 19(4).
[6] W. Lam and K.-Y. Lai. A meta-learning approach for text categorization. In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 01), New Orleans, Louisiana, USA.
[7] L. S. Larkey and W. B. Croft. Combining classifiers in text categorization. In Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 96), Zurich, Switzerland.
[8] Y. H. Li and A. K. Jain. Classification of text documents. The Computer Journal, 41(8).
[9] R. Liere and P. Tadepalli. Active learning with committees for text categorization. In Proceedings of the 14th National Conference on Artificial Intelligence (AAAI 97), Providence, Rhode Island.
[10] A. McCallum and K. Nigam. A comparison of event models for Naive Bayes text classification. In The 15th National Conference on Artificial Intelligence (AAAI 98) Workshop on Learning for Text Categorization.
[11] A. Moschitti. A study on optimal parameter tuning for Rocchio text classifier. In Proceedings of the 25th European Conference on Information Retrieval Research (ECIR 03), Pisa, Italy.
[12] H. T. Ng, W. B. Goh, and K. L. Low. Feature selection, perceptron learning, and a usability case study for text categorization. In Proceedings of the 20th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 97), pages 67-73, Philadelphia, PA, USA.
[13] E. Rasmussen. Clustering algorithms. In W. B. Frakes and R. Baeza-Yates, editors, Information Retrieval: Data Structures & Algorithms. Prentice Hall PTR.
[14] G. Salton and C. Buckley. Term-weighting approaches in automatic text retrieval. Information Processing and Management (IPM), 24(5).
[15] R. E. Schapire and Y. Singer. BoosTexter: A boosting-based system for text categorization. Machine Learning, 39(2-3).
[16] R. E. Schapire, Y. Singer, and A. Singhal. Boosting and Rocchio applied to text filtering. In Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 98), Melbourne, Australia.
[17] F. Sebastiani. Machine learning in automated text categorization. ACM Computing Surveys, 34(1):1-47.
[18] S. M. Weiss, C. Apte, F. J. Damerau, D. E. Johnson, F. J. Oles, T. Goetz, and T. Hampp. Maximizing text-mining performance. IEEE Intelligent Systems, 14(4):63-69.
[19] I. H. Witten and E. Frank. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann, second edition.
[20] Y. Yang and X. Liu. A re-examination of text categorization methods. In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 99), pages 42-49, Berkeley, California, USA.

Optimally Combining Positive and Negative Features for Text Categorization

Optimally Combining Positive and Negative Features for Text Categorization Optally Cobnng Postve and Negatve Features for Text Categorzaton Zhaohu Zheng ZZHENG3@CEDAR.BUFFALO.EDU Rohn Srhar ROHINI@CEDAR.BUFFALO.EDU CEDAR, Dept. of Coputer Scence and Engneerng, State Unversty

More information

What is Object Detection? Face Detection using AdaBoost. Detection as Classification. Principle of Boosting (Schapire 90)

What is Object Detection? Face Detection using AdaBoost. Detection as Classification. Principle of Boosting (Schapire 90) CIS 5543 Coputer Vson Object Detecton What s Object Detecton? Locate an object n an nput age Habn Lng Extensons Vola & Jones, 2004 Dalal & Trggs, 2005 one or ultple objects Object segentaton Object detecton

More information

Using Gini-Index for Feature Selection in Text Categorization

Using Gini-Index for Feature Selection in Text Categorization 3rd Internatonal Conference on Inforaton, Busness and Educaton Technology (ICIBET 014) Usng Gn-Index for Feature Selecton n Text Categorzaton Zhu Wedong 1, Feng Jngyu 1 and Ln Yongn 1 School of Coputer

More information

Merging Results by Using Predicted Retrieval Effectiveness

Merging Results by Using Predicted Retrieval Effectiveness Mergng Results by Usng Predcted Retreval Effectveness Introducton Wen-Cheng Ln and Hsn-Hs Chen Departent of Coputer Scence and Inforaton Engneerng Natonal Tawan Unversty Tape, TAIWAN densln@nlg.cse.ntu.edu.tw;

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Solutions to Programming Assignment Five Interpolation and Numerical Differentiation

Solutions to Programming Assignment Five Interpolation and Numerical Differentiation College of Engneerng and Coputer Scence Mechancal Engneerng Departent Mechancal Engneerng 309 Nuercal Analyss of Engneerng Systes Sprng 04 Nuber: 537 Instructor: Larry Caretto Solutons to Prograng Assgnent

More information

On-line Scheduling Algorithm with Precedence Constraint in Embeded Real-time System

On-line Scheduling Algorithm with Precedence Constraint in Embeded Real-time System 00 rd Internatonal Conference on Coputer and Electrcal Engneerng (ICCEE 00 IPCSIT vol (0 (0 IACSIT Press, Sngapore DOI: 077/IPCSIT0VNo80 On-lne Schedulng Algorth wth Precedence Constrant n Ebeded Real-te

More information

Low training strength high capacity classifiers for accurate ensembles using Walsh Coefficients

Low training strength high capacity classifiers for accurate ensembles using Walsh Coefficients Low tranng strength hgh capacty classfers for accurate ensebles usng Walsh Coeffcents Terry Wndeatt, Cere Zor Unv Surrey, Guldford, Surrey, Gu2 7H t.wndeatt surrey.ac.uk Abstract. If a bnary decson s taken

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

A Cluster Tree Method For Text Categorization

A Cluster Tree Method For Text Categorization Avalable onlne at www.scencedrect.co Proceda Engneerng 5 (20) 3785 3790 Advanced n Control Engneerngand Inforaton Scence A Cluster Tree Meod For Text Categorzaton Zhaoca Sun *, Yunng Ye, Weru Deng, Zhexue

More information

Pose Invariant Face Recognition using Hybrid DWT-DCT Frequency Features with Support Vector Machines

Pose Invariant Face Recognition using Hybrid DWT-DCT Frequency Features with Support Vector Machines Proceedngs of the 4 th Internatonal Conference on 7 th 9 th Noveber 008 Inforaton Technology and Multeda at UNITEN (ICIMU 008), Malaysa Pose Invarant Face Recognton usng Hybrd DWT-DCT Frequency Features

More information

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET 1 BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET TZU-CHENG CHUANG School of Electrcal and Computer Engneerng, Purdue Unversty, West Lafayette, Indana 47907 SAUL B. GELFAND School

More information

Large Margin Nearest Neighbor Classifiers

Large Margin Nearest Neighbor Classifiers Large Margn earest eghbor Classfers Sergo Bereo and Joan Cabestany Departent of Electronc Engneerng, Unverstat Poltècnca de Catalunya (UPC, Gran Captà s/n, C4 buldng, 08034 Barcelona, Span e-al: sbereo@eel.upc.es

More information

BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION

BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION SHI-LIANG SUN, HONG-LEI SHI Department of Computer Scence and Technology, East Chna Normal Unversty 500 Dongchuan Road, Shangha 200241, P. R. Chna E-MAIL: slsun@cs.ecnu.edu.cn,

More information

Performance Analysis of Coiflet Wavelet and Moment Invariant Feature Extraction for CT Image Classification using SVM

Performance Analysis of Coiflet Wavelet and Moment Invariant Feature Extraction for CT Image Classification using SVM Perforance Analyss of Coflet Wavelet and Moent Invarant Feature Extracton for CT Iage Classfcaton usng SVM N. T. Renukadev, Assstant Professor, Dept. of CT-UG, Kongu Engneerng College, Perundura Dr. P.

More information

Multiple Instance Learning via Multiple Kernel Learning *

Multiple Instance Learning via Multiple Kernel Learning * The Nnth nternatonal Syposu on Operatons Research and ts Applcatons (SORA 10) Chengdu-Juzhagou, Chna, August 19 23, 2010 Copyrght 2010 ORSC & APORC, pp. 160 167 ultple nstance Learnng va ultple Kernel

More information

Identifying Key Factors and Developing a New Method for Classifying Imbalanced Sentiment Data

Identifying Key Factors and Developing a New Method for Classifying Imbalanced Sentiment Data Identfyng Key Factors and Developng a New Method for Classfyng Ibalanced Sentent Data Long-Sheng Chen* and Kun-Cheng Sun Abstract Bloggers opnons related to coercal products/servces ght have a sgnfcant

More information

Relevance Feedback Document Retrieval using Non-Relevant Documents

Relevance Feedback Document Retrieval using Non-Relevant Documents Relevance Feedback Document Retreval usng Non-Relevant Documents TAKASHI ONODA, HIROSHI MURATA and SEIJI YAMADA Ths paper reports a new document retreval method usng non-relevant documents. From a large

More information

A Novel System for Document Classification Using Genetic Programming

A Novel System for Document Classification Using Genetic Programming Journal of Advances n Inforaton Technology Vol. 6, No. 4, Noveber 2015 A Novel Syste for Docuent Classfcaton Usng Genetc Prograng Saad M. Darwsh, Adel A. EL-Zoghab, and Doaa B. Ebad Insttute of Graduate

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

Color Image Segmentation Based on Adaptive Local Thresholds

Color Image Segmentation Based on Adaptive Local Thresholds Color Iage Segentaton Based on Adaptve Local Thresholds ETY NAVON, OFE MILLE *, AMI AVEBUCH School of Coputer Scence Tel-Avv Unversty, Tel-Avv, 69978, Israel E-Mal * : llero@post.tau.ac.l Fax nuber: 97-3-916084

More information

Handwritten English Character Recognition Using Logistic Regression and Neural Network

Handwritten English Character Recognition Using Logistic Regression and Neural Network Handwrtten Englsh Character Recognton Usng Logstc Regresson and Neural Network Tapan Kuar Hazra 1, Rajdeep Sarkar 2, Ankt Kuar 3 1 Departent of Inforaton Technology, Insttute of Engneerng and Manageent,

More information

UB at GeoCLEF Department of Geography Abstract

Discriminative Classifiers for Image Recognition (CS 376 Lecture 22, 4/14/2011)

An Efficient Fault-Tolerant Multi-Bus Data Scheduling Algorithm Based on Replication and Deallocation

Optimization Methods: Integer Programming Integer Linear Programming 1. Module 7 Lecture Notes 1. Integer Linear Programming

Survey of Classification Techniques in Data Mining

Comparative Study between different Eigenspace-based Approaches for Face Recognition

A Novel Fuzzy Classifier Using Fuzzy LVQ to Recognize Online Persian Handwriting

A Semantic Model for Video Based Face Recognition

Adaptive Sampling with Optimal Cost for Class-Imbalance Learning

Linear Regression: Testing for Non-linearity

Experiments in Text Categorization Using Term Selection by Distance to Transition Point

Pruning Training Corpus to Speedup Text Classification

Leslie Lamport's Time, Clocks & the Ordering of Events in a Distributed System

Human Face Recognition Using Radial Basis Function Neural Network

An Image Fusion Approach Based on Segmentation Region

An Image Fusion Approach Based on Segmentation Region Rong Wang, L-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu An Image Fuson Approach Based On Segmentaton Regon An Image Fuson Approach Based on Segmentaton Regon Rong Wang, L-Qun Gao, Shu Yang 3, Yu-Hua

More information

Generating Fuzzy Term Sets for Software Project Attributes using Fuzzy C-Means and Real Coded Genetic Algorithms

Generating Fuzzy Term Sets for Software Project Attributes using and Real Coded Genetic Algorithms Generatng Fuzzy Ter Sets for Software Proect Attrbutes usng Fuzzy C-Means C and Real Coded Genetc Algorths Al Idr, Ph.D., ENSIAS, Rabat Alan Abran, Ph.D., ETS, Montreal Azeddne Zah, FST, Fes Internatonal

More information

A system based on a modified version of the FCM algorithm for profiling Web users from access log

A system based on a modified version of the FCM algorithm for profiling Web users from access log A syste based on a odfed verson of the FCM algorth for proflng Web users fro access log Paolo Corsn, Laura De Dosso, Beatrce Lazzern, Francesco Marcellon Dpartento d Ingegnera dell Inforazone va Dotsalv,

More information

Query Clustering Using a Hybrid Query Similarity Measure

Query Clustering Using a Hybrid Query Similarity Measure Query clusterng usng a hybrd query smlarty measure Fu. L., Goh, D.H., & Foo, S. (2004). WSEAS Transacton on Computers, 3(3), 700-705. Query Clusterng Usng a Hybrd Query Smlarty Measure Ln Fu, Don Hoe-Lan

More information

A Fast and Effective Segmentation Algorithm for Undersea Hydrothermal Vent Image

Key-Words: - Under sear Hydrothermal vent image; grey; blue chroma; OTSU; FCM A Fast and Effectve Segentaton Algorth for Undersea Hydrotheral Vent Iage FUYUAN PENG 1 QIAN XIA 1 GUOHUA XU 2 XI YU 1 LIN LUO 1 Electronc Inforaton Engneerng Departent of Huazhong Unversty of Scence and

More information

Relevance Feedback in Content-based 3D Object Retrieval A Comparative Study

Relevance Feedback in Content-based 3D Object Retrieval A Comparative Study 753 Coputer-Aded Desgn and Applcatons 008 CAD Solutons, LLC http://www.cadanda.co Relevance Feedback n Content-based 3D Object Retreval A Coparatve Study Panagots Papadaks,, Ioanns Pratkaks, Theodore Trafals

More information

The Research of Support Vector Machine in Agricultural Data Classification

Edge Detection in Noisy Images Using the Support Vector Machines

Edge Detection in Noisy Images Using the Support Vector Machines Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona

More information

Nighttime Motion Vehicle Detection Based on MILBoost

Nighttime Motion Vehicle Detection Based on MILBoost Sensors & Transducers 204 by IFSA Publshng, S L http://wwwsensorsportalco Nghtte Moton Vehcle Detecton Based on MILBoost Zhu Shao-Png,, 2 Fan Xao-Png Departent of Inforaton Manageent, Hunan Unversty of

More information

Reliable Negative Extracting Based on knn for Learning from Positive and Unlabeled Examples

Reliable Negative Extracting Based on knn for Learning from Positive and Unlabeled Examples 94 JOURNAL OF COMPUTERS, VOL. 4, NO. 1, JANUARY 2009 Relable Negatve Extractng Based on knn for Learnng from Postve and Unlabeled Examples Bangzuo Zhang College of Computer Scence and Technology, Jln Unversty,

More information

Using Ambiguity Measure Feature Selection Algorithm for Support Vector Machine Classifier

Using Ambiguity Measure Feature Selection Algorithm for Support Vector Machine Classifier Usng Ambguty Measure Feature Selecton Algorthm for Support Vector Machne Classfer Saet S.R. Mengle Informaton Retreval Lab Computer Scence Department Illnos Insttute of Technology Chcago, Illnos, U.S.A

More information

Subspace Clustering

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points; Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features

More information

Aircraft Engine Gas Path Fault Diagnosis Based on Fuzzy Inference

Aircraft Engine Gas Path Fault Diagnosis Based on Fuzzy Inference 202 Internatonal Conference on Industral and Intellgent Inforaton (ICIII 202) IPCSIT vol.3 (202) (202) IACSIT Press, Sngapore Arcraft Engne Gas Path Fault Dagnoss Based on Fuzzy Inference Changzheng L,

More information

CSCI 5417 Information Retrieval Systems Jim Martin!

CSCI 5417 Information Retrieval Systems Jim Martin! CSCI 5417 Informaton Retreval Systems Jm Martn! Lecture 11 9/29/2011 Today 9/29 Classfcaton Naïve Bayes classfcaton Ungram LM 1 Where we are... Bascs of ad hoc retreval Indexng Term weghtng/scorng Cosne

More information

A New Scheduling Algorithm for Servers

A New Scheduling Algorithm for Servers A New Schedulng Algorth for Servers Nann Yao, Wenbn Yao, Shaobn Ca, and Jun N College of Coputer Scence and Technology, Harbn Engneerng Unversty, Harbn, Chna {yaonann, yaowenbn, cashaobn, nun}@hrbeu.edu.cn

More information

Local Subspace Classifiers: Linear and Nonlinear Approaches

Local Subspace Classifiers: Linear and Nonlinear Approaches Local Subspace Classfers: Lnear and Nonlnear Approaches Hakan Cevkalp, Meber, IEEE, Dane Larlus, Matths Douze, and Frederc Jure, Meber, IEEE Abstract he -local hyperplane dstance nearest neghbor (HNN algorth

More information

Deep Classification in Large-scale Text Hierarchies

Deep Classification in Large-scale Text Hierarchies Deep Classfcaton n Large-scale Text Herarches Gu-Rong Xue Dkan Xng Qang Yang 2 Yong Yu Dept. of Computer Scence and Engneerng Shangha Jao-Tong Unversty {grxue, dkxng, yyu}@apex.sjtu.edu.cn 2 Hong Kong

More information

Comparing High-Order Boolean Features

Comparing High-Order Boolean Features Brgham Young Unversty BYU cholarsarchve All Faculty Publcatons 2005-07-0 Comparng Hgh-Order Boolean Features Adam Drake adam_drake@yahoo.com Dan A. Ventura ventura@cs.byu.edu Follow ths and addtonal works

More information

Classifier Selection Based on Data Complexity Measures

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

Online Detection and Classification of Moving Objects Using Progressively Improving Detectors

Online Detection and Classification of Moving Objects Using Progressively Improving Detectors Onlne Detecton and Classfcaton of Movng Objects Usng Progressvely Improvng Detectors Omar Javed Saad Al Mubarak Shah Computer Vson Lab School of Computer Scence Unversty of Central Florda Orlando, FL 32816

More information

THE CONDENSED FUZZY K-NEAREST NEIGHBOR RULE BASED ON SAMPLE FUZZY ENTROPY

THE CONDENSED FUZZY K-NEAREST NEIGHBOR RULE BASED ON SAMPLE FUZZY ENTROPY Proceedngs of the 20 Internatonal Conference on Machne Learnng and Cybernetcs, Guln, 0-3 July, 20 THE CONDENSED FUZZY K-NEAREST NEIGHBOR RULE BASED ON SAMPLE FUZZY ENTROPY JUN-HAI ZHAI, NA LI, MENG-YAO

More information

CAN COMPUTERS LEARN FASTER? Seyda Ertekin Computer Science & Engineering The Pennsylvania State University

Predicting Power Grid Component Outage In Response to Extreme Events. S. BAHRAMIRAD ComEd USA

Predicting Power Grid Component Outage In Response to Extreme Events. S. BAHRAMIRAD ComEd USA 1, rue d Artos, F-75008 PARIS CIGRE US Natonal Cottee http : //www.cgre.org 016 Grd of the Future Syposu Predctng Power Grd Coponent Outage In Response to Extree Events R. ESKANDARPOUR, A. KHODAEI Unversty

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

Multimodal Biometric System Using Face-Iris Fusion Feature

Multimodal Biometric System Using Face-Iris Fusion Feature JOURNAL OF COMPUERS, VOL. 6, NO. 5, MAY 2011 931 Multodal Boetrc Syste Usng Face-Irs Fuson Feature Zhfang Wang, Erfu Wang, Shuangshuang Wang and Qun Dng Key Laboratory of Electroncs Engneerng, College

More information

Investigating the Performance of Naïve-Bayes Classifiers and K-Nearest Neighbor Classifiers

Investigating the Performance of Naïve- Bayes Classifiers and K- Nearest Neighbor Classifiers Journal of Convergence Informaton Technology Volume 5, Number 2, Aprl 2010 Investgatng the Performance of Naïve- Bayes Classfers and K- Nearest Neghbor Classfers Mohammed J. Islam *, Q. M. Jonathan Wu,

More information

Performance Evaluation of Information Retrieval Systems

Performance Evaluation of Information Retrieval Systems Why System Evaluaton? Performance Evaluaton of Informaton Retreval Systems Many sldes n ths secton are adapted from Prof. Joydeep Ghosh (UT ECE) who n turn adapted them from Prof. Dk Lee (Unv. of Scence

More information

An Optimal Algorithm for Prufer Codes

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

Monte Carlo Evaluation of Classification Algorithms Based on Fisher's Linear Function in Classification of Patients With CHD

Monte Carlo Evaluation of Classification Algorithms Based on Fisher's Linear Function in Classification of Patients With CHD IOSR Journal of Matheatcs (IOSR-JM) e-issn: 2278-5728, p-issn: 2319-765X. Volue 13, Issue 1 Ver. IV (Jan. - Feb. 2017), PP 104-109 www.osrjournals.org Monte Carlo Evaluaton of Classfcaton Algorths Based

More information

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION

Efficient Text Classification by Weighted Proximal SVM

Efficient Text Classification by Weighted Proximal SVM * Effcent ext Classfcaton by Weghted Proxmal SVM * Dong Zhuang 1, Benyu Zhang, Qang Yang 3, Jun Yan 4, Zheng Chen, Yng Chen 1 1 Computer Scence and Engneerng, Bejng Insttute of echnology, Bejng 100081, Chna

More information

A New Approach For the Ranking of Fuzzy Sets With Different Heights

A New Approach For the Ranking of Fuzzy Sets With Different Heights New pproach For the ankng of Fuzzy Sets Wth Dfferent Heghts Pushpnder Sngh School of Mathematcs Computer pplcatons Thapar Unversty, Patala-7 00 Inda pushpndersnl@gmalcom STCT ankng of fuzzy sets plays

More information

Machine Learning. Support Vector Machines. (contains material adapted from talks by Constantin F. Aliferis & Ioannis Tsamardinos, and Martin Law)

Machine Learning. Support Vector Machines. (contains material adapted from talks by Constantin F. Aliferis & Ioannis Tsamardinos, and Martin Law) Machne Learnng Support Vector Machnes (contans materal adapted from talks by Constantn F. Alfers & Ioanns Tsamardnos, and Martn Law) Bryan Pardo, Machne Learnng: EECS 349 Fall 2014 Support Vector Machnes

More information

Feature Reduction and Selection

Feature Reduction and Selection Feature Reducton and Selecton Dr. Shuang LIANG School of Software Engneerng TongJ Unversty Fall, 2012 Today s Topcs Introducton Problems of Dmensonalty Feature Reducton Statstc methods Prncpal Components

More information

MINING VERY LARGE DATASETS WITH SVM AND VISUALIZATION

Prediction of Dumping a Product in Textile Industry

Prediction of Dumping a Product in Textile Industry Int. J. Advanced Networkng and Applcatons Volue: 05 Issue: 03 Pages:957-96 (03) IN : 0975-090 957 Predcton of upng a Product n Textle Industry.V.. GANGA EVI Professor n MCA K..R.M. College of Engneerng

More information

STATIC MAPPING FOR OPENCL WORKLOADS IN HETEROGENEOUS COMPUTER SYSTEMS

A Robust Descriptor based on Weber's Law

A Robust Descriptor based on Weber s Law A Robust Descrptor based on Weber s Law Je Chen,2,3 Shguang Shan Guoyng Zhao 2 Xln Chen Wen Gao,3 Matt Petkänen 2 Key Laboratory of Intellgent Inforaton Processng, Chnese Acadey of Scences (CAS), Insttute

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

Incremental Learning with Support Vector Machines and Fuzzy Set Theory

Incremental Learning with Support Vector Machines and Fuzzy Set Theory The 25th Workshop on Combnatoral Mathematcs and Computaton Theory Incremental Learnng wth Support Vector Machnes and Fuzzy Set Theory Yu-Mng Chuang 1 and Cha-Hwa Ln 2* 1 Department of Computer Scence and

More information

Determining Fuzzy Sets for Quantitative Attributes in Data Mining Problems

Determining Fuzzy Sets for Quantitative Attributes in Data Mining Problems Determnng Fuzzy Sets for Quanttatve Attrbutes n Data Mnng Problems ATTILA GYENESEI Turku Centre for Computer Scence (TUCS) Unversty of Turku, Department of Computer Scence Lemmnkäsenkatu 4A, FIN-5 Turku

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

Specialized Weighted Majority Statistical Techniques in Robotics (Fall 2009)

Specialized Weighted Majority Statistical Techniques in Robotics (Fall 2009) Statstcal Technques n Robotcs (Fall 09) Keywords: classfer ensemblng, onlne learnng, expert combnaton, machne learnng Javer Hernandez Alberto Rodrguez Tomas Smon javerhe@andrew.cmu.edu albertor@andrew.cmu.edu

More information

A Novel Term_Class Relevance Measure for Text Categorization

Clustering of Words Based on Relative Contribution for Text Categorization

Clustering of Words Based on Relative Contribution for Text Categorization Clusterng of Words Based on Relatve Contrbuton for Text Categorzaton Je-Mng Yang, Zh-Yng Lu, Zhao-Yang Qu Abstract Term clusterng tres to group words based on the smlarty crteron between words, so that

More information

An Anti-Noise Text Categorization Method based on Support Vector Machines

An Anti-Noise Text Categorization Method based on Support Vector Machines * An Ant-Nose Text ategorzaton Method based on Support Vector Machnes * hen Ln, Huang Je and Gong Zheng-Hu School of omputer Scence, Natonal Unversty of Defense Technology, hangsha, 410073, hna chenln@nudt.edu.cn,

More information

An Evaluation of Divide-and-Combine Strategies for Image Categorization by Multi-Class Support Vector Machines

An Evaluation of Divide-and-Combine Strategies for Image Categorization by Multi-Class Support Vector Machines An Evaluaton of Dvde-and-Combne Strateges for Image Categorzaton by Mult-Class Support Vector Machnes C. Demrkesen¹ and H. Cherf¹, ² 1: Insttue of Scence and Engneerng 2: Faculté des Scences Mrande Galatasaray

More information

A Selective Ensemble Classification Method on Microarray Data (Journal of Chemical and Pharmaceutical Research, 2014, 6(6))

Journal of Chemical and Pharmaceutical Research, 2014, 6(6): Research Article. A selective ensemble classification method on microarray data Avalable onlne www.ocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(6):2860-2866 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 A selectve ensemble classfcaton method on mcroarray

More information

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr) Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute

More information

Classification / Regression Support Vector Machines

Classification / Regression Support Vector Machines Classfcaton / Regresson Support Vector Machnes Jeff Howbert Introducton to Machne Learnng Wnter 04 Topcs SVM classfers for lnearly separable classes SVM classfers for non-lnearly separable classes SVM

More information

Nearest-Neighbors & Support Vector Machines (Introduction to Artificial Intelligence, Lecture 24)

12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification Introducton to Artfcal Intellgence V22.0472-001 Fall 2009 Lecture 24: Nearest-Neghbors & Support Vector Machnes Rob Fergus Dept of Computer Scence, Courant Insttute, NYU Sldes from Danel Yeung, John DeNero

More information

Sparse Kernel-Based Hyperspectral Anomaly Detection

ENSEMBLE learning has been widely used in data and IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 9, NO. 5, SEPTEMBER 2012 943 Sparse Kernel-Based Hyperspectral Anoaly Detecton Prudhv Gurra, Meber, IEEE, Heesung Kwon, Senor Meber, IEEE, andtothyhan Abstract

More information

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems A Unfed Framework for Semantcs and Feature Based Relevance Feedback n Image Retreval Systems Ye Lu *, Chunhu Hu 2, Xngquan Zhu 3*, HongJang Zhang 2, Qang Yang * School of Computng Scence Smon Fraser Unversty

More information

A Modified Median Filter for the Removal of Impulse Noise Based on the Support Vector Machines

A Modified Median Filter for the Removal of Impulse Noise Based on the Support Vector Machines A Modfed Medan Flter for the Removal of Impulse Nose Based on the Support Vector Machnes H. GOMEZ-MORENO, S. MALDONADO-BASCON, F. LOPEZ-FERRERAS, M. UTRILLA- MANSO AND P. GIL-JIMENEZ Departamento de Teoría

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

Face Recognition Based on SVM and 2DPCA

Face Recognition Based on SVM and 2DPCA Vol. 4, o. 3, September, 2011 Face Recognton Based on SVM and 2DPCA Tha Hoang Le, Len Bu Faculty of Informaton Technology, HCMC Unversty of Scence Faculty of Informaton Scences and Engneerng, Unversty

More information

Extraction of User Preferences from a Few Positive Documents

Extraction of User Preferences from a Few Positive Documents Extracton of User Preferences from a Few Postve Documents Byeong Man Km, Qng L Dept. of Computer Scences Kumoh Natonal Insttute of Technology Kum, kyungpook, 730-70,South Korea (Bmkm, lqng)@se.kumoh.ac.kr

More information

User Behavior Recognition based on Clustering for the Smart Home

User Behavior Recognition based on Clustering for the Smart Home 3rd WSEAS Internatonal Conference on REMOTE SENSING, Vence, Italy, Noveber 2-23, 2007 52 User Behavor Recognton based on Clusterng for the Sart Hoe WOOYONG CHUNG, JAEHUN LEE, SUKHYUN YUN, SOOHAN KIM* AND

More information

Optimizing Document Scoring for Query Retrieval

Optimizing Document Scoring for Query Retrieval Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng

More information

Joint Registration and Active Contour Segmentation for Object Tracking

Joint Registration and Active Contour Segmentation for Object Tracking Jont Regstraton and Actve Contour Segentaton for Object Trackng Jfeng Nng a,b, Le Zhang b,1, Meber, IEEE, Davd Zhang b, Fellow, IEEE and We Yu a a College of Inforaton Engneerng, Northwest A&F Unversty,

More information

X-Chart Using ANOM Approach

X- Chart Using ANOM Approach ISSN 1684-8403 Journal of Statstcs Volume 17, 010, pp. 3-3 Abstract X- Chart Usng ANOM Approach Gullapall Chakravarth 1 and Chaluvad Venkateswara Rao Control lmts for ndvdual measurements (X) chart are

More information

A MODIFIED K-NEAREST NEIGHBOR CLASSIFIER TO DEAL WITH UNBALANCED CLASSES

Spam Filtering Based on Support Vector Machines with Taguchi Method for Parameter Selection

Spam Filtering Based on Support Vector Machines with Taguchi Method for Parameter Selection E-mal Spam Flterng Based on Support Vector Machnes wth Taguch Method for Parameter Selecton We-Chh Hsu, Tsan-Yng Yu E-mal Spam Flterng Based on Support Vector Machnes wth Taguch Method for Parameter Selecton

More information

Web Document Classification Based on Fuzzy Association

Web Document Classification Based on Fuzzy Association Web Document Classfcaton Based on Fuzzy Assocaton Choochart Haruechayasa, Me-Lng Shyu Department of Electrcal and Computer Engneerng Unversty of Mam Coral Gables, FL 33124, USA charuech@mam.edu, shyu@mam.edu

More information