An Improvement to Naive Bayes for Text Classification
Procedia Engineering 15 (2011)
Advanced in Control Engineering and Information Science

Wei Zhang, Feng Gao
MOE KLINNS Lab, Xi'an Jiaotong University, Xi'an, Shaanxi Province, China
* Corresponding author. E-mail address: wzhang@sei.xjtu.edu.cn

Published by Elsevier Ltd. Open access under CC BY-NC-ND license. Selection and/or peer-review under responsibility of [CEIS 2011]

Abstract

Naïve Bayes classifiers, which are widely used for text classification in machine learning, are based on the conditional probability of features belonging to a class, where the features are chosen by feature selection methods. In this paper, an auxiliary feature method is proposed. It determines features by an existing feature selection method, and selects an auxiliary feature which can reclassify the text space aimed at the chosen features. Then the corresponding conditional probability is adjusted in order to improve classification accuracy. Illustrative examples show that the proposed method indeed improves the performance of the naïve Bayes classifier.

Keywords: Text classification; Feature selection; Machine learning; Naïve Bayes.

1. Introduction

Naïve Bayes is based on Bayes' theorem and an attribute independence assumption [1],[2]. Its competitive performance in classification is surprising, because the conditional independence assumption on which it is based is rarely true in real-world applications. Naïve Bayes has been studied extensively for text classification tasks [3],[4]. The existing literature on text classification with naïve Bayes has focused on three aspects. One is to construct an improved naïve Bayes model [5],[6],[7]. Another is to discuss whether the naïve hypothesis holds, analyze its effect on classification performance, and then present the corresponding improvement [8]. The third is to improve feature selection, because naïve Bayes is highly sensitive to feature selection
[9],[10],[11],[12]. In this literature, the naïve Bayes classifiers are based on the conditional probability of the features belonging to one class, after feature selection using the existing methods.

An auxiliary feature method is proposed in this paper. It determines features by an existing feature selection method, and selects an auxiliary feature which can reclassify the class space aimed at the chosen features. Then the corresponding conditional probability is adjusted in order to improve classification accuracy. Concretely, after feature selection, the features that have an auxiliary feature are found and their probabilities are adjusted. The experiments with data sets obtained from CCERT show that the auxiliary feature method improves the classification accuracy of naïve Bayes because of the reclassification of the class space.

This paper is structured as follows. Section 2 introduces the naïve Bayes model. In Section 3 we present our approach for enhancing naïve Bayes by using auxiliary features to adjust probabilities. Section 4 contains experimental results demonstrating that the predictive accuracy of naïve Bayes can be improved by the auxiliary feature method. Section 5 discusses related work and future work.

2. Naïve Bayes Model in Text Classification

Denote by the vector of variables $D = (d_i),\ i = 1, 2, \ldots, n$ a document, where $d_i$ corresponds to a letter, a word, or another attribute of some real text, and let $C = \{c_1, c_2, \ldots, c_k\}$ be the set of predefined classes. Text classification assigns a class label $c_j,\ j = 1, 2, \ldots, k$ from $C$ to a document. The Bayes classifier is in essence a hybrid parameter probability model:

$$P(c_j \mid D) = \frac{P(c_j)\, P(D \mid c_j)}{P(D)} \qquad (1)$$

where $P(c_j)$ is the prior information on the probability of class $c_j$ appearing, $P(D)$ is the information from observations, that is, the knowledge from the text to be classified itself, and $P(D \mid c_j)$ is the distribution probability of document $D$ in the class space.
The Bayes classifier integrates this information, computes the posterior of document $D$ falling into each class $c_j$ separately, and assigns the document to the class with the highest probability, that is

$$c^{*}(D) = \arg\max_{c_j} P(c_j \mid D) \qquad (2)$$

Since the conditional probability $P(D \mid c_j)$ cannot be computed directly in practice, assume that the components of $D$ are independent of each other given the class. Thus

$$P(D \mid c_j) = \prod_{i=1}^{n} P(d_i \mid c_j) \qquad (3)$$

The model with the above assumption is called the naïve Bayes model, and equation (1) becomes

$$P(c_j \mid D) = \frac{P(c_j) \prod_{i=1}^{n} P(d_i \mid c_j)}{P(D)} \qquad (4)$$

Because the sample information $P(D)$ is identical for each class $c_j,\ j = 1, 2, \ldots, k$, equation (2) becomes

$$c^{*}(D) = \arg\max_{c_j} P(c_j) \prod_{i=1}^{n} P(d_i \mid c_j) \qquad (5)$$
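The decision rule in equation (5) can be illustrated with a minimal Python sketch. It assumes binary (present/absent) features, which matches the Bernoulli model the paper adopts; the function names `train_bernoulli_nb` and `classify`, the add-alpha smoothing, and the use of log-probabilities to avoid numerical underflow are assumptions made for this illustration and are not taken from the paper.

```python
import math

def train_bernoulli_nb(docs, labels, n_features, alpha=1.0):
    """Estimate P(c) and P(d_i = 1 | c) from training data.

    docs   : list of sets, each holding the indices of features present
    labels : list of class labels, one per document
    alpha  : add-alpha smoothing constant (an assumption, not from the paper)
    """
    prior, cond = {}, {}
    for c in sorted(set(labels)):
        in_class = [d for d, y in zip(docs, labels) if y == c]
        prior[c] = len(in_class) / len(docs)
        # Smoothed fraction of class-c documents in which feature i is present.
        cond[c] = [
            (sum(1 for d in in_class if i in d) + alpha) / (len(in_class) + 2 * alpha)
            for i in range(n_features)
        ]
    return prior, cond

def classify(doc, prior, cond):
    """Return arg max_c P(c) * prod_i P(d_i | c), evaluated in log space."""
    best, best_score = None, -math.inf
    for c in prior:
        score = math.log(prior[c])
        for i, p in enumerate(cond[c]):
            # Present features contribute P(d_i=1|c), absent ones 1 - P(d_i=1|c).
            score += math.log(p if i in doc else 1.0 - p)
        if score > best_score:
            best, best_score = c, score
    return best

# Toy usage: feature 0 is typical of class "+", feature 2 of class "-".
docs = [{0}, {0, 1}, {2}, {1, 2}]
labels = ["+", "+", "-", "-"]
prior, cond = train_bernoulli_nb(docs, labels, n_features=3)
print(classify({0}, prior, cond))  # +
print(classify({2}, prior, cond))  # -
```

Dropping $P(D)$, as equation (5) does, is what allows the loop to compare unnormalized scores across classes.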
For simplicity, this paper deals with two-class classification problems, and denotes the class set $C = \{+, -\}$. We choose the multi-variate Bernoulli model as our Bayes model [13]: if the feature corresponding to $d_i$ is present in the document, $d_i$ is assigned the value 1; otherwise it is assigned the value 0.

3. Auxiliary Feature Method

There are many feature selection methods in text classification, such as DF (Document Frequency), IG (Information Gain), MI (Mutual Information), OR (Odds Ratio), $\chi^2$ (CHI Squared Statistic), ECE (Expected Cross Entropy), and WET (the Weight of Evidence for Text) [14],[15],[16],[17], and these methods have been studied extensively. This paper does not evaluate their pros and cons. Instead, after feature selection using an existing algorithm, we adjust the probability of each feature that has the auxiliary property in order to improve the performance of the original naïve Bayes classifier.

Denote the feature corresponding to the component $d_i$ as $w_i$, and for simplicity denote $p(w_i \mid +)$ and $p(w_i \mid -)$ as $p_{i+}$ and $p_{i-}$ respectively. For every independent feature $w_i$, if $p_{i+} > p_{i-}$, the Bayes classifier will assign the document to class $+$. The idea here is to find an auxiliary feature $w'_i$ for feature $w_i$ which satisfies $p'_{i-} - p'_{i+} > p_{i+} - p_{i-}$, where $p'_{i+}$ and $p'_{i-}$ are the probabilities of a document belonging to class $+$ and $-$ respectively when feature $w_i$ and auxiliary feature $w'_i$ are present simultaneously in the document. The geometric illustration of the method is as follows:

Fig. 1. The geometric illustration of the auxiliary feature method. (a) The text space division by the feature $w_i$. (b) The text space division by the feature $w_i$ and the auxiliary feature $w'_i$. (c) The misclassification induced by the feature $w_i$. (d) The misclassification induced by the feature $w_i$ and the auxiliary feature $w'_i$.
In Figure 1, (a) denotes all the documents that include the feature $w_i$, which form a subspace of the whole text space, and the diagonal separates the different classes. The shaded part of (b) denotes the text subspace that simultaneously includes the auxiliary feature $w'_i$. The shaded part of (c) denotes the classification error of the original naïve Bayes method. The shaded part of (d) denotes the classification error of the auxiliary feature method. The constraint $p'_{i-} - p'_{i+} > p_{i+} - p_{i-}$ used when finding the auxiliary feature guarantees that the top part of the shaded region is smaller than the blank portion below it in (d).

Algorithm of auxiliary feature:
Step 1: perform feature selection to determine the features $w_i$, with a text vector of dimension $n$.
Step 2: find the set $\Theta_i$ composed of all documents in which $w_i$ is present.
Step 3: find the auxiliary feature $w'_i$ for $d_i$ in $\Theta_i$.
Step 4: repeat Step 2 and Step 3 for $i = 1, 2, \ldots, n$.
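Steps 2 to 4 can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name `find_auxiliary_features` is hypothetical, the probabilities are estimated as plain class fractions inside each document subspace (the paper does not say how they are estimated), only the case $p_{i+} > p_{i-}$ described in the text is handled, and when several candidates satisfy the constraint $p'_{i-} - p'_{i+} > p_{i+} - p_{i-}$ the one with the largest margin is kept.

```python
def find_auxiliary_features(docs, labels, selected):
    """Search an auxiliary feature for each already-selected feature.

    docs     : list of sets of feature identifiers present in each document
    labels   : list of "+" / "-" class labels, one per document
    selected : features chosen beforehand by any feature selection method (Step 1)
    Returns {feature: (auxiliary_feature, p_aux_pos, p_aux_neg)}.
    """
    aux = {}
    for w in selected:  # Step 4: repeat for every selected feature
        # Step 2: Theta = all documents in which w is present.
        theta = [(d, y) for d, y in zip(docs, labels) if w in d]
        if not theta:
            continue
        # Assumption: estimate p+ and p- as class fractions within Theta.
        p_pos = sum(1 for _, y in theta if y == "+") / len(theta)
        p_neg = 1.0 - p_pos
        if p_pos <= p_neg:
            continue  # only the case p+ > p- from the text is handled here
        # Step 3: every other feature co-occurring with w is a candidate.
        candidates = set().union(*(d for d, _ in theta)) - {w}
        best, best_margin = None, 0.0
        for v in sorted(candidates):  # sorted for a deterministic scan order
            sub = [y for d, y in theta if v in d]
            q_pos = sum(1 for y in sub if y == "+") / len(sub)
            q_neg = 1.0 - q_pos
            # Constraint: (p'- - p'+) - (p+ - p-) must be strictly positive.
            margin = (q_neg - q_pos) - (p_pos - p_neg)
            if margin > best_margin:
                best, best_margin = (v, q_pos, q_neg), margin
        if best is not None:
            aux[w] = best
    return aux

# Toy usage: "a" mostly indicates "+", but "a" together with "b" indicates "-".
docs = [{"a"}, {"a"}, {"a", "c"}, {"a", "b"}, {"a", "b"}, {"b"}]
labels = ["+", "+", "+", "-", "-", "+"]
print(find_auxiliary_features(docs, labels, ["a", "b"]))
# {'a': ('b', 0.0, 1.0)}
```

Not every selected feature ends up with an entry in the returned dictionary, which is consistent with the paper's observation that only some features have an auxiliary one.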
To classify a new document with a feature vector $d_1, d_2, \ldots, d_n$, suppose the auxiliary feature $w'_i$ of the feature $w_i$ corresponding to the component $d_i$ exists. When the auxiliary feature and the original feature are both present in the document, set $p_{i+} = p'_{i+}$ and $p_{i-} = p'_{i-}$; otherwise proceed as in naïve Bayes. Note that not all features have a corresponding auxiliary feature. If a feature has more than one auxiliary feature, we choose the auxiliary feature that maximizes $(p'_{i-} - p'_{i+}) - (p_{i+} - p_{i-})$.

4. Experiments

The data sets used here are two mail sets, with junk mails and normal mails, from CCERT [18]. We choose junk mails and normal mails as the data sets for our experiment. There are three cases:
(a) choose 1000 features to represent the document, and it turns out that 96 features have an auxiliary feature;
(b) choose 1500 features to represent the document, and it turns out that 152 features have an auxiliary feature;
(c) choose 2000 features to represent the document, and it turns out that 217 features have an auxiliary feature.
For each of the three cases we conduct 10-fold cross-validation 5 times, using the naïve Bayes method and the proposed method respectively. The average results are as follows.

Table 1. The classification precision of the auxiliary feature method vs. naïve Bayes (columns: Naïve Bayes, Auxiliary Feature Method; rows: 1000 features, 1500 features, 2000 features).

5. Conclusion

After feature selection in text classification, the naïve Bayes classifier partitions, for each feature $w_i$, the text subspace composed of all documents in which $d_i$ is present. Because the naïve Bayes classifier assigns the document to the class with the highest probability, it is optimal in the probability sense. The auxiliary feature method proposed here partitions the text subspace again, so it outperforms the traditional approach, and illustrative examples show that the proposed method indeed improves the performance of the naïve Bayes classifier. Since the auxiliary feature method needs to choose features twice, finding a way to obtain the auxiliary feature directly would be meaningful and could reduce the computational complexity.
Meanwhile, the relationship between existing feature selection methods and our method is a promising direction. In addition, due to the sparsity problem in text classification, it is worth investigating whether the total number of features in a document should be taken into account when adjusting the probability, instead of the substitution used in this paper.

Acknowledgements

The research is supported in part by the National Natural Science Foundation, National Science Fund for Distinguished Young Scholars, Key
Projects in the National Science & Technology Pillar Program (2011BAK08B02), 863 High Tech Development Plan (2007AA01Z480, 2008AA01Z415).

References

[1] Duda, R.O. and Hart, P.E.: Pattern Classification and Scene Analysis. New York: John Wiley
[2] Lewis, D.D.: Naive (Bayes) at forty: The independence assumption in information retrieval. Machine Learning: ECML-98, Tenth European Conference on Machine Learning
[3] D.D. Lewis, Representation and Learning in Information Retrieval, PhD dissertation, Dept. of Computer Science, Univ. of Massachusetts, Amherst
[4] A.K. McCallum and K. Nigam, Employing EM in Pool-Based Active Learning for Text Classification, Proc. ICML-98, 15th Int'l Conf. Machine Learning, J.W. Shavlik, ed.
[5] McCallum, A., Nigam, K.: A comparison of event models for Naive Bayes text classification. In: Learning for Text Categorization: Papers from the AAAI Workshop, AAAI Press Technical Report WS
[6] Eyheramendy, S., Lewis, D.D., Madigan, D.: On the Naive Bayes model for text categorization. In Bishop, C.M., Frey, B.J., eds.: AI & Statistics 2003: Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics
[7] D. Pavlov, R. Balasubramanyan, B. Dom, S. Kapur, and J. Parikh. Document preprocessing for naïve Bayes classification and clustering with mixture of multinomials. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2004)
[8] Eyheramendy, S., Lewis, D.D., Madigan, D.: On the Naive Bayes model for text categorization. In Bishop, C.M., Frey, B.J., eds.: AI & Statistics 2003: Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics. 2003
[9] Y. Yang and J. O. Pedersen. A comparative study on feature selection in text categorization. In International Conference on Machine Learning
[10] Rogati, M.; Yang, Y. High-performing feature selection for text classification. CIKM '02, 2002
[11] J. Chen, H. Huang, S. Tian, and Y.
Qu, "Feature selection for text classification with Naïve Bayes," Expert Systems with Applications, vol. 36, no. 3, April
[12] M. Srinivas, K. P. Supreethi, E. V. Prasad, and S. A. Kumari, "Efficient Text Classification Using Best Feature Selection and Combination of Methods," in Proceedings of the Symposium on Human Interface 2009 on Conference Universal Access in Human-Computer Interaction. Part I: Held as Part of HCI International. Springer-Verlag, 2009
[13] McCallum, A., Nigam, K.: A comparison of event models for Naive Bayes text classification. In: Learning for Text Categorization: Papers from the AAAI Workshop, AAAI Press (1998) Technical Report WS
[14] D. Koller and M. Sahami. Toward optimal feature selection. In International Conference on Machine Learning
[15] Y. Yang and J. O. Pedersen. A comparative study on feature selection in text categorization. In International Conference on Machine Learning
[16] S. Das. Filters, wrappers and a boosting-based hybrid for feature selection. In International Conference on Machine Learning
[17] E. P. Xing, M. I. Jordan, and R. M. Karp. Feature selection for high-dimensional genomic microarray data. In Proc. 18th International Conf. on Machine Learning. Morgan Kaufmann, San Francisco, CA
[18]
More informationAudio Content Classification Method Research Based on Two-step Strategy
(IJACSA) Internatonal Journal of Advanced Computer Scence and Applcatons, Audo Content Classfcaton Method Research Based on Two-step Strategy Sume Lang Department of Computer Scence and Technology Chongqng
More informationUsing Fuzzy Logic to Enhance the Large Size Remote Sensing Images
Internatonal Journal of Informaton and Electroncs Engneerng Vol. 5 No. 6 November 015 Usng Fuzzy Logc to Enhance the Large Sze Remote Sensng Images Trung Nguyen Tu Huy Ngo Hoang and Thoa Vu Van Abstract
More informationSI485i : NLP. Set 5 Using Naïve Bayes
SI485 : NL Set 5 Usng Naïve Baes Motvaton We want to predct somethng. We have some text related to ths somethng. somethng = target label text = text features Gven, what s the most probable? Motvaton: Author
More informationPruning Training Corpus to Speedup Text Classification 1
Prunng Tranng Corpus to Speedup Text Classfcaton Jhong Guan and Shugeng Zhou School of Computer Scence, Wuhan Unversty, Wuhan, 430079, Chna hguan@wtusm.edu.cn State Key Lab of Software Engneerng, Wuhan
More informationIncremental Learning with Support Vector Machines and Fuzzy Set Theory
The 25th Workshop on Combnatoral Mathematcs and Computaton Theory Incremental Learnng wth Support Vector Machnes and Fuzzy Set Theory Yu-Mng Chuang 1 and Cha-Hwa Ln 2* 1 Department of Computer Scence and
More informationClustering Algorithm Combining CPSO with K-Means Chunqin Gu 1, a, Qian Tao 2, b
Internatonal Conference on Advances n Mechancal Engneerng and Industral Informatcs (AMEII 05) Clusterng Algorthm Combnng CPSO wth K-Means Chunqn Gu, a, Qan Tao, b Department of Informaton Scence, Zhongka
More informationRecommended Items Rating Prediction based on RBF Neural Network Optimized by PSO Algorithm
Recommended Items Ratng Predcton based on RBF Neural Network Optmzed by PSO Algorthm Chengfang Tan, Cayn Wang, Yuln L and Xx Q Abstract In order to mtgate the data sparsty and cold-start problems of recommendaton
More informationMaximum Variance Combined with Adaptive Genetic Algorithm for Infrared Image Segmentation
Internatonal Conference on Logstcs Engneerng, Management and Computer Scence (LEMCS 5) Maxmum Varance Combned wth Adaptve Genetc Algorthm for Infrared Image Segmentaton Huxuan Fu College of Automaton Harbn
More informationBIN XIA et al: AN IMPROVED K-MEANS ALGORITHM BASED ON CLOUD PLATFORM FOR DATA MINING
An Improved K-means Algorthm based on Cloud Platform for Data Mnng Bn Xa *, Yan Lu 2. School of nformaton and management scence, Henan Agrcultural Unversty, Zhengzhou, Henan 450002, P.R. Chna 2. College
More informationFace Recognition Based on SVM and 2DPCA
Vol. 4, o. 3, September, 2011 Face Recognton Based on SVM and 2DPCA Tha Hoang Le, Len Bu Faculty of Informaton Technology, HCMC Unversty of Scence Faculty of Informaton Scences and Engneerng, Unversty
More information6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour
6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the
More informationSpam Filtering Based on Support Vector Machines with Taguchi Method for Parameter Selection
E-mal Spam Flterng Based on Support Vector Machnes wth Taguch Method for Parameter Selecton We-Chh Hsu, Tsan-Yng Yu E-mal Spam Flterng Based on Support Vector Machnes wth Taguch Method for Parameter Selecton
More informationUser Authentication Based On Behavioral Mouse Dynamics Biometrics
User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA
More informationMULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION
MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and
More informationFINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK
FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK L-qng Qu, Yong-quan Lang 2, Jng-Chen 3, 2 College of Informaton Scence and Technology, Shandong Unversty of Scence and Technology,
More informationAnalyzing Popular Clustering Algorithms from Different Viewpoints
1000-9825/2002/13(08)1382-13 2002 Journal of Software Vol.13, No.8 Analyzng Popular Clusterng Algorthms from Dfferent Vewponts QIAN We-nng, ZHOU Ao-yng (Department of Computer Scence, Fudan Unversty, Shangha
More informationFeature Selection as an Improving Step for Decision Tree Construction
2009 Internatonal Conference on Machne Learnng and Computng IPCSIT vol.3 (2011) (2011) IACSIT Press, Sngapore Feature Selecton as an Improvng Step for Decson Tree Constructon Mahd Esmael 1, Fazekas Gabor
More informationOnline Detection and Classification of Moving Objects Using Progressively Improving Detectors
Onlne Detecton and Classfcaton of Movng Objects Usng Progressvely Improvng Detectors Omar Javed Saad Al Mubarak Shah Computer Vson Lab School of Computer Scence Unversty of Central Florda Orlando, FL 32816
More informationCAN COMPUTERS LEARN FASTER? Seyda Ertekin Computer Science & Engineering The Pennsylvania State University
CAN COMPUTERS LEARN FASTER? Seyda Ertekn Computer Scence & Engneerng The Pennsylvana State Unversty sertekn@cse.psu.edu ABSTRACT Ever snce computers were nvented, manknd wondered whether they mght be made
More informationData Mining: Model Evaluation
Data Mnng: Model Evaluaton Aprl 16, 2013 1 Issues: Evaluatng Classfcaton Methods Accurac classfer accurac: predctng class label predctor accurac: guessng value of predcted attrbutes Speed tme to construct
More informationReal-time Fault-tolerant Scheduling Algorithm for Distributed Computing Systems
Real-tme Fault-tolerant Schedulng Algorthm for Dstrbuted Computng Systems Yun Lng, Y Ouyang College of Computer Scence and Informaton Engneerng Zheang Gongshang Unversty Postal code: 310018 P.R.CHINA {ylng,
More informationModular PCA Face Recognition Based on Weighted Average
odern Appled Scence odular PCA Face Recognton Based on Weghted Average Chengmao Han (Correspondng author) Department of athematcs, Lny Normal Unversty Lny 76005, Chna E-mal: hanchengmao@163.com Abstract
More informationRelevance Feedback Document Retrieval using Non-Relevant Documents
Relevance Feedback Document Retreval usng Non-Relevant Documents TAKASHI ONODA, HIROSHI MURATA and SEIJI YAMADA Ths paper reports a new document retreval method usng non-relevant documents. From a large
More informationHybridization of Expectation-Maximization and K-Means Algorithms for Better Clustering Performance
BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 16, No 2 Sofa 2016 Prnt ISSN: 1311-9702; Onlne ISSN: 1314-4081 DOI: 10.1515/cat-2016-0017 Hybrdzaton of Expectaton-Maxmzaton
More informationClustering of Words Based on Relative Contribution for Text Categorization
Clusterng of Words Based on Relatve Contrbuton for Text Categorzaton Je-Mng Yang, Zh-Yng Lu, Zhao-Yang Qu Abstract Term clusterng tres to group words based on the smlarty crteron between words, so that
More informationQuery Clustering Using a Hybrid Query Similarity Measure
Query clusterng usng a hybrd query smlarty measure Fu. L., Goh, D.H., & Foo, S. (2004). WSEAS Transacton on Computers, 3(3), 700-705. Query Clusterng Usng a Hybrd Query Smlarty Measure Ln Fu, Don Hoe-Lan
More informationArabic Text Classification Using N-Gram Frequency Statistics A Comparative Study
Arabc Text Classfcaton Usng N-Gram Frequency Statstcs A Comparatve Study Lala Khresat Dept. of Computer Scence, Math and Physcs Farlegh Dcknson Unversty 285 Madson Ave, Madson NJ 07940 Khresat@fdu.edu
More information