Credibility Adjusted Term Frequency: A Supervised Term Weighting Scheme for Sentiment Analysis and Text Classification

Yoon Kim, New York University, yhk255@nyu.edu
Owen Zhang, zhonghua.zhang2006@gmail.com

Proceedings of the 5th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, pages 79-83, Baltimore, Maryland, USA, June 27 2014. (c) 2014 Association for Computational Linguistics

Abstract

We provide a simple but novel supervised weighting scheme for adjusting term frequency in tf-idf for sentiment analysis and text classification. We compare our method to baseline weighting schemes and find that it outperforms them on multiple benchmarks. The method is robust and works well on both snippets and longer documents.

1 Introduction

Baseline discriminative methods for text classification usually involve training a linear classifier over bag-of-words (BoW) representations of documents. In BoW representations (also known as Vector Space Models), a document is represented as a vector where each entry is a count (or binary count) of tokens that occurred in the document. Given that some tokens are more informative than others, a common technique is to apply a weighting scheme to give more weight to discriminative tokens and less weight to non-discriminative ones. Term frequency-inverse document frequency (tf-idf) (Salton and McGill, 1983) is an unsupervised weighting technique that is commonly employed. In tf-idf, each token i in document d is assigned the following weight,

    w_{i,d} = tf_{i,d} \cdot \log \frac{N}{df_i}    (1)

where tf_{i,d} is the number of times token i occurred in document d, N is the number of documents in the corpus, and df_i is the number of documents in which token i occurred.

Many supervised and unsupervised variants of tf-idf exist (Debole and Sebastiani (2003); Martineau and Finin (2009); Wang and Zhang (2013)). The purpose of this paper is not to perform an exhaustive comparison of existing weighting schemes, and hence we do not list them here. Interested readers are directed to Paltoglou and Thelwall (2010) and Deng et al. (2014) for comprehensive reviews of the different schemes.

In the present work, we propose a simple but novel supervised method to adjust the term frequency portion in tf-idf by assigning a credibility adjusted score to each token. We find that it outperforms the traditional unsupervised tf-idf weighting scheme on multiple benchmarks. The benchmarks include both snippets and longer documents. We also compare our method against Wang and Manning (2012)'s Naive-Bayes Support Vector Machine (NBSVM), which has achieved state-of-the-art results (or close to it) on many datasets, and find that it performs competitively against NBSVM. We additionally find that the traditional tf-idf performs competitively against other, more sophisticated methods when used with the right scaling and normalization parameters.

2 The Method

Consider a binary classification task. Let C_{i,k} be the count of token i in class k, with k \in \{-1, 1\}. Denote C_i to be the count of token i over both classes, and y^{(d)} to be the class of document d. For each occurrence of token i in the training set, we calculate the following,

    s_i^{(j)} = \begin{cases} C_{i,1} / C_i, & \text{if } y^{(d)} = 1 \\ C_{i,-1} / C_i, & \text{if } y^{(d)} = -1 \end{cases}    (2)

Here, j is the j-th occurrence of token i. Since there are C_i such occurrences, j indexes from 1 to C_i. We assign a score to token i by,

    \hat{s}_i = \frac{1}{C_i} \sum_{j=1}^{C_i} s_i^{(j)}    (3)

Intuitively, \hat{s}_i is the average likelihood of making the correct classification given token i's occurrence in the document, if i was the only token in the document. In the binary classification case, this reduces to,

    \hat{s}_i = \frac{C_{i,1}^2 + C_{i,-1}^2}{C_i^2}    (4)

Note that by construction, the support of \hat{s}_i is [0.5, 1].
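As a concrete reference point for Equation (1), here is a minimal Python sketch of plain tf-idf weighting over a toy tokenized corpus; the helper name and the toy corpus are ours, not the paper's.

    import math
    from collections import Counter

    def tfidf_weights(docs):
        """Equation (1): w_{i,d} = tf_{i,d} * log(N / df_i) for tokenized documents."""
        N = len(docs)
        df = Counter()                      # df_i: number of documents containing token i
        for doc in docs:
            df.update(set(doc))
        weights = []
        for doc in docs:
            tf = Counter(doc)               # tf_{i,d}: count of token i in document d
            weights.append({t: c * math.log(N / df[t]) for t, c in tf.items()})
        return weights

    docs = [["a", "great", "great", "movie"], ["a", "terrible", "movie"]]
    print(tfidf_weights(docs))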

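The token score in Equations (2)-(4) needs only the per-class counts of each token; a similarly hedged sketch:

    def s_hat(count_pos, count_neg):
        """Equation (4): (C_{i,1}^2 + C_{i,-1}^2) / C_i^2, i.e. the average of the
        per-occurrence scores defined in Equations (2)-(3)."""
        c = count_pos + count_neg           # C_i: total count of token i
        return (count_pos ** 2 + count_neg ** 2) / c ** 2

    # A token seen 3 times in positive and 1 time in negative training documents:
    print(s_hat(3, 1))    # 0.625, inside the [0.5, 1] support noted above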
2.1 Credibility Adjustment

Suppose \hat{s}_i = \hat{s}_j = 0.75 for two different tokens i and j, but C_i = 5 and C_j = 100. Intuition suggests that \hat{s}_j is a more credible score than \hat{s}_i, and that \hat{s}_i should be shrunk towards the population mean. Let \bar{s} be the (weighted) population mean. That is,

    \bar{s} = \sum_i \frac{C_i \, \hat{s}_i}{C}    (5)

where C is the count of all tokens in the corpus. We define the credibility adjusted score for token i to be,

    \tilde{s}_i = \frac{C_{i,1}^2 + C_{i,-1}^2 + \bar{s} \cdot \gamma}{C_i^2 + \gamma}    (6)

where \gamma is an additive smoothing parameter. If the C_{i,k}'s are small, then \tilde{s}_i \approx \bar{s} (otherwise, \tilde{s}_i \approx \hat{s}_i). This is a form of Buhlmann credibility adjustment from the actuarial literature (Buhlmann and Gisler, 2005). We subsequently define \widetilde{tf}, the credibility adjusted term frequency, to be,

    \widetilde{tf}_{i,d} = (0.5 + \tilde{s}_i) \cdot tf_{i,d}    (7)

and tf is replaced with \widetilde{tf}. That is,

    w_{i,d} = \widetilde{tf}_{i,d} \cdot \log \frac{N}{df_i}    (8)

We refer to the above as cred-tf-idf hereafter.

2.2 Sublinear Scaling

It is common practice to apply sublinear scaling to tf. A word occurring (say) ten times more in a document is unlikely to be ten times as important. Paltoglou and Thelwall (2010) confirm that sublinear scaling of term frequency results in significant improvements in various text classification tasks. We employ logarithmic scaling, where tf is replaced with \log(tf) + 1. For our method, \widetilde{tf} is simply replaced with \log(\widetilde{tf}) + 1. We found virtually no difference in performance between log scaling and other sublinear scaling methods (such as augmented scaling, where tf is replaced with tf / \max tf).

2.3 Normalization

Using normalized features resulted in substantial improvements in performance versus using un-normalized features. We thus use \hat{x}^{(d)} = x^{(d)} / \|x^{(d)}\|_2 in the SVM, where x^{(d)} is the feature vector obtained from cred-tf-idf weights for document d.

2.4 Naive-Bayes SVM (NBSVM)

Wang and Manning (2012) achieve excellent (sometimes state-of-the-art) results on many benchmarks using binary Naive Bayes (NB) log-count ratios as features in an SVM. In their framework,

    w_{i,d} = \mathbf{1}\{tf_{i,d}\} \cdot \log \frac{(df_{i,1} + \alpha) / \sum_j (df_{j,1} + \alpha)}{(df_{i,-1} + \alpha) / \sum_j (df_{j,-1} + \alpha)}    (9)

where df_{i,k} is the number of documents that contain token i in class k, \alpha is a smoothing parameter, and \mathbf{1}\{\cdot\} is the indicator function equal to one if tf_{i,d} > 0 and zero otherwise. As an additional benchmark, we implement NBSVM with \alpha = 1.0 and compare against our results.[1]

[1] Wang and Manning (2012) use the same \alpha, but they differ from our NBSVM in two ways. One, they use l2 hinge loss (as opposed to l1 loss in this paper). Two, they interpolate NBSVM weights with Multivariable Naive Bayes (MNB) weights to get the final weight vector. Further, their tokenization is slightly different. Hence our NBSVM results are not directly comparable. We list their results in Table 2.
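Putting Sections 2.1-2.3 together, the cred-tf-idf feature vector for a document can be sketched as follows; this is a rough Python sketch with our own function and variable names, not the authors' code, and it assumes per-class token counts and document frequencies have already been collected from the training set.

    import math
    import numpy as np

    def cred_scores(pos_counts, neg_counts, gamma=1.0):
        """Credibility adjusted scores (Eq. 6). pos_counts/neg_counts map each
        training-set token to C_{i,1} and C_{i,-1}; gamma is the smoothing parameter."""
        tokens = set(pos_counts) | set(neg_counts)
        C = {t: pos_counts.get(t, 0) + neg_counts.get(t, 0) for t in tokens}
        total = sum(C.values())                               # C: count of all tokens
        s_hat = {t: (pos_counts.get(t, 0) ** 2 + neg_counts.get(t, 0) ** 2) / C[t] ** 2
                 for t in tokens}
        s_bar = sum(C[t] * s_hat[t] for t in tokens) / total  # Eq. (5): weighted mean
        return {t: (pos_counts.get(t, 0) ** 2 + neg_counts.get(t, 0) ** 2 + s_bar * gamma)
                   / (C[t] ** 2 + gamma) for t in tokens}     # Eq. (6)

    def cred_tfidf_vector(doc_tf, df, N, s_tilde, vocab):
        """cred-tf-idf document vector: Eqs. (7)-(8) with log scaling (Sec. 2.2)
        and l2 normalization (Sec. 2.3)."""
        x = np.zeros(len(vocab))
        for j, t in enumerate(vocab):
            tf = doc_tf.get(t, 0)
            if tf == 0 or t not in s_tilde or t not in df:
                continue
            tf_tilde = (0.5 + s_tilde[t]) * tf                # Eq. (7)
            tf_tilde = math.log(tf_tilde) + 1.0               # sublinear scaling
            x[j] = tf_tilde * math.log(N / df[t])             # Eq. (8)
        norm = np.linalg.norm(x)
        return x / norm if norm > 0 else x

With \gamma = 1.0 (the setting used in the experiments), tokens with small counts are pulled toward the population mean \bar{s}, while frequent tokens keep scores close to \hat{s}_i.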

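For comparison, the NBSVM features of Equation (9) can be sketched in the same way (again our own names; \alpha = 1.0 as in the re-implementation described above):

    import math
    import numpy as np

    def nbsvm_features(doc_tokens, df_pos, df_neg, vocab, alpha=1.0):
        """Binary NB log-count ratio features (Eq. 9). df_pos/df_neg map each token
        to the number of positive/negative training documents containing it."""
        denom_pos = sum(df_pos.get(t, 0) + alpha for t in vocab)
        denom_neg = sum(df_neg.get(t, 0) + alpha for t in vocab)
        present = set(doc_tokens)
        x = np.zeros(len(vocab))
        for j, t in enumerate(vocab):
            if t in present:                                  # indicator 1{tf_{i,d} > 0}
                p = (df_pos.get(t, 0) + alpha) / denom_pos
                q = (df_neg.get(t, 0) + alpha) / denom_neg
                x[j] = math.log(p / q)
        return x

This corresponds to the uninterpolated re-implementation used here; Wang and Manning (2012)'s full method additionally interpolates these weights with MNB weights (see footnote 1).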
3 Datasets and Experimental Setup

We test our method on both long and short text classification tasks, all of which were used to establish baselines in Wang and Manning (2012). Table 1 has summary statistics of the datasets.

The snippet datasets are:

PL-sh: Short movie reviews with one sentence per review. Classification involves detecting whether a review is positive or negative (Pang and Lee, 2005).[2]

PL-sub: Dataset with short subjective movie reviews and objective plot summaries. Classification task is to detect whether the sentence is objective or subjective (Pang and Lee, 2004).

And the longer document datasets are:

PL-2k: 2000 full-length movie reviews that has become the de facto benchmark for sentiment analysis (Pang and Lee, 2004).

IMDB: 50k full-length movie reviews (25k training, 25k test), from IMDB (Maas et al., 2011).[3]

AthR, XGraph: The 20-Newsgroup dataset, 2nd version with headers removed. Classification task is to classify which topic a document belongs to. AthR: alt.atheism vs religion.misc, XGraph: comp.windows.x vs comp.graphics.

[2] All the PL datasets are available here.
[3] amaas/data/sentiment/index.html

    Dataset   Length   Pos      Neg      Test
    PL-sh                                CV
    PL-sub                               CV
    PL-2k                                CV
    IMDB               12.5k    12.5k    25k
    AthR
    XGraph

Table 1: Summary statistics for the datasets. Length is the average number of unigram tokens (including punctuation) per document. Pos/Neg is the number of positive/negative documents in the training set. Test is the number of documents in the test set (CV means that there is no separate test set for this dataset and thus a 10-fold cross-validation was used to calculate errors). The remaining numeric entries are not recoverable from this transcription.

3.1 Support Vector Machine (SVM)

For each document, we construct the feature vector x^{(d)} using weights obtained from cred-tf-idf with log scaling and l2 normalization. For cred-tf-idf, \gamma is set to 1.0. NBSVM and tf-idf (also with log scaling and l2 normalization) are used to establish baselines. Prediction for a test document is given by

    y^{(d)} = \mathrm{sign}(w^T x^{(d)} + b)    (10)

In all experiments, we use a Support Vector Machine (SVM) with a linear kernel and penalty parameter of C = 1.0. For the SVM, w and b are obtained by minimizing,

    w^T w + C \sum_{d=1}^{N} \max(0, 1 - y^{(d)} (w^T x^{(d)} + b))    (11)

using the LIBLINEAR library (Fan et al., 2008).

3.2 Tokenization

We lower-case all words but do not perform any stemming or lemmatization. We restrict the vocabulary to all tokens that occurred at least twice in the training set.
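The preprocessing in Section 3.2 is easy to restate as code; a small sketch (the helper name is ours):

    from collections import Counter

    def build_vocab(train_docs):
        """Section 3.2: lower-case tokens and keep those occurring at least twice
        in the training set. train_docs is a list of token lists."""
        counts = Counter(tok.lower() for doc in train_docs for tok in doc)
        return sorted(t for t, c in counts.items() if c >= 2)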

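For the classifier itself (Eqs. 10-11), scikit-learn's LinearSVC wraps the same LIBLINEAR library; the sketch below uses placeholder data, and loss='hinge' corresponds to the l1 hinge loss used here (this is our setup for illustration, not the authors' exact script):

    import numpy as np
    from sklearn.svm import LinearSVC

    # Placeholder feature matrices; in the experiments these rows would be the
    # l2-normalized cred-tf-idf (or tf-idf / NBSVM) vectors sketched above.
    rng = np.random.default_rng(0)
    X_train, y_train = rng.random((4, 10)), np.array([1, -1, 1, -1])
    X_test = rng.random((2, 10))

    clf = LinearSVC(C=1.0, loss="hinge", dual=True)   # linear kernel, C = 1.0, Eq. (11)
    clf.fit(X_train, y_train)
    print(clf.predict(X_test))                        # sign(w^T x + b), Eq. (10)

Swapping in tf-idf or NBSVM features in place of cred-tf-idf gives the baseline runs reported in Table 2.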
4 Results and Discussion

For the PL datasets, there are no separate test sets and hence we use 10-fold cross validation (as do other published results) to estimate errors. The standard train-test splits are used on the IMDB and Newsgroup datasets.

4.1 cred-tf-idf outperforms tf-idf

Table 2 has the comparison of results for the different datasets. Our method outperforms the traditional tf-idf on all benchmarks for both unigrams and bigrams. While some of the differences in performance are significant at the 0.05 level (e.g. IMDB), some are not (e.g. PL-2k). The Wilcoxon signed ranks test is a non-parametric test that is often used in cases where two classifiers are compared over multiple datasets (Demsar, 2006). The Wilcoxon signed ranks test indicates that the overall outperformance is significant at the <0.01 level.

4.2 NBSVM outperforms cred-tf-idf

cred-tf-idf did not outperform Wang and Manning (2012)'s NBSVM (Wilcoxon signed ranks test p-value = 0.1). But it did outperform our own implementation of NBSVM, implying that the extra modifications by Wang and Manning (2012) (i.e. using squared hinge loss in the SVM and interpolating between NBSVM and MNB weights) are important contributions of their methodology. This was especially true in the case of shorter documents, where our uninterpolated NBSVM performed significantly worse than their interpolated NBSVM.

4.3 tf-idf still performs well

We find that tf-idf still performs remarkably well with the right scaling and normalization parameters. Indeed, the traditional tf-idf outperformed many of the more sophisticated methods that employ distributed representations (Maas et al. (2011); Socher et al. (2011)) or other weighting schemes (Martineau and Finin (2009); Deng et al. (2014)).

Table 2 compares the following methods on PL-sh, PL-sub, PL-2k, IMDB, AthR and XGraph (the individual accuracy figures are not recoverable from this transcription):

Our results: tf-idf-uni, tf-idf-bi, cred-tf-idf-uni, cred-tf-idf-bi, NBSVM-uni, NBSVM-bi
Wang & Manning: MNB-uni, MNB-bi, NBSVM-uni, NBSVM-bi
Other results: Appr. Tax.*, Str. SVM*, aug-tf-mi, Disc. Conn., Word Vec.*, LLR, RAE, MV-RNN

Table 2: Results of our method (cred-tf-idf) against baselines (tf-idf, NBSVM), using unigrams and bigrams. cred-tf-idf and tf-idf both use log scaling and l2 normalization. Best results (that do not use external sources) are underlined, while the top three are in bold. Rows 7-11 are MNB and NBSVM results from Wang and Manning (2012). Our NBSVM results are not directly comparable to theirs (see footnote 1). Methods with * use external data or software. Appr. Tax.: Uses appraisal taxonomies from WordNet (Whitelaw et al., 2005). Str. SVM: Uses OpinionFinder to find objective versus subjective parts of the review (Yessenalina et al., 2010). aug-tf-mi: Uses augmented term frequency with mutual information gain (Deng et al., 2014). Disc. Conn.: Uses discourse connectors to generate additional features (Trivedi and Eisenstein, 2013). Word Vec.: Learns sentiment-specific word vectors to use as features combined with BoW features (Maas et al., 2011). LLR: Uses log-likelihood ratio on features to select features (Aue and Gamon, 2005). RAE: Recursive autoencoders (Socher et al., 2011). MV-RNN: Matrix-Vector Recursive Neural Networks (Socher et al., 2012).

5 Conclusions and Future Work

In this paper we presented a novel supervised weighting scheme, which we call credibility adjusted term frequency, to perform sentiment analysis and text classification. Our method outperforms the traditional tf-idf weighting scheme on multiple benchmarks, which include both snippets and longer documents. We also showed that tf-idf is competitive against other state-of-the-art methods with the right scaling and normalization parameters. From a performance standpoint, it would be interesting to see if our method is able to achieve even better results on the above tasks with proper tuning of the \gamma parameter. Relatedly, our method could potentially be combined with other supervised variants of tf-idf, either directly or through ensembling, to improve performance further.

References

A. Aue and M. Gamon. 2005. Customizing sentiment classifiers to new domains: A case study. In Proceedings of the International Conference on Recent Advances in NLP.

H. Buhlmann and A. Gisler. 2005. A Course in Credibility Theory and its Applications. Springer-Verlag, Berlin.

F. Debole and F. Sebastiani. 2003. Supervised Term Weighting for Automated Text Categorization. In Proceedings of the 2003 ACM Symposium on Applied Computing.

J. Demsar. 2006. Statistical Comparison of Classifiers over Multiple Data Sets. Journal of Machine Learning Research, 7:1-30.

Z. Deng, K. Luo, and H. Yu. 2014. A study of supervised term weighting scheme for sentiment analysis. Expert Systems with Applications, Volume 41, Issue 7.

R. Fan, K. Chang, C. Hsieh, X. Wang, and C. Lin. 2008. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research, 9:1871-1874, June.

A. Maas, R. Daly, P. Pham, D. Huang, A. Ng, and C. Potts. 2011. Learning Word Vectors for Sentiment Analysis. In Proceedings of ACL.

J. Martineau and T. Finin. 2009. Delta TFIDF: An Improved Feature Space for Sentiment Analysis. In Third AAAI International Conference on Weblogs and Social Media.

G. Paltoglou and M. Thelwall. 2010. A study of Information Retrieval weighting schemes for sentiment analysis. In Proceedings of ACL.

B. Pang and L. Lee. 2004. A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. In Proceedings of ACL.

B. Pang and L. Lee. 2005. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In Proceedings of ACL.

R. Socher, J. Pennington, E. Huang, A. Ng, and C. Manning. 2011. Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions. In Proceedings of EMNLP.

R. Socher, B. Huval, C. Manning, and A. Ng. 2012. Semantic Compositionality through Recursive Matrix-Vector Spaces. In Proceedings of EMNLP.

R. Trivedi and J. Eisenstein. 2013. Discourse Connectors for Latent Subjectivity in Sentiment Analysis. In Proceedings of NAACL.

G. Salton and M. McGill. 1983. Introduction to Modern Information Retrieval. McGraw-Hill.

S. Wang and C. Manning. 2012. Baselines and Bigrams: Simple, Good Sentiment and Topic Classification. In Proceedings of ACL.

D. Wang and H. Zhang. 2013. Inverse-Category-Frequency Based Supervised Term Weighting Schemes for Text Categorization. Journal of Information Science and Engineering, 29.

C. Whitelaw, N. Garg, and S. Argamon. 2005. Using appraisal taxonomies for sentiment analysis. In Proceedings of CIKM.

A. Yessenalina, Y. Yue, and C. Cardie. 2010. Multilevel Structured Models for Document-level Sentiment Classification. In Proceedings of ACL.
