Semi-Supervised Learning using Higher-Order Co-occurrence Paths to Overcome the Complexity of Data Representation


Semi-Supervised Learning using Higher-Order Co-occurrence Paths to Overcome the Complexity of Data Representation

Murat Can Ganiz, Computer Engineering Department, Faculty of Engineering, Marmara University, İstanbul, Turkey

Abstract: We present a novel approach to semi-supervised learning for text classification based on the higher-order co-occurrence paths of words. We name the proposed method Semi-Supervised Semantic Higher-Order Smoothing (S3HOS). S3HOS is built on a tri-partite graph-based data representation of labeled and unlabeled documents that allows the semantics in higher-order co-occurrence paths between terms (words) to be exploited. Several graph-based techniques have been proposed in the literature to diffuse class labels from labeled documents to unlabeled documents. In this study we propose a different and natural way of estimating class-conditional probabilities for the terms in unlabeled documents without needing to label those documents first. The proposed approach estimates class-conditional probabilities for the terms in unlabeled documents and, at the same time, improves the estimates for the terms in the labeled documents. We experimentally show that S3HOS can substantially improve parameter estimation and hence increase classification accuracy, particularly when labeled data is scarce but unlabeled data is plentiful.

Keywords: Semi-Supervised Learning, Naïve Bayes, Semantic Smoothing, Higher-Order Naïve Bayes, Text Classification.

I. INTRODUCTION

Semi-supervised classification algorithms aim to improve classifier accuracy by incorporating large amounts of unlabeled data, which is readily available or easy to acquire. In many real-world applications it is costly to obtain labeled instances, so using unlabeled instances to improve accuracy can be a much more cost-effective alternative than labeling more instances. Research on semi-supervised learning (SSL) has received considerable attention in recent years in several domains, including image classification and text classification, both of which are domains in which labeling instances is particularly expensive because the size and complexity of the data demand human intervention. Several tutorials and surveys on semi-supervised learning can be found in [15], [16], [17]. There are several different approaches to semi-supervised learning, and in particular to semi-supervised classification. These include generative models, semi-supervised Support Vector Machines, graph-based models [24], and co-training and multi-view models [15]. In the text classification domain, one of the best-known generative models is the semi-supervised variant of the Multinomial Naïve Bayes algorithm proposed by Nigam, McCallum, and Mitchell [18]. That study uses a generative classifier, Naïve Bayes (NB), to classify unlabeled instances starting from a small training set of labeled instances. This classification of the unlabeled instances is applied iteratively and optimized using Expectation Maximization (EM). Our approach bears similarities to generative models since we use parameter estimates similar to those of Naïve Bayes (NB). Additionally, it can be considered a graph-based model since we too use graph data structures. On the other hand, our approach diverges from generative models by not employing an iterative procedure to label the unlabeled instances, and it differs from graph-based models in that our graph data structures model the terms (words) in documents rather than the documents themselves.
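For context, the EM-based semi-supervised Naïve Bayes baseline described above can be summarized with a minimal sketch. This is not the authors' code: it assumes dense term-count arrays, uses scikit-learn's MultinomialNB, and for brevity assigns hard labels to the unlabeled documents in each iteration, whereas Nigam et al. weight each unlabeled document by its class posterior.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

def em_nb(X_labeled, y_labeled, X_unlabeled, n_iter=10):
    """EM-style semi-supervised Naive Bayes (hard assignments for brevity)."""
    clf = MultinomialNB().fit(X_labeled, y_labeled)   # initialize from labeled data only
    for _ in range(n_iter):
        y_hat = clf.predict(X_unlabeled)              # E-step: label the unlabeled documents
        X_all = np.vstack([X_labeled, X_unlabeled])   # M-step: refit on the union
        y_all = np.concatenate([y_labeled, y_hat])
        clf = MultinomialNB().fit(X_all, y_all)
    return clf
```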
Furthermore, many graph-based SSL algorithms are intrinsically transductive [19], and as a result they cannot naturally classify previously unseen instances that are not present during training. In this paper, we present a novel approach to semi-supervised learning based on the higher-order co-occurrence paths of words. We name the proposed method Semi-Supervised Semantic Higher-Order Smoothing (S3HOS). It is a higher-order learning approach. Higher-order learning approaches [3], [4], [13] exploit the implicit relations that stem from the basic structure of natural language. If terms or words appear together in a meaningful context such as a sentence or a document, they clearly have a semantic relationship, and these relationships are transitive, which leads to higher-order co-occurrences. These relations, specifically the higher-order co-occurrence relations, link terms in different contexts such as documents, thereby blurring document boundaries. In our case, higher-order co-occurrence relations can link the terms in unlabeled documents to those in labeled documents. More specifically, S3HOS is built on a graph-based data representation similar to that of HOS [13], which allows the semantics in higher-order co-occurrence paths between terms (words) to be exploited.

In contrast to the several graph-based techniques proposed in the literature to diffuse class labels from labeled documents to unlabeled documents, we propose a novel and natural way of estimating class-conditional probabilities for the terms in unlabeled documents without needing to label them first. The approach allows estimating class-conditional probabilities even for terms that occur only in unlabeled documents while, at the same time, improving the estimates for terms in the labeled documents. We experimentally show that S3HOS can markedly improve parameter estimation and increase classification accuracy, particularly when labeled data is scarce and unlabeled data is plentiful. The remainder of this paper is organized as follows: Section 2 presents background information on studies of higher-order co-occurrence, Section 3 presents and analyzes the proposed classifier algorithm, Section 4 presents the experimental setup and introduces the datasets, Section 5 presents the experimental results together with a discussion, and the last section presents conclusions and future work.

II. BACKGROUND

Traditional machine learning algorithms assume that instances are independent and identically distributed (IID) [1]. Several studies exploit explicit link information in order to overcome the shortcomings of the IID assumption in supervised learning [1], [2]. However, the use of explicit links has a significant drawback: in order to classify a single instance, additional context needs to be provided. In addition, there are many graph-based semi-supervised models which rely on explicit relationships generated with a distance or similarity metric between documents [20], [21], [23], [24]. These models are highly sensitive to how the adjacency graph is constructed from the documents and to the regularization parameters [22]. The Latent Semantic Indexing (LSI) algorithm [6] is a widely used technique in Text Mining and Information Retrieval. It has been shown that LSI takes advantage of implicit higher-order (or latent) structure in the association of terms and documents; higher-order relations in LSI capture latent semantics [7]. There are several LSI-based classifiers. Among these, the authors of [8] propose an LSI-based k-Nearest Neighborhood (LSI-kNN) algorithm in a semi-supervised setting for short text classification (SS-LSI-kNN). Several previous studies have developed LSI-motivated approaches that explicitly make use of higher-order relations. These include a novel Bayesian framework called Higher-Order Naïve Bayes (HONB) [3], [4], a novel data-driven space transformation that allows vector space classifiers to take advantage of relational dependencies captured by higher-order paths between features [3], and Higher-Order Support Vector Machines (HOSVM) [5]. In general, Naïve Bayes parameter estimation suffers drastically from sparsity due to the very large number of parameters to be estimated in text classification. The number of parameters is |V||C| + |C|, where V denotes the dictionary and C denotes the set of class labels [10]. The sparsity of text data increases the importance of smoothing methods, since they are intended to distribute a certain amount of probability mass to previously unseen events. Based on similar principles, Higher-Order Smoothing (HOS) for Naïve Bayes text classification takes the concept of exploiting higher-order relations one step further and exploits the relationships between instances of different classes in order to improve parameter estimation [13]. HOS is built on the graph-based data representation of previous algorithms in the higher-order learning framework such as HONB [4].
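As a rough back-of-the-envelope illustration (using the figures reported later in the experimental setup: a vocabulary reduced to 2,000 terms and a four-class problem such as WebKB4), the parameter count is

$$ |V|\,|C| + |C| = 2000 \times 4 + 4 = 8004, $$

all of which must be estimated from only a small labeled training set at the lowest training percentages, which is why smoothing plays such a central role.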
While in HONB higher-order paths are extracted in the context of a particular class, in HOS higher-order paths are extracted from the whole training set, including documents of all classes. Consequently, it is possible to exploit the relations between the terms and documents of different classes. In other words, the basic units for parameter estimation, the higher-order paths, move not only beyond document boundaries but also beyond class boundaries to exploit the latent semantic information between terms. In this study, we develop the concept of higher-order paths even further to include unlabeled documents along with the labeled dataset.

III. APPROACH

Higher-order paths simultaneously capture term co-occurrences within documents as well as term-sharing patterns across documents, and in doing so provide a much richer data representation than the traditional feature vector form [3]. A higher-order co-occurrence path is shown in Fig. 1. The figure depicts three documents, d_1, d_2 and d_3, each containing two terms: t_1, t_2 in d_1; t_2, t_3 in d_2; and t_3, t_4 in d_3. Below the documents, a small graph of four nodes and three edges is illustrated. This chain represents a higher-order path that links term t_1 with term t_4 through t_2 and t_3. It is an example of a third-order path, since three links, or hops, connect t_1 with t_4. Similarly, there is a second-order path between t_1 and t_3 through t_2: t_1 co-occurs with t_2 in document d_1, and t_2 co-occurs with t_3 in document d_2. Even if terms t_1 and t_3 never co-occur in any document of a corpus, the regularity of these second-order paths may reveal latent semantic relationships such as synonymy [13].

Fig. 1. Higher-order co-occurrence. Adopted from [7].
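The chaining of co-occurrences in Fig. 1 can be illustrated with a small sketch. This is only an approximation of the idea, not the authors' path-counting algorithm: the published HONB/HOS methods require the documents and terms along a higher-order path to be distinct, a constraint that a plain matrix product does not enforce.

```python
import numpy as np

# Binary document-term matrix for the Fig. 1 example (rows: d1..d3, columns: t1..t4).
X = np.array([[1, 1, 0, 0],    # d1 contains t1, t2
              [0, 1, 1, 0],    # d2 contains t2, t3
              [0, 0, 1, 1]])   # d3 contains t3, t4

C1 = X.T @ X                   # first-order: terms sharing at least one document
np.fill_diagonal(C1, 0)

C2 = C1 @ C1                   # second-order: term pairs linked through a bridge term
np.fill_diagonal(C2, 0)

print(C1[0, 2])  # 0  -> t1 and t3 never co-occur directly
print(C2[0, 2])  # 1  -> but they are linked by the second-order path t1-d1-t2-d2-t3
```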

In [13], a tri-partite graph is used to represent terms, documents, and class labels. In this study we use a similar tri-partite graph structure. However, in our case the graph is created from the combination of the labeled and unlabeled sets of documents. Therefore, as can be seen in Fig. 2, the set of terms and the set of documents each consist of two regions indicating whether they come from the labeled or the unlabeled part of the dataset. The nodes representing documents consist of two separate node sets: labeled documents D_L and unlabeled documents D_U. Similarly, the nodes representing terms are also divided into two sets, T_U and T_L.

Fig. 2. Tri-partite graph representation of the labeled (D_L) and unlabeled (D_U) sets of documents, terms, and class labels, showing a second-order co-occurrence path between a term and a class label.

In general we use the tri-partite graph structure depicted in Fig. 2 and the following definitions in the remaining sections:

D_U: the set of unlabeled documents
D_L: the set of labeled documents
T_U: the set of terms that occur in unlabeled documents
T_L: the set of terms that occur in labeled documents
T_OU: the set of terms that occur only in unlabeled documents; T_OU = T_U \ T_L
T_OL: the set of terms that occur only in labeled documents
T_UL: the set of terms that occur in both labeled and unlabeled documents; T_UL = T_U ∩ T_L

We assume that D_U ∩ D_L = ∅ and T_UL ≠ ∅. (t_n, c_m)_1 is the number of first-order paths between term t_n and class label c_m. These are basically co-occurrences in binary form, corresponding to the number of documents in class m that include term t_n. (t_n, c_m)_2 is the number of second-order paths between t_n and class label c_m; these are higher-order co-occurrences. The path t_1 - d_1 - t_n - d_{r+1} - c_1 is an example of a second-order co-occurrence path between t_1 and c_1, indicated by bold dashed lines in Fig. 2. Looking carefully at this example, it is a special case that links a term occurring only in unlabeled documents to a class label, since t_1 ∈ T_OU, t_n ∈ T_UL, d_1 ∈ D_U, and d_{r+1} ∈ D_L. By using this kind of relation we can estimate parameters for the terms in unlabeled documents, which is otherwise not possible without labeling them first. In S3HOS, similarly to HOS [13], the parameters of the Naïve Bayes model can be calculated from higher-order paths (i.e., second-order paths) instead of documents, as in Eqs. (1) and (2):

$$ P_1(t_n \mid c_m) = \frac{(t_n, c_m)_1}{\sum_{t_i \in T} (t_i, c_m)_1} \quad (1) $$

$$ P_2(t_n \mid c_m) = \frac{(t_n, c_m)_2}{\sum_{t_i \in T} (t_i, c_m)_2} \quad (2) $$

In order to fuse the power of first-order and second-order paths, we combine the probability estimates from first-order and second-order paths using linear interpolation, as in Eq. (3). If λ = 1, only the second-order paths are used in the parameter estimates; if λ = 0, only the first-order co-occurrences are used. It is important to note that λ also controls the effect of the labeled documents on the overall estimate, since only the terms in labeled documents can have first-order co-occurrences with class labels. In other words, terms that occur only in unlabeled documents cannot have a first-order co-occurrence with a class label.

$$ P(t_n \mid c_m) = \frac{(1-\lambda)(t_n, c_m)_1 + \lambda (t_n, c_m)_2 + s}{\sum_{t_i \in T} \left( (1-\lambda)(t_i, c_m)_1 + \lambda (t_i, c_m)_2 \right) + 2s} \quad (3) $$

Here s is a smoothing parameter used to avoid zero probabilities; we assign it a very small value (0 < s << 1) in our experiments. The second parameter of the Naïve Bayes model, the class prior probability for class m, can be calculated using Eq. (4). We assume that each class includes at least one document.

$$ P(c_m) = \frac{\sum_{t_i \in T} \left( (1-\lambda)(t_i, c_m)_1 + \lambda (t_i, c_m)_2 \right)}{\sum_{c_j \in C} \sum_{t_i \in T} \left( (1-\lambda)(t_i, c_j)_1 + \lambda (t_i, c_j)_2 \right)} \quad (4) $$

This new formulation makes it possible to calculate class-conditional probabilities for some of the terms that do not occur in the documents of a particular class, as well as for terms that do not occur in the documents of the labeled dataset at all.
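The estimation in Eqs. (3) and (4) can be sketched directly in code. This sketch assumes the first- and second-order path counts between terms and class labels have already been extracted from the combined labeled and unlabeled tri-partite graph into |T| x |C| arrays N1 and N2; the classify helper is simply the standard multinomial NB decision rule applied to these estimates.

```python
import numpy as np

def s3hos_parameters(N1, N2, lam=0.5, s=1e-6):
    """N1, N2: |T| x |C| first- and second-order path counts; lam = lambda, s = smoothing."""
    mixed = (1.0 - lam) * N1 + lam * N2                        # interpolated path counts
    p_t_given_c = (mixed + s) / (mixed.sum(axis=0) + 2 * s)    # Eq. (3)
    p_c = mixed.sum(axis=0) / mixed.sum()                      # Eq. (4): class priors
    return p_t_given_c, p_c

def classify(doc_term_counts, p_t_given_c, p_c):
    # Multinomial Naive Bayes decision rule in log space using the S3HOS estimates.
    log_posterior = np.log(p_c) + doc_term_counts @ np.log(p_t_given_c)
    return np.argmax(log_posterior)
```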
Furthermore, the use of second-order paths provides more evidence for the terms that exist in a particular class by making use of the terms and documents in the unlabeled dataset. The higher-order paths capture term co-occurrences within a class of documents, term relation patterns across classes, and term relation patterns across labeled and unlabeled documents. Consequently, the Naïve Bayes type probability estimation extracts patterns from a much richer data representation than the traditional vector space representation.

IV. EXPERIMENT SETUP

We use three different datasets, including widely used benchmark datasets in the text classification domain.

The first dataset is a widely used dataset in the text classification literature named WebKB. The WebKB dataset includes web pages collected from the computer science departments of different universities, organized into seven categories. We use the four-class version of the WebKB dataset, which was also used in previous work [13]; we call this dataset WebKB4 in the following sections. The WebKB4 dataset has a highly skewed class distribution. The second dataset, called 1150haber, includes 1150 news articles in five categories (economy, magazine, health, politics and sports) collected from Turkish online newspapers [11]. Our third dataset, imdb, is a movie review dataset commonly used in sentiment analysis studies [14], since each document is labeled as negative or positive. The class label, document and vocabulary statistics are given in Table I, where C is the number of classes, D is the number of documents, and V is the vocabulary size, i.e., the number of distinct terms. All datasets are preprocessed using stop word removal, stemming, and dimensionality reduction to 2,000 terms using Information Gain (IG).

TABLE I. DESCRIPTIONS OF THE DATASETS BEFORE PREPROCESSING
Data Set     C    D        V
WebKB4       4             16,116
1150Haber    5    1,150    11,038
imdb         2             16,679

We also provide term-based statistics for each dataset in Table II. As can be seen from this table, the WebKB4 dataset diverges from the others with much longer documents and the largest deviation in document lengths. Such divergences may have a notable effect on the results of our model, since the higher-order path based models fundamentally rely on the quantity and quality of the terms shared between documents.

TABLE II. THE TERM-BASED CHARACTERISTICS OF THE DATASETS (average and standard deviation of terms per document, average and standard deviation of term length, and sparsity, for WebKB4, 1150Haber and imdb)

All of these datasets are fully labeled, i.e., each and every document has a single class label. In our experiments, in order to measure the performance of the semi-supervised algorithms, we divide each dataset into three chunks, a training set, an unlabeled set and a test set, using several different percentages. The class labels of the unlabeled set are removed. The test set percentage is kept constant at 20%. The supervised algorithms are trained with the training set and evaluated on the test set, while the semi-supervised algorithms are trained with both the training set and the unlabeled set (which includes no class labels) and evaluated on the test set likewise. These percentages can be seen in Table III.

TABLE III. LABELED AND UNLABELED DATA SPLIT PERCENTAGES (training set %, unlabeled set %, test set %; the test set is fixed at 20%)

Ten folds are created for each dataset and each of the six levels by randomly selecting the training set percentage and the unlabeled set percentage of the data, and leaving the remaining 20% for the test set. Stratified sampling is used while dividing the datasets, i.e., the class distributions are preserved. This approach is similar to [9], [12], [13]. Similarly to [13], the experimental results are reported as the average accuracy and standard deviation (StdDev) over these 10 folds. We employ paths up to second order, based on the experimental results of previous studies [3], [4], [7], [13].
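A hedged sketch of the stratified labeled/unlabeled/test split described above, using scikit-learn: the test portion is held out first at a fixed 20%, and the remaining documents are divided into a small labeled training set and a large unlabeled set according to the chosen percentage level. The function name and interface are illustrative, not taken from the paper.

```python
from sklearn.model_selection import train_test_split

def split_dataset(X, y, train_pct, seed=0):
    # Hold out a stratified 20% test set first.
    X_rest, X_test, y_rest, y_test = train_test_split(
        X, y, test_size=0.20, stratify=y, random_state=seed)
    # From the remaining 80%, keep train_pct of the whole dataset as labeled
    # data and treat the rest as unlabeled (its labels are discarded).
    labeled_frac = train_pct / 0.80
    X_lab, X_unlab, y_lab, _ = train_test_split(
        X_rest, y_rest, train_size=labeled_frac, stratify=y_rest, random_state=seed)
    return X_lab, y_lab, X_unlab, X_test, y_test
```

For example, split_dataset(X, y, 0.01) yields roughly the 1% labeled / 79% unlabeled / 20% test configuration used at the smallest level.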
We compare against several common semi-supervised text classifiers: semi-supervised Naïve Bayes (EM-NB) [18] and semi-supervised Latent Semantic Indexing k-Nearest Neighborhood (SS-LSI-kNN) [8], [4], [13]. We use the exact same parameter settings for LSI-kNN: the number of neighbors is set to 25, and the k parameter of LSI is automatically optimized for each dataset and training set percentage using the Kaiser approach. The SS-LSI-kNN algorithm is the semi-supervised variant of the LSI-kNN algorithm used in [13]. This is simply achieved by providing the union of the labeled and unlabeled datasets as input to the LSI algorithm. However, the kNN part of the algorithm only considers the labeled instances in the LSI space, which is generated from the union of the labeled instances and the large number of unlabeled instances. Due to the extremely high computational and space complexity of the LSI algorithm and the extensive number of experiments we conduct for each dataset (10 folds for each of the 6 training set percentages), we could not obtain SS-LSI-kNN results for our largest dataset, WebKB4; our computational resources simply do not allow these runs to finish in a reasonable time frame.
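A minimal sketch of the SS-LSI-kNN baseline as just described: LSI is fit on the union of labeled and unlabeled documents, but the kNN step only considers the labeled documents in the reduced space. This assumes dense term matrices and takes the number of LSI dimensions k as a plain argument; the paper selects it automatically with the Kaiser criterion, which is not reproduced here.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.neighbors import KNeighborsClassifier

def ss_lsi_knn(X_lab, y_lab, X_unlab, X_test, k=100, n_neighbors=25):
    svd = TruncatedSVD(n_components=k)
    svd.fit(np.vstack([X_lab, X_unlab]))      # LSI space from labeled + unlabeled docs
    knn = KNeighborsClassifier(n_neighbors=n_neighbors)
    knn.fit(svd.transform(X_lab), y_lab)      # neighbors are drawn from labeled docs only
    return knn.predict(svd.transform(X_test))
```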

V. EXPERIMENT RESULTS AND DISCUSSION

In this section we provide a detailed empirical comparison of the proposed algorithm, Semi-Supervised Semantic Higher-Order Smoothing (S3HOS), with several semi-supervised algorithms. Although we calculate quite a few different evaluation metrics, such as the F-measure (F1) and the area under the ROC curve (AUC), we use accuracy as our main evaluation metric in this section, as it is the most common approach.

In the following figures the algorithm we propose in this study is indicated as S3HOS, accompanied by the λ (lambda) value, which shows how the first-order and second-order paths are combined in the parameter estimates. λ = 1 means that only the second-order paths are used, whereas λ = 0.5 means that first-order and second-order paths are combined with equal weights; see the Approach section for the details of the combination method. The figures compare S3HOS with the semi-supervised algorithms semi-supervised Naïve Bayes (EM+MNB) and SS-LSI-kNN. We consider EM+MNB to be our baseline, since it too is a Naïve Bayes based semi-supervised algorithm.

We start with the imdb dataset, since we obtained the most interesting results on this dataset, as can be seen in Fig. 3. The lambda parameter has a critical influence on the performance of S3HOS on this dataset. S3HOS with λ = 1 achieves 88% accuracy, which is more than a 10% difference from its closest competitor, when the training set size is only 1% and the unlabeled data size is 79%. This is the highest gain we observe among all our experimental results. Even S3HOS with λ = 0.5 performs better than our baseline EM+MNB throughout the chart. Interestingly, the performance of S3HOS with λ = 1 (second-order paths only) seems to decrease slightly as we increase the labeled dataset percentage. SS-LSI-kNN performs worse than the other algorithms, which is usually the case throughout the experiments.

Fig. 3. Accuracy comparison of S3HOS with semi-supervised algorithms on the imdb dataset.

Our second dataset is 1150haber, and in Fig. 4 the results are again quite interesting. S3HOS with λ = 1 (second-order paths only) achieves an accuracy of more than 90% using only 1% of the data as the labeled training set and 79% of the data as the unlabeled set. The accuracy reaches almost 95% when the training set size is increased to 5%.

Fig. 4. Accuracy comparison of S3HOS with semi-supervised algorithms on the 1150haber dataset.

S3HOS also surpasses our baseline semi-supervised algorithm EM+MNB on the WebKB4 dataset, as can be seen in Fig. 5. Interestingly, the lambda parameter seems to have a different effect here; in other words, first-order paths seem to play a more important role on this dataset. We attribute this phenomenon to the highly skewed class distribution of the dataset. Furthermore, this dataset has larger documents than the others, and the differences between the documents of different classes are relatively smaller. As a result, the higher-order paths may be introducing much more noise into the algorithm than on the other datasets.

Fig. 5. Accuracy comparison of S3HOS with semi-supervised algorithms on the WebKB4 dataset.

Overall, our experiments show that S3HOS outperforms both our baseline semi-supervised model EM+MNB and SS-LSI-kNN (we could not provide results of the latter for our larger dataset WebKB4 due to its exceptionally high complexity).

VI. CONCLUSIONS AND FUTURE WORK

We present a novel and natural approach to semi-supervised learning for text classification based on the Higher-Order Smoothing (HOS) method developed in our previous study. We name the proposed method Semi-Supervised Semantic Higher-Order Smoothing (S3HOS). S3HOS is built on a graph-based data representation of documents, terms and class labels, which allows the semantics in higher-order co-occurrence paths between terms (and also the class labels) to be exploited. Several graph-based techniques have been proposed in the literature to diffuse class labels from labeled documents to unlabeled documents. In this study we propose a novel and natural way of estimating class-conditional probabilities for the terms in unlabeled documents without needing to label them first. The approach allows estimating class-conditional probabilities even for the terms in unlabeled documents while, at the same time, improving the estimates for the terms in the labeled documents. This new formulation makes it possible to calculate class-conditional probabilities for some of the terms that do not occur in the documents of a particular class, as well as for terms that do not occur in the documents of the labeled dataset at all. Furthermore, the use of second-order paths provides more evidence for the terms that exist in a particular class by making use of the terms and documents in the unlabeled dataset. Higher-order paths capture term co-occurrences within a class of documents, term relation patterns across classes, and term relation patterns across labeled and unlabeled documents. Consequently, the Naïve Bayes type probability estimation extracts patterns from a much richer data representation than the traditional vector space representation. We experimentally show that S3HOS can markedly improve parameter estimation and increase classification accuracy, particularly when labeled data is scarce and unlabeled data is plentiful.

ACKNOWLEDGMENT

This work is supported in part by The Scientific and Technological Research Council of Turkey (TÜBİTAK), grant number 111E239. Points of view in this document are those of the authors and do not necessarily represent the official position or policies of TÜBİTAK.

REFERENCES

[1] Taskar B, Abbeel P, Koller D. Discriminative Probabilistic Models for Relational Data. In Proc. Uncertainty in Artificial Intelligence (UAI02), Alberta, Canada, August 1-4, 2002.
[2] Getoor L, Diehl C P. Link Mining: A Survey. ACM SIGKDD Explorations Newsletter, 2005, 7(2).
[3] Ganiz M C, Lytkin N, Pottenger W M. Leveraging Higher Order Dependencies between Features for Text Classification. In Proc. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), Bled, Slovenia, September 2009.
[4] Ganiz M C, George C, Pottenger W M. Higher Order Naïve Bayes: A Novel Non-IID Approach to Text Classification. IEEE Transactions on Knowledge and Data Engineering, 2011, 23(7).
[5] Lytkin N. Variance-based clustering methods and higher order data transformations and their applications. Ph.D. Thesis, Rutgers University, NJ, U.S.A., 2009.
[6] Deerwester S C, Dumais S T, Landauer T K, Furnas G W, Harshman R A. Indexing by Latent Semantic Analysis. Journal of the American Society for Information Science, 41(6), 1990.
[7] Kontostathis A, Pottenger W M. A Framework for Understanding LSI Performance. Information Processing and Management, 42(1), 2006.
[8] Zelikovitz S, Hirsh H. Transductive LSI for Short Text Classification Problems. In Proc. of the 17th International Florida Artificial Intelligence Research Society Conference (FLAIRS), Miami Beach, FL, May 2004.
[9] McCallum A, Nigam K. A Comparison of Event Models for Naive Bayes Text Classification. In Proc. AAAI 1998 Workshop on Learning for Text Categorization, Madison, Wisconsin, July 1998.
[10] McCallum A, Nigam K. Text classification by bootstrapping with keywords, EM and shrinkage. In Working Notes of the ACL 1999 Workshop on Unsupervised Learning in Natural Language Processing (ACL), Maryland, USA.
[11] Amasyalı M F, Beken A. Türkçe Kelimelerin Anlamsal Benzerliklerinin Ölçülmesi ve Metin Sınıflandırmada Kullanılması (Measurement of Turkish Word Semantic Similarity and Text Categorization Application). In Proc. IEEE Sinyal İşleme ve İletişim Uygulamaları Kurultayı (SIU) (Signal Processing and Communications Applications Conference), Kocaeli, Turkey, April.
[12] Rennie J D, Shih L, Teevan J, Karger D R. Tackling the poor assumptions of naive Bayes text classifiers. In Proc. ICML 2003, August 2003.
[13] Poyraz M, Kilimci Z H, Ganiz M C. Higher-Order Smoothing: A Novel Semantic Smoothing Method for Text Classification. Journal of Computer Science and Technology, 29(3), 2014.
[14] Pang B, Lee L. A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. In Proc. of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL), July 2004, p. 271.
[15] Zhu X. Semi-Supervised Learning Literature Survey. University of Wisconsin-Madison, Computer Sciences Technical Report, last modified December 14, 2007.
[16] Chapelle O, Schölkopf B, Zien A. Semi-supervised learning. MIT Press, 2006.
[17] Zhu X. Semi-supervised learning. In Encyclopedia of Machine Learning. Springer US, 2010.
[18] Nigam K, McCallum A, Mitchell T. Semi-supervised text classification using EM. In Semi-Supervised Learning, 2006.
[19] Krishnapuram B, Williams D, Xue Y, Carin L, Figueiredo M, Hartemink A J. On semi-supervised classification. In Advances in Neural Information Processing Systems, 2004.
[20] Belkin M, Niyogi P, Sindhwani V. Manifold regularization: A geometric framework for learning from labeled and unlabeled examples. The Journal of Machine Learning Research, 7, 2006.
[21] Wang F, Zhang C. Label propagation through linear neighborhoods. IEEE Transactions on Knowledge and Data Engineering, 20(1), 2008.
[22] de Sousa C A R, Rezende S O, Batista G E. Influence of graph construction on semi-supervised learning. In Machine Learning and Knowledge Discovery in Databases. Springer Berlin Heidelberg, 2013.
[23] Lebichot B, Kivimäki I, Françoisse K, Saerens M. Semi-supervised Classification Through the Bag-of-Paths Group Betweenness. IEEE Transactions on Neural Networks and Learning Systems, 25(6), 2014.
[24] Subramanya A, Talukdar P P. Graph-Based Semi-Supervised Learning. Synthesis Lectures on Artificial Intelligence and Machine Learning, 8(4), 2014.


BioTechnology. An Indian Journal FULL PAPER. Trade Science Inc. [Type text] [Type text] [Type text] ISSN : 0974-74 Volume 0 Issue BoTechnology 04 An Indan Journal FULL PAPER BTAIJ 0() 04 [684-689] Revew on Chna s sports ndustry fnancng market based on market -orented

More information

A Multivariate Analysis of Static Code Attributes for Defect Prediction

A Multivariate Analysis of Static Code Attributes for Defect Prediction Research Paper) A Multvarate Analyss of Statc Code Attrbutes for Defect Predcton Burak Turhan, Ayşe Bener Department of Computer Engneerng, Bogazc Unversty 3434, Bebek, Istanbul, Turkey {turhanb, bener}@boun.edu.tr

More information

Type-2 Fuzzy Non-uniform Rational B-spline Model with Type-2 Fuzzy Data

Type-2 Fuzzy Non-uniform Rational B-spline Model with Type-2 Fuzzy Data Malaysan Journal of Mathematcal Scences 11(S) Aprl : 35 46 (2017) Specal Issue: The 2nd Internatonal Conference and Workshop on Mathematcal Analyss (ICWOMA 2016) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES

More information

Support Vector Machines

Support Vector Machines Support Vector Machnes Decson surface s a hyperplane (lne n 2D) n feature space (smlar to the Perceptron) Arguably, the most mportant recent dscovery n machne learnng In a nutshell: map the data to a predetermned

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

Face Recognition Based on SVM and 2DPCA

Face Recognition Based on SVM and 2DPCA Vol. 4, o. 3, September, 2011 Face Recognton Based on SVM and 2DPCA Tha Hoang Le, Len Bu Faculty of Informaton Technology, HCMC Unversty of Scence Faculty of Informaton Scences and Engneerng, Unversty

More information

Fast Feature Value Searching for Face Detection

Fast Feature Value Searching for Face Detection Vol., No. 2 Computer and Informaton Scence Fast Feature Value Searchng for Face Detecton Yunyang Yan Department of Computer Engneerng Huayn Insttute of Technology Hua an 22300, Chna E-mal: areyyyke@63.com

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

SHAPE RECOGNITION METHOD BASED ON THE k-nearest NEIGHBOR RULE

SHAPE RECOGNITION METHOD BASED ON THE k-nearest NEIGHBOR RULE SHAPE RECOGNITION METHOD BASED ON THE k-nearest NEIGHBOR RULE Dorna Purcaru Faculty of Automaton, Computers and Electroncs Unersty of Craoa 13 Al. I. Cuza Street, Craoa RO-1100 ROMANIA E-mal: dpurcaru@electroncs.uc.ro

More information

Description of NTU Approach to NTCIR3 Multilingual Information Retrieval

Description of NTU Approach to NTCIR3 Multilingual Information Retrieval Proceedngs of the Thrd NTCIR Workshop Descrpton of NTU Approach to NTCIR3 Multlngual Informaton Retreval Wen-Cheng Ln and Hsn-Hs Chen Department of Computer Scence and Informaton Engneerng Natonal Tawan

More information

Learning Semantics-Preserving Distance Metrics for Clustering Graphical Data

Learning Semantics-Preserving Distance Metrics for Clustering Graphical Data Learnng Semantcs-Preservng Dstance Metrcs for Clusterng Graphcal Data Aparna S. Varde, Elke A. Rundenstener Carolna Ruz Mohammed Manruzzaman,3 Rchard D. Ssson Jr.,3 Department of Computer Scence Center

More information

Reducing Frame Rate for Object Tracking

Reducing Frame Rate for Object Tracking Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg

More information