A Hybrid Text Classification System Using Sentential Frequent Itemsets
Shizhu Liu, Heping Hu
College of Computer Science, Huazhong University of Science and Technology, Wuhan, China

Abstract: Text classification techniques mostly rely on single-term analysis of the document data set, while many concepts, especially the more specific ones, are usually conveyed by sets of terms. To achieve a more accurate text classifier, more informative features, including frequently co-occurring words in the same sentence and their weights, are particularly important in such scenarios. In this paper, we propose a novel approach to text classification that uses the sentential frequent itemset, a concept that comes from association rule mining. The approach views a sentence rather than a document as a transaction, and uses a variable precision rough set based method to evaluate each sentential frequent itemset's contribution to the classification. Experiments over the Reuters corpus are carried out, which validate the practicability of the proposed system.

Key-Words: text classification, sentential frequent itemsets, variable precision rough set model.

1. Introduction

In an effort to keep up with the tremendous growth of the World Wide Web, many research projects have targeted how to organize such information in a way that makes it easier for end users to find the information they want efficiently and accurately. Information on the Web is mostly present in the form of text documents, and that is the reason content-based document management tasks (collectively known as information retrieval, IR) have, in the last 10 years, gained a prominent status in the information systems field. Text classification (TC, also known as text categorization or topic spotting), the activity of labeling natural language texts with thematic categories from a predefined set, is one such task.
TC, which became a major subfield of the information systems discipline in the early 90s, is now being applied in many contexts, ranging from document indexing based on a controlled vocabulary, to document filtering, automated metadata generation, word sense disambiguation, population of hierarchical catalogues of Web resources, and in general any application requiring selective and adaptive document dispatching.

Recent studies in the data mining community proposed new methods for classification employing association rule mining [1,2]. All these current associative classifiers, to the best of our knowledge, exploit document-level co-occurring words, which are groups of words co-occurring frequently in the same document [3,4]: training documents are modeled as transactions where items are words from the document. Frequent words (itemsets) are then mined from such transactions to capture document semantics and generate IF-THEN rules accordingly. However, while the document is the unit representing an entire idea, the basic semantic unit in a document is actually the sentence. Words co-occurring in the same sentence are semantically associated to some degree, and convey more local information than a set of words scattered across several sentences of a document.

Based on the above observations, in this paper we propose a system for text classification built on two key concepts. The first is the document DB model, which treats a sentence rather than a document as the transaction in order to mine sentential frequent itemsets (SFIs) as the features of that document. The second is a variable precision rough set model based method to evaluate each SFI's contribution to the classification.

The system consists of four components:
1. A document restructuring scheme that cleans noisy information in the document and maps the original document into a document DB, in which each sentence is a transaction whose items are the words in the sentence.
2. An SFI generator that uses the Apriori algorithm to mine sentential frequent itemsets, employed as the features of the document, in the training document DBs.
3. A topic template generator that prunes the SFIs and uses the remaining ones to construct topic templates.
4. A classifier that scores each SFI's weight in the test document and in the topic templates using our novel weighting scheme, and measures the similarity between them.

The integration of these four components proved to be of superior performance to traditional text classification methods. Although the whole system performance is quite good, each component could be used independently of the others. The overall system design is illustrated in Fig. 1.

The rest of this paper is organized as follows: Section 2 introduces some preliminary knowledge and states the problem formally. Section 3 presents the steps of data preparation. Section 4 introduces the document DB model and the sentential frequent itemset mining process. Section 5 introduces the SFI pruning method. Section 6 presents our proposed SFI weighting scheme and SFI-based similarity measure. Section 7 discusses the experimental results. Finally, we conclude and discuss future work in the last section.

Figure 1. Text classification system design (components: Training Documents, Document DB Constructor, SFI Miner, Topic Template Generator, Unlabeled Documents, Classifier)

2. Preliminary and Problem Definition

2.1 Text categorization

Text categorization is the task of assigning a Boolean value to each pair $\langle d_j, c_i \rangle \in D \times C$, where $D$ is a domain of documents and $C = \{c_1, \ldots, c_{|C|}\}$ is a set of predefined categories. A value of $T$ assigned to $\langle d_j, c_i \rangle$ indicates a decision to file $d_j$ under $c_i$. More formally, the task is to approximate the unknown target function $\breve{\Phi}: D \times C \to \{T, F\}$ (that describes how documents ought to be classified) by means of a function $\Phi: D \times C \to \{T, F\}$ called the classifier (also known as rule, or hypothesis, or model) such that $\breve{\Phi}$ and $\Phi$ coincide as much as possible.

Most research in text categorization comes from the machine learning and information retrieval communities. Rocchio's algorithm [10] is the classical method in information retrieval, used in routing and filtering documents. Researchers have tackled text categorization in many ways. Classifiers based on probabilistic methods have been proposed, starting with the first presented in the literature by Maron in 1961 and continuing with naïve Bayes [11], which proved to perform well. ID3 and C4.5 are well-known packages whose cores make use of decision trees to build automatic classifiers [12, 13, 14]. K-nearest neighbor (k-NN) is another technique used in text categorization [15]. Another method to construct a text categorization system is by an inductive rule learning method.
This type of classifier is represented by a set of rules in disjunctive normal form that best cover the training set [16, 17, 18]. As reported in [19], the use of bigrams improved text categorization accuracy compared with the use of unigrams. In addition, in the last decades neural networks and support vector machines (SVM) have been used in text categorization and proved to be powerful tools [20, 21, 14].

2.2 Variable precision rough set model

Classification is the core foundation of rough set theory. In Pawlak's rough set model, a classification must be either completely correct or completely wrong; that is, the definitions of the lower and upper approximations are crisp, which makes the model inapplicable to some complicated classifications. Based on the majority inclusion relation, Ziarko [7] presented a generalized rough set model, named the variable precision rough set, to overcome this limitation. Given $X, Y \subseteq U$, we define the inclusion of $X$ in $Y$, denoted $C(X, Y)$, by:

$$C(X, Y) = \begin{cases} |X \cap Y| / |X|, & |X| > 0 \\ 0, & |X| = 0 \end{cases} \qquad (1)$$

$IS = \langle U, A, V, f \rangle$ is the information system under discussion, where $A$ is the set of attributes, $A = \{a_1, a_2, \ldots, a_k\}$, $V$ is the domain of values of $A$, and $f$ is an information function $f: U \times A \to V$. In text classification, $U$ is the text collection and $A$ is the feature set. $V$ is the domain of the weight values of the features in $A$. $R$ is an indiscernibility relation defined on $U$, and $U/R = \{X_1, X_2, \ldots, X_N\}$ is the family of $R$-equivalence classes. $X \subseteq U$ is a subset of interest, and we define its $\alpha$ lower approximation and $\alpha$ upper approximation by:

$$\underline{R}_{\alpha} X = \bigcup \{ X_i \in U/R \mid C(X_i, X) \ge \alpha \}, \qquad \overline{R}_{\alpha} X = \bigcup \{ X_i \in U/R \mid C(X_i, X) > 1 - \alpha \} \qquad (2)$$

Accordingly, the $\alpha$ boundary region of $X$ is defined by:

$$BND_{\alpha} X = \bigcup \{ X_i \in U/R \mid 1 - \alpha < C(X_i, X) < \alpha \} \qquad (3)$$

where $\alpha \in [0.5, 1]$. It is easy to show that this model is equivalent to Pawlak's model when $\alpha = 1$. This generalization smooths the boundary between the lower and upper approximations. In the original rough set model, the classification of the data with respect to the target event uses three regions: the positive region, in which the event occurs with certainty; the negative region, in which the event does not occur with certainty; and a boundary region, in which the event might or might not occur. The variable precision rough set model defines the positive and negative regions as areas where an approximate classification with respect to the target event is possible with an error frequency below some predefined level. In other words, the positive region becomes a region where the event occurred most of the time, and the negative region is the region where the event occurred infrequently.

3. Data Preparation

In our approach, to convert the text of a document into our proposed document DB model, which is introduced in Section 4.1, some preprocessing must be applied to each document. A sentence boundary detector algorithm was developed to locate sentence boundaries in the documents. The algorithm is based on a finite state machine lexical analyzer with heuristic rules for finding the boundaries. About 97 percent of the actual boundaries are correctly detected. The resulting documents contain very accurate sentence separation, with almost negligible noise.
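The paper does not list its heuristic rules, so the following is only a minimal sketch of what a rule-based boundary scan could look like. It is a simple left-to-right pass standing in for the finite state analyzer; the function name `split_sentences`, the abbreviation list, and the three veto rules are all assumptions made for illustration.

```python
# Hypothetical abbreviation list: the paper does not publish its heuristic rules.
ABBREVIATIONS = {"mr", "mrs", "dr", "inc", "corp", "co", "ltd"}

def split_sentences(text):
    """Single left-to-right scan; a terminator ends a sentence only if
    none of the heuristics below vetoes it."""
    sentences, start, n = [], 0, len(text)
    for i, ch in enumerate(text):
        if ch not in ".!?":
            continue
        end = i + 1
        if end == n:                      # terminator at end of text
            break
        if not text[end].isspace():       # e.g. the point in "3.5"
            continue
        tokens = text[start:i].split()
        prev = tokens[-1].lower() if tokens else ""
        if ch == "." and prev in ABBREVIATIONS:
            continue                      # e.g. "Corp." or "Dr."
        rest = text[end:].lstrip()
        if rest and not (rest[0].isupper() or rest[0].isdigit()):
            continue                      # next word is lower-case: no boundary
        sentences.append(text[start:end].strip())
        start = end
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences
```

Each returned string is one candidate transaction for the later steps; a production detector would need a much larger rule set to approach the accuracy reported above.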
Finally, to weed out those words that contribute little to building the classifier and to reduce the high dimensionality of the data, a document cleaning step is performed to remove stop-words that have no significance, and to stem the words using the popular Porter Stemmer algorithm [5]. The subsequent phase consists of discovering SFIs from each document DB.

4. Document Database and SFI Mining

4.1 Document DB Model

A sentence is a grammatical unit that is syntactically independent and has a subject that is expressed or understood. The central meaning of a document is stated by organizing the basic ideas of its sentences. Focusing on mining the local context information in the sentences, we propose a document DB model. In the document database model, a word is viewed as an item, a natural sentence is viewed as a transaction, and a document is viewed as a transaction database. The detailed workflow of constructing the document DB is illustrated in Fig. 2. The work presented here takes a step further toward an efficient way of mining local context information.

Figure 2. Process of document DB construction (stages: Documents, Sentence Segmenter, Stop-word Remover, Stemmer, Encoder, Document DB)
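As a rough illustration of the document DB model, the sketch below maps pre-segmented sentences to transactions and then runs a small level-wise (Apriori-style) count over them to show what frequent itemsets look like at sentence level. It is not the authors' implementation: the stop-word list, the identity `stem` placeholder (the paper uses the Porter stemmer), the names `build_document_db` and `mine_sfis`, and the use of an absolute sentence count as `min_support` are all assumptions.

```python
from itertools import combinations

# Hypothetical stop-word list; the paper does not give its own.
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "is", "by"}

def build_document_db(sentences, stem=lambda w: w):
    """Map each sentence to a transaction: a set of integer item codes."""
    vocab = {}   # word -> item code (the "Encoder" stage of Figure 2)
    db = []
    for sentence in sentences:
        transaction = set()
        for raw in sentence.split():
            word = raw.strip(".,;:!?\"'()").lower()
            if word and word not in STOP_WORDS:
                transaction.add(vocab.setdefault(stem(word), len(vocab)))
        if transaction:
            db.append(frozenset(transaction))
    return db, vocab

def mine_sfis(db, min_support):
    """Level-wise search: an itemset is frequent if at least
    `min_support` sentences (transactions) contain all of its items."""
    def support(itemset):
        return sum(1 for t in db if itemset <= t)

    singles = {frozenset([i]) for t in db for i in t}
    level = {s for s in singles if support(s) >= min_support}
    frequent, k = {}, 1
    while level:
        for s in level:
            frequent[s] = support(s)
        k += 1
        # Join step: unite frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must already be frequent.
        level = {c for c in candidates
                 if all(frozenset(sub) in frequent
                        for sub in combinations(c, k - 1))
                 and support(c) >= min_support}
    return frequent
```

With the three made-up sentences "The bank raised interest rates.", "Interest rates rose again." and "The bank cut staff." and `min_support=2`, the only frequent pair is the one formed by "interest" and "rates", since no other two words share a sentence twice.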
4.2 SFI Mining

After mapping each document to a transaction DB, we employ the Apriori algorithm to extract frequently occurring sets of terms in the sentences of each document and use them as that document's characteristic. Compared to document-level frequent co-occurring words, sentential frequent words convey more local context information. The algorithm is described in more detail in Figure 3. In the algorithm, step (2) generates the frequent 1-itemsets. In steps (3-13), all the k-frequent itemsets are generated and merged with the category in C. The sentence space is reduced in each iteration by eliminating the transactions that do not contain any of the frequent itemsets. This step is done by the FilterTable(S, F) function. The sentential frequent itemsets discovered in this stage of the process are further processed to build the topic templates.

Figure 3. Algorithm: find sentential frequent itemsets in the given document DB

5. Variable Precision Rough Set Model Based SFI Evaluation Method

By merging the SFIs of documents that belong to the same category, we naturally obtain the features of that category. We use these frequent itemsets to construct each category's topic template. Because the number of SFIs associated with a category can be very large, the key problem is how to calculate each SFI's global weight, that is, the SFI's contribution to the classification. We propose a weighting scheme based on the variable precision rough set model to evaluate each SFI's global weight, on the basis of which we select the SFIs for each topic template.

Let $F = \{X_1, X_2, \ldots, X_{|C|}\}$ be a partition of $D$, which is the classification of the training documents according to a set of predefined categories $C = \{c_1, \ldots, c_{|C|}\}$. $R \subseteq A$ is an SFI serving as the condition attribute subset here. According to Pawlak's rough set model, $\underline{R}F$ and $\overline{R}F$ are defined by:

$$\underline{R}F = \{\underline{R}X_1, \underline{R}X_2, \ldots, \underline{R}X_n\}, \qquad \overline{R}F = \{\overline{R}X_1, \overline{R}X_2, \ldots, \overline{R}X_n\} \qquad (4)$$

where $n = |C|$. Correspondingly, in the variable precision rough set model $\underline{R}_{\alpha}F$ and $\overline{R}_{\alpha}F$ are defined by:

$$\underline{R}_{\alpha}F = \{\underline{R}_{\alpha}X_1, \underline{R}_{\alpha}X_2, \ldots, \underline{R}_{\alpha}X_n\}, \qquad \overline{R}_{\alpha}F = \{\overline{R}_{\alpha}X_1, \overline{R}_{\alpha}X_2, \ldots, \overline{R}_{\alpha}X_n\} \qquad (5)$$

A measure was introduced to calculate the imprecision of this classification, named the approximate classification quality and defined by:

$$\gamma_R(F) = \frac{\sum_{i=1}^{n} |\underline{R}_{\alpha}X_i|}{|U|} \qquad (6)$$

The approximate quality denotes the ratio of objects that can be classified into the $F$-equivalence classes with certainty by SFI $R$. In other words, $\gamma_R(F)$ measures the degree of consistency between the classification by $R$ and $F$, which may be interpreted as the contribution to classification that the SFI makes. If $\gamma_R(F)$ is calculated for all SFIs and the values are sorted in ascending order, we can obtain a concise representation of the data by cutting the features whose classification quality value is lower than a user-defined threshold.

6. An SFI-Based Similarity Measure

As mentioned earlier, sentential frequent itemsets convey local context information, which is essential for accurately ranking a document's appropriateness to categories. To this end, we devised a scheme to calculate the weight of an SFI in the test document and in the topic templates; the cosine measure based on these weights is then used to perform the classification. This SFI weighting scheme is a function of three factors: the length $l$ of the SFI, the frequency $f$ of the SFI in the document or the topic template, and the level of significance (global weight) $\gamma$ of the SFI, which is presented in Section 5:

$$w_{ij} = f_{ij} \times l_{ij} \times \gamma_{ij} \qquad (7)$$

The frequency of an SFI is an important factor in the measure. The more frequently the SFI appears in the document or the topic template, the more important the information it conveys. Similarly, the longer the SFI is, the more specific the information it conveys. The similarity of the test document $d_j$ and the topic $c_i$ is calculated with the cosine measure:

$$sim_{SFI}(c_i, d_j) = \frac{\sum_{k=1}^{N} w_{ik} w_{jk}}{\sqrt{\sum_{k=1}^{N} w_{ik}^2 \; \sum_{k=1}^{N} w_{jk}^2}} \qquad (8)$$

6.1 Combining Single-Term and SFI Similarities

If the similarity between a document and a topic is based solely on matching frequent itemsets, with no single terms considered at the same time, related documents could be judged as non-similar if they do not share enough SFIs (a typical case). Shared SFIs provide important local context matching, but sometimes similarity based on SFIs alone is not sufficient. To alleviate this problem, and to produce high-quality classification, we combine a single-term similarity measure with our itemset-based similarity measure. We used the cosine correlation similarity measure [6],[7], with TF-IDF term weights, as the single-term measure. The cosine measure was chosen due to its wide use in the text classification literature, and because it is described as being able to capture human categorization behavior well. TF-IDF weighting is also a widely used term weighting scheme. The combination of the term-based and the SFI-based similarity measures is a weighted average of the two quantities, and is given by (9). The reason for separating single terms and SFIs in the similarity equation, as opposed to treating a single term as a one-word itemset, is to evaluate the blending factor between the two quantities and to see the effect of SFIs on similarity as opposed to single terms.

$$sim(c_i, d_j) = \alpha \cdot sim_{SFI}(c_i, d_j) + (1 - \alpha) \cdot sim_{t}(c_i, d_j) \qquad (9)$$

where $\alpha$ is a value in the interval [0,1] that determines the weight of the SFI similarity measure, or, as we call it, the Similarity Blend Factor. According to the experimental results discussed in Section 7, we found that a value between 0.6 and 0.8 for $\alpha$ results in the maximum improvement in classification quality.

7. Experimental Results

In order to test the effectiveness of the text classification system, we conducted a set of experiments using our proposed document DB model, variable precision rough set model based SFI pruning method, SFI weighting scheme, and similarity measure.

7.1 Text Corpora
Our evaluation experiments were conducted on the well-known Reuters-21578 collection, which is usually split into two parts: a training set for building the classifier and a testing set for evaluating the effectiveness of the system. There are many splits of the Reuters collection; we select the ModApte version. This split leads to a corpus of 12,902 documents consisting of 9,603 training documents and 3,299 testing documents. All these documents belong to 135 topics. However, only 93 topics have more than one document in the training set and 82 topics have fewer than 100 documents [8]. Obviously, performance on the categories with just a few documents would be very low, especially for those that do not even have a document in the training set. Among the documents there are some that have no topic assigned to them. We chose to ignore such documents since no knowledge can be derived from them. Finally, we select the ten categories with the largest number of corresponding training documents to test our system. Because other researchers often employ a similar strategy, we can conveniently compare our experimental results with their work. There are 6488 training documents and 2545 testing documents in these ten retained categories.

7.2 Evaluation Measures

In order to assess the performance of our approach, we adopted some quality measures widely used in the text mining literature for the purpose of text classification. The first two measures are precision and recall. The terms used to express precision and recall are given in the contingency table (Table 1). Estimates of precision with respect to $c_i$ and recall with respect to $c_i$ may thus be obtained as

$$P_i = \frac{TP_i}{TP_i + FP_i} \qquad (10)$$

$$R_i = \frac{TP_i}{TP_i + FN_i} \qquad (11)$$

For obtaining estimates of $P$ and $R$, two different methods are adopted: microaveraging: $P$ and $R$ are obtained by summing over all individual decisions:

$$P^{\mu} = \frac{TP}{TP + FP} = \frac{\sum_{i=1}^{|C|} TP_i}{\sum_{i=1}^{|C|} (TP_i + FP_i)} \qquad (12)$$

$$R^{\mu} = \frac{TP}{TP + FN} = \frac{\sum_{i=1}^{|C|} TP_i}{\sum_{i=1}^{|C|} (TP_i + FN_i)} \qquad (13)$$

where $\mu$ indicates microaveraging.
The global contingency table (Table 2) is thus obtained by summing over the category-specific contingency tables; macroaveraging: precision and recall are first evaluated locally for each category, and then globally by averaging over the results of the different categories:

$$P^{M} = \frac{\sum_{i=1}^{|C|} P_i}{|C|} \qquad (14)$$

$$R^{M} = \frac{\sum_{i=1}^{|C|} R_i}{|C|} \qquad (15)$$

where $M$ indicates macroaveraging. Another measure taken here is the Break-Even Point (BEP), that is, the value at which precision equals recall.

Table 1. The contingency table for category $c_i$

Category c_i                    Expert judgments
                                YES       NO
Classifier judgments   YES      TP_i      FP_i
                       NO       FN_i      TN_i

Table 2. The global contingency table

Category set C = {c_1, ..., c_|C|}    Expert judgments
                                      YES                 NO
Classifier judgments   YES            TP = sum of TP_i    FP = sum of FP_i
                       NO             FN = sum of FN_i    TN = sum of TN_i

7.3 Experimental Results

In order to better understand the effect of the SFI-based similarity measure on classification quality, we carry out a set of experiments on the text corpora mentioned in Section 7.1 and compare the experimental results with the most well-known methods.
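The difference between the two averaging schemes of Section 7.2 is easy to see on a small example. The sketch below assumes each category's contingency table is given as a dictionary of counts; the function names and the two sets of counts are made up for illustration and are not results from the paper.

```python
def precision_recall(tp, fp, fn):
    """Per-category estimates, Equations (10) and (11); 0.0 when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def micro_average(tables):
    """Sum the per-category counts first, then compute P and R once."""
    tp = sum(t["TP"] for t in tables)
    fp = sum(t["FP"] for t in tables)
    fn = sum(t["FN"] for t in tables)
    return precision_recall(tp, fp, fn)

def macro_average(tables):
    """Compute P and R per category first, then average them."""
    pairs = [precision_recall(t["TP"], t["FP"], t["FN"]) for t in tables]
    return (sum(p for p, _ in pairs) / len(pairs),
            sum(r for _, r in pairs) / len(pairs))

# Illustrative counts only: one well-populated category, one sparse one.
tables = [{"TP": 8, "FP": 2, "FN": 2}, {"TP": 1, "FP": 1, "FN": 3}]
micro_p, micro_r = micro_average(tables)   # dominated by the large category
macro_p, macro_r = macro_average(tables)   # both categories count equally
```

On these counts the micro-averaged precision is 9/12 = 0.75 while the macro-averaged precision is (0.8 + 0.5)/2 = 0.65, which shows how sparse categories pull the macro average down.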
Table 3 (the results for the other classification systems are reported as given in [9]) shows a comparison between our classifier and the other well-known methods. The measures used here are the precision/recall break-even point, micro-average, and macro-average on the ten most populated Reuters categories. Our system outperforms most of the conventional methods, although its performance is not very good for three categories, i.e., grain, money-fx, and trade.

Table 3. Precision/recall break-even point on the ten most populated Reuters categories for SFI-BC and the most well-known classifiers

8. Conclusion and Future Work

Text classification is a key text-mining problem, which is useful to a great number of text-based applications. We presented a system composed of four components in an attempt to improve text classification. The first component cleans the data and maps each document to a document DB. The second component uses the Apriori algorithm to mine the sentential frequent itemsets from the document DB and uses them as the features of the corresponding document. The third component is the topic template generator. We propose a variable precision rough set based method to evaluate each SFI's contribution to the classification. The fourth component is the SFI-based similarity measure. By carefully examining the factors affecting the classification, we devised an SFI-based similarity measure that is capable of accurately calculating the similarity between a test document and a topic template. The merit of such a design is that each component can be used independently of the others. However, we are confident that the combination of these components leads to better results. The experimental results show that the SFI-based classifier performs well and that its effectiveness is comparable to most well-known text classifiers.

There are a number of future research directions to extend and improve this work. One direction is to improve the accuracy of SFI-BC. Although the current scheme proved more accurate than traditional methods, there is still room for improvement.
Another direction is to improve the feature selection quality. Other feature selection techniques, such as latent semantic analysis, which could give insight into the discriminative features among classes, may complement our strategy. Although the work presented here is aimed at text classification, it could easily be adapted to Web documents as well. However, it will have to take the semi-structure of Web documents into account. Our intention is to develop a Web document classification system with our approach.

References

[1] W. Li and J. Pei. CMAR: Accurate and efficient classification based on multiple class-association rules. In IEEE International Conference on Data Mining (ICDM'01), San Jose, California, November 29-December
[2] B. Liu, W. Hsu, and Y. Ma. Integrating classification and association rule mining. In ACM Int. Conf. on Knowledge Discovery and Data Mining (SIGKDD'98), pages 80-86, New York City, NY, August 1998.
[3] M. Antonie and O. R. Zaïane. Text Document Categorization by Term Association. In Proc. of IEEE Intl. Conf. on Data Mining, pages 19-26.
[4] D. Meretakis, D. Fragoudis, H. Lu, and S. Likothanassis. Scalable Association-based Text Classification. In Proc. of ACM CIKM, 2000.
[5] M. F. Porter. An Algorithm for Suffix Stripping. Program, vol. 14, no. 3, pp. 130-137, July 1980.
[6] G. Salton, A. Wong, and C. Yang. A Vector Space Model for Automatic Indexing. Comm. ACM, vol. 18, no. 11, Nov. 1975.
[7] G. Salton. Automatic Text Processing: The Transformation, Analysis and Retrieval of Information by Computer. Reading, Mass.: Addison Wesley.
[8] O. R. Zaïane and M.-L. Antonie. Classifying text documents by associating terms with text categories. In Thirteenth Australasian Database Conference (ADC'02), Melbourne, Australia, January
[9] T. Joachims. Text categorization with support vector machines: learning with many relevant features. In 10th European Conference on Machine Learning (ECML-98), pages 137-142, 1998.
[10] D. A. Hull. Improving text retrieval for the routing problem using latent semantic indexing. In 17th ACM International Conference on Machine Learning (ECML-98), pages 137-142, 1998.
[11] D. Lewis. Naïve (Bayes) at forty: The independence assumption in information retrieval. In 10th Conference on Machine Learning (ECML-98), pages 4-15, 1998.
[12] W. Cohen and H. Hirsh. Joins that generalize: text classification using WHIRL. In 4th International Conference on Knowledge Discovery and Data Mining (SIGKDD'98), pages 169-173, New York City, USA, 1998.
[13] W. Cohen and Y. Singer. Context-sensitive learning methods for text categorization. ACM Transactions on Information Systems, 17(2):146-173, 1999.
[14] T. Joachims. Text categorization with support vector machines: learning with many relevant features. In 10th European Conference on Machine Learning (ECML-98), pages 137-142, 1998.
[15] Y. Yang. An evaluation of statistical approaches to text categorization. Technical Report CMU-CS-97-127, Carnegie Mellon University, April 1997.
[16] I. Moulinier and J.-G. Ganascia. Applying an existing machine learning algorithm to text categorization. In S. Wermter, E. Riloff, and G. Scheler, editors, Connectionist, statistical, and symbolic approaches to learning for natural language processing. Springer Verlag, Heidelberg, Germany, 1996. Lecture Notes in Computer Science series, number 1040.
[17] H. Li and K. Yamanishi. Text classification using ESC-based stochastic decision lists. In 8th ACM International Conference on Information and Knowledge Management (CIKM-99), pages 122-130, Kansas City, USA, 1999.
[18] C. Apté, F. Damerau, and S. Weiss. Automated learning of decision rules for text categorization. ACM Transactions on Information Systems, 12(3):232-251, 1994.
[19] C. M. Tan, Y. F. Wang, and C. D. Lee. The use of bigrams to enhance text categorization. Journal of Information Processing and Management.
[20] M. Ruiz and P. Srinivasan. Neural networks for text categorization. In 22nd ACM SIGIR International Conference on Information Retrieval, Berkeley, CA, USA, August 1999.
[21] Y. Yang and X. Liu. A re-examination of text categorization methods. In 22nd ACM International Conference on Research and Development in Information Retrieval (SIGIR-99), pages 42-49, Berkeley, US, 1999.
More informationDocument Representation and Clustering with WordNet Based Similarity Rough Set Model
IJCSI Internatonal Journal of Computer Scence Issues, Vol. 8, Issue 5, No 3, September 20 ISSN (Onlne): 694-084 www.ijcsi.org Document Representaton and Clusterng wth WordNet Based Smlarty Rough Set Model
More informationThe Codesign Challenge
ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.
More informationEnhancement of Infrequent Purchased Product Recommendation Using Data Mining Techniques
Enhancement of Infrequent Purchased Product Recommendaton Usng Data Mnng Technques Noraswalza Abdullah, Yue Xu, Shlomo Geva, and Mark Loo Dscplne of Computer Scence Faculty of Scence and Technology Queensland
More informationA New Approach For the Ranking of Fuzzy Sets With Different Heights
New pproach For the ankng of Fuzzy Sets Wth Dfferent Heghts Pushpnder Sngh School of Mathematcs Computer pplcatons Thapar Unversty, Patala-7 00 Inda pushpndersnl@gmalcom STCT ankng of fuzzy sets plays
More informationA Novel Term_Class Relevance Measure for Text Categorization
A Novel Term_Class Relevance Measure for Text Categorzaton D S Guru, Mahamad Suhl Department of Studes n Computer Scence, Unversty of Mysore, Mysore, Inda Abstract: In ths paper, we ntroduce a new measure
More informationIssues and Empirical Results for Improving Text Classification
Issues and Emprcal Results for Improvng Text Classfcaton Youngoong Ko 1 and Jungyun Seo 2 1 Dept. of Computer Engneerng, Dong-A Unversty, 840 Hadan 2-dong, Saha-gu, Busan, 604-714, Korea yko@dau.ac.kr
More informationMULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION
MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and
More informationCS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 15
CS434a/541a: Pattern Recognton Prof. Olga Veksler Lecture 15 Today New Topc: Unsupervsed Learnng Supervsed vs. unsupervsed learnng Unsupervsed learnng Net Tme: parametrc unsupervsed learnng Today: nonparametrc
More informationSubspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;
Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features
More informationHierarchical Semantic Perceptron Grid based Neural Network CAO Huai-hu, YU Zhen-wei, WANG Yin-yan Abstract Key words 1.
Herarchcal Semantc Perceptron Grd based Neural CAO Hua-hu, YU Zhen-we, WANG Yn-yan (Dept. Computer of Chna Unversty of Mnng and Technology Bejng, Bejng 00083, chna) chhu@cumtb.edu.cn Abstract A herarchcal
More informationWishing you all a Total Quality New Year!
Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma
More informationDescription of NTU Approach to NTCIR3 Multilingual Information Retrieval
Proceedngs of the Thrd NTCIR Workshop Descrpton of NTU Approach to NTCIR3 Multlngual Informaton Retreval Wen-Cheng Ln and Hsn-Hs Chen Department of Computer Scence and Informaton Engneerng Natonal Tawan
More informationCLASSIFICATION OF ULTRASONIC SIGNALS
The 8 th Internatonal Conference of the Slovenan Socety for Non-Destructve Testng»Applcaton of Contemporary Non-Destructve Testng n Engneerng«September -3, 5, Portorož, Slovena, pp. 7-33 CLASSIFICATION
More informationIncremental Learning with Support Vector Machines and Fuzzy Set Theory
The 25th Workshop on Combnatoral Mathematcs and Computaton Theory Incremental Learnng wth Support Vector Machnes and Fuzzy Set Theory Yu-Mng Chuang 1 and Cha-Hwa Ln 2* 1 Department of Computer Scence and
More informationSemantic Image Retrieval Using Region Based Inverted File
Semantc Image Retreval Usng Regon Based Inverted Fle Dengsheng Zhang, Md Monrul Islam, Guoun Lu and Jn Hou 2 Gppsland School of Informaton Technology, Monash Unversty Churchll, VIC 3842, Australa E-mal:
More informationProper Choice of Data Used for the Estimation of Datum Transformation Parameters
Proper Choce of Data Used for the Estmaton of Datum Transformaton Parameters Hakan S. KUTOGLU, Turkey Key words: Coordnate systems; transformaton; estmaton, relablty. SUMMARY Advances n technologes and
More informationCUM: An Efficient Framework for Mining Concept Units
CUM: An Effcent Framework for Mnng Concept Unts P.Santh Thlagam Ananthanarayana V.S Department of Informaton Technology Natonal Insttute of Technology Karnataka - Surathkal Inda 575025 santh_soc@yahoo.co.n,
More informationA User Selection Method in Advertising System
Int. J. Communcatons, etwork and System Scences, 2010, 3, 54-58 do:10.4236/jcns.2010.31007 Publshed Onlne January 2010 (http://www.scrp.org/journal/jcns/). A User Selecton Method n Advertsng System Shy
More informationEdge Detection in Noisy Images Using the Support Vector Machines
Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona
More informationAn Optimal Algorithm for Prufer Codes *
J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,
More informationCAN COMPUTERS LEARN FASTER? Seyda Ertekin Computer Science & Engineering The Pennsylvania State University
CAN COMPUTERS LEARN FASTER? Seyda Ertekn Computer Scence & Engneerng The Pennsylvana State Unversty sertekn@cse.psu.edu ABSTRACT Ever snce computers were nvented, manknd wondered whether they mght be made
More informationA Fast Content-Based Multimedia Retrieval Technique Using Compressed Data
A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,
More informationSyntactic Tree-based Relation Extraction Using a Generalization of Collins and Duffy Convolution Tree Kernel
Syntactc Tree-based Relaton Extracton Usng a Generalzaton of Collns and Duffy Convoluton Tree Kernel Mahdy Khayyaman Seyed Abolghasem Hassan Abolhassan Mrroshandel Sharf Unversty of Technology Sharf Unversty
More informationAn Entropy-Based Approach to Integrated Information Needs Assessment
Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology
More informationBOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET
1 BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET TZU-CHENG CHUANG School of Electrcal and Computer Engneerng, Purdue Unversty, West Lafayette, Indana 47907 SAUL B. GELFAND School
More informationFace Recognition Based on SVM and 2DPCA
Vol. 4, o. 3, September, 2011 Face Recognton Based on SVM and 2DPCA Tha Hoang Le, Len Bu Faculty of Informaton Technology, HCMC Unversty of Scence Faculty of Informaton Scences and Engneerng, Unversty
More informationTN348: Openlab Module - Colocalization
TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages
More informationData Mining: Model Evaluation
Data Mnng: Model Evaluaton Aprl 16, 2013 1 Issues: Evaluatng Classfcaton Methods Accurac classfer accurac: predctng class label predctor accurac: guessng value of predcted attrbutes Speed tme to construct
More informationDeep Classifier: Automatically Categorizing Search Results into Large-Scale Hierarchies
Deep Classfer: Automatcally Categorzng Search Results nto Large-Scale Herarches Dkan Xng 1, Gu-Rong Xue 1, Qang Yang 2, Yong Yu 1 1 Shangha Jao Tong Unversty, Shangha, Chna {xaobao,grxue,yyu}@apex.sjtu.edu.cn
More informationReliable Negative Extracting Based on knn for Learning from Positive and Unlabeled Examples
94 JOURNAL OF COMPUTERS, VOL. 4, NO. 1, JANUARY 2009 Relable Negatve Extractng Based on knn for Learnng from Postve and Unlabeled Examples Bangzuo Zhang College of Computer Scence and Technology, Jln Unversty,
More informationEfficient Text Classification by Weighted Proximal SVM *
Effcent ext Classfcaton by Weghted Proxmal SVM * Dong Zhuang 1, Benyu Zhang, Qang Yang 3, Jun Yan 4, Zheng Chen, Yng Chen 1 1 Computer Scence and Engneerng, Bejng Insttute of echnology, Bejng 100081, Chna
More informationFor instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1)
Secton 1.2 Subsets and the Boolean operatons on sets If every element of the set A s an element of the set B, we say that A s a subset of B, or that A s contaned n B, or that B contans A, and we wrte A
More informationBAYESIAN MULTI-SOURCE DOMAIN ADAPTATION
BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION SHI-LIANG SUN, HONG-LEI SHI Department of Computer Scence and Technology, East Chna Normal Unversty 500 Dongchuan Road, Shangha 200241, P. R. Chna E-MAIL: slsun@cs.ecnu.edu.cn,
More informationType-2 Fuzzy Non-uniform Rational B-spline Model with Type-2 Fuzzy Data
Malaysan Journal of Mathematcal Scences 11(S) Aprl : 35 46 (2017) Specal Issue: The 2nd Internatonal Conference and Workshop on Mathematcal Analyss (ICWOMA 2016) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES
More informationJournal of Chemical and Pharmaceutical Research, 2014, 6(6): Research Article. A selective ensemble classification method on microarray data
Avalable onlne www.ocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(6):2860-2866 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 A selectve ensemble classfcaton method on mcroarray
More informationData Preprocessing Based on Partially Supervised Learning Na Liu1,2, a, Guanglai Gao1,b, Guiping Liu2,c
6th Internatonal Conference on Informaton Engneerng for Mechancs and Materals (ICIMM 2016) Data Preprocessng Based on Partally Supervsed Learnng Na Lu1,2, a, Guangla Gao1,b, Gupng Lu2,c 1 College of Computer
More informationMachine Learning. Topic 6: Clustering
Machne Learnng Topc 6: lusterng lusterng Groupng data nto (hopefully useful) sets. Thngs on the left Thngs on the rght Applcatons of lusterng Hypothess Generaton lusters mght suggest natural groups. Hypothess
More informationAlignment Results of SOBOM for OAEI 2010
Algnment Results of SOBOM for OAEI 2010 Pegang Xu, Yadong Wang, Lang Cheng, Tany Zang School of Computer Scence and Technology Harbn Insttute of Technology, Harbn, Chna pegang.xu@gmal.com, ydwang@ht.edu.cn,
More informationX- Chart Using ANOM Approach
ISSN 1684-8403 Journal of Statstcs Volume 17, 010, pp. 3-3 Abstract X- Chart Usng ANOM Approach Gullapall Chakravarth 1 and Chaluvad Venkateswara Rao Control lmts for ndvdual measurements (X) chart are
More informationLearning-Based Top-N Selection Query Evaluation over Relational Databases
Learnng-Based Top-N Selecton Query Evaluaton over Relatonal Databases Lang Zhu *, Wey Meng ** * School of Mathematcs and Computer Scence, Hebe Unversty, Baodng, Hebe 071002, Chna, zhu@mal.hbu.edu.cn **
More informationOptimizing Document Scoring for Query Retrieval
Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng
More informationMeta-heuristics for Multidimensional Knapsack Problems
2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,
More informationMachine Learning 9. week
Machne Learnng 9. week Mappng Concept Radal Bass Functons (RBF) RBF Networks 1 Mappng It s probably the best scenaro for the classfcaton of two dataset s to separate them lnearly. As you see n the below
More informationBioTechnology. An Indian Journal FULL PAPER. Trade Science Inc.
[Type text] [Type text] [Type text] ISSN : 0974-74 Volume 0 Issue BoTechnology 04 An Indan Journal FULL PAPER BTAIJ 0() 04 [684-689] Revew on Chna s sports ndustry fnancng market based on market -orented
More informationCompiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz
Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster
More informationSelecting Query Term Alterations for Web Search by Exploiting Query Contexts
Selectng Query Term Alteratons for Web Search by Explotng Query Contexts Guhong Cao Stephen Robertson Jan-Yun Ne Dept. of Computer Scence and Operatons Research Mcrosoft Research at Cambrdge Dept. of Computer
More informationUser Authentication Based On Behavioral Mouse Dynamics Biometrics
User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA
More informationImpact of a New Attribute Extraction Algorithm on Web Page Classification
Impact of a New Attrbute Extracton Algorthm on Web Page Classfcaton Gösel Brc, Banu Dr, Yldz Techncal Unversty, Computer Engneerng Department Abstract Ths paper ntroduces a new algorthm for dmensonalty
More informationA Fast Visual Tracking Algorithm Based on Circle Pixels Matching
A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng
More informationHelsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)
Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute
More informationAnnouncements. Supervised Learning
Announcements See Chapter 5 of Duda, Hart, and Stork. Tutoral by Burge lnked to on web page. Supervsed Learnng Classfcaton wth labeled eamples. Images vectors n hgh-d space. Supervsed Learnng Labeled eamples
More informationAutomatic Text Categorization of Mathematical Word Problems
Automatc Text Categorzaton of Mathematcal Word Problems Suleyman Cetntas 1, Luo S 2, Yan Png Xn 3, Dake Zhang 3, Joo Young Park 3 1,2 Department of Computer Scence, 2 Department of Statstcs, 3 Department
More informationUsing Neural Networks and Support Vector Machines in Data Mining
Usng eural etworks and Support Vector Machnes n Data Mnng RICHARD A. WASIOWSKI Computer Scence Department Calforna State Unversty Domnguez Hlls Carson, CA 90747 USA Abstract: - Multvarate data analyss
More informationMachine Learning. Support Vector Machines. (contains material adapted from talks by Constantin F. Aliferis & Ioannis Tsamardinos, and Martin Law)
Machne Learnng Support Vector Machnes (contans materal adapted from talks by Constantn F. Alfers & Ioanns Tsamardnos, and Martn Law) Bryan Pardo, Machne Learnng: EECS 349 Fall 2014 Support Vector Machnes
More informationRelevance Feedback Document Retrieval using Non-Relevant Documents
Relevance Feedback Document Retreval usng Non-Relevant Documents TAKASHI ONODA, HIROSHI MURATA and SEIJI YAMADA Ths paper reports a new document retreval method usng non-relevant documents. From a large
More informationDecision Strategies for Rating Objects in Knowledge-Shared Research Networks
Decson Strateges for Ratng Objects n Knowledge-Shared Research etwors ALEXADRA GRACHAROVA *, HAS-JOACHM ER **, HASSA OUR ELD ** OM SUUROE ***, HARR ARAKSE *** * nsttute of Control and System Research,
More informationA Method of Hot Topic Detection in Blogs Using N-gram Model
84 JOURNAL OF SOFTWARE, VOL. 8, NO., JANUARY 203 A Method of Hot Topc Detecton n Blogs Usng N-gram Model Xaodong Wang College of Computer and Informaton Technology, Henan Normal Unversty, Xnxang, Chna
More informationImplementation Naïve Bayes Algorithm for Student Classification Based on Graduation Status
Internatonal Journal of Appled Busness and Informaton Systems ISSN: 2597-8993 Vol 1, No 2, September 2017, pp. 6-12 6 Implementaton Naïve Bayes Algorthm for Student Classfcaton Based on Graduaton Status
More informationThe Shortest Path of Touring Lines given in the Plane
Send Orders for Reprnts to reprnts@benthamscence.ae 262 The Open Cybernetcs & Systemcs Journal, 2015, 9, 262-267 The Shortest Path of Tourng Lnes gven n the Plane Open Access Ljuan Wang 1,2, Dandan He
More informationOnline Text Mining System based on M2VSM
FR-E2-1 SCIS & ISIS 2008 Onlne Text Mnng System based on M2VSM Yasufum Takama 1, Takash Okada 1, Toru Ishbash 2 1. Tokyo Metropoltan Unversty, 2. Tokyo Metropoltan Insttute of Technology 6-6 Asahgaoka,
More informationKeyword-based Document Clustering
Keyword-based ocument lusterng Seung-Shk Kang School of omputer Scence Kookmn Unversty & AIrc hungnung-dong Songbuk-gu Seoul 36-72 Korea sskang@kookmn.ac.kr Abstract ocument clusterng s an aggregaton of
More informationCross-Language Information Retrieval
Feature Artcle: Cross-Language Informaton Retreval 19 Cross-Language Informaton Retreval Jan-Yun Ne 1 Abstract A research group n Unversty of Montreal has worked on the problem of cross-language nformaton
More informationNetwork Intrusion Detection Based on PSO-SVM
TELKOMNIKA Indonesan Journal of Electrcal Engneerng Vol.1, No., February 014, pp. 150 ~ 1508 DOI: http://dx.do.org/10.11591/telkomnka.v1.386 150 Network Intruson Detecton Based on PSO-SVM Changsheng Xang*
More informationFuzzy Weighted Association Rule Mining with Weighted Support and Confidence Framework
Fuzzy Weghted Assocaton Rule Mnng wth Weghted Support and Confdence Framework M. Sulaman Khan, Maybn Muyeba, Frans Coenen 2 Lverpool Hope Unversty, School of Computng, Lverpool, UK 2 The Unversty of Lverpool,
More informationA CALCULATION METHOD OF DEEP WEB ENTITIES RECOGNITION
A CALCULATION METHOD OF DEEP WEB ENTITIES RECOGNITION 1 FENG YONG, DANG XIAO-WAN, 3 XU HONG-YAN School of Informaton, Laonng Unversty, Shenyang Laonng E-mal: 1 fyxuhy@163.com, dangxaowan@163.com, 3 xuhongyan_lndx@163.com
More informationA Multi-step Strategy for Shape Similarity Search In Kamon Image Database
A Mult-step Strategy for Shape Smlarty Search In Kamon Image Database Paul W.H. Kwan, Kazuo Torach 2, Kesuke Kameyama 2, Junbn Gao 3, Nobuyuk Otsu 4 School of Mathematcs, Statstcs and Computer Scence,
More information