Online Text Mining System based on M2VSM
FR-E2-1, SCIS & ISIS 2008

Yasufumi Takama (1), Takashi Okada (1), Toru Ishibashi (2)
1. Tokyo Metropolitan University, 2. Tokyo Metropolitan Institute of Technology
6-6 Asahigaoka, Hino, Tokyo, Japan
email: ytakama@sd.tmu.ac.jp

Abstract

This paper proposes an online text mining system developed on the basis of M2VSM (Meta keyword-based Modified VSM). When the conventional vector space model (VSM) is applied to document clustering, it is difficult to adjust the granularity of clusters in terms of topic. To solve this problem, M2VSM is proposed as an extended VSM that treats meta keywords, such as adjectives and adverbs, as additional values of indexing terms. The similarity between documents is calculated by considering the matching of meta keywords for each index term, which makes it possible to cluster documents with various granularities in terms of topic. The online text mining system is developed with MUSASHI, one of the most popular open source data mining tools. With the system, users can perform a series of text mining processes online, including preprocessing, feature selection, clustering, and visualization of results. Experimental results show that the clustering results of M2VSM match those of test subjects in both rough and detailed clustering. It is also shown that the system can process a database containing 5,000 documents within 7 minutes.

I. INTRODUCTION

Huge databases have become easy to find on the Web in recent years, because of breakthroughs in information acquisition techniques and the dramatic fall in the price of mass storage devices. The volume of such databases is already beyond the human ability to process information, and intelligent support by information technologies, including information retrieval and data mining, is required. Various kinds of data mining and information retrieval techniques have been developed based on the vector space model (VSM) because of its several advantages. One of them is the ability to rank documents in order of how well they are expected to match a user's query.
However, with conventional VSM it is difficult to adjust the granularity of clusters in terms of topic. For example, when VSM is applied to a document database of a specific field such as medicine, the documents tend to form dense clusters in the vector space because of the high similarity between them; therefore performance decreases [3]. Employing additional words as index terms is one of the usual solutions, because increasing the number of dimensions by increasing the number of index terms can make the vector space sparse. However, this sometimes leads to problems such as the curse of dimensionality, which prevents the relationships between documents from being expressed accurately. Furthermore, clusters found in such a sparse space tend not to have corresponding topics, which makes them hard for humans to interpret.

To solve the above-mentioned problems, M2VSM (Meta keyword-based Modified VSM) has been proposed as an extension of conventional VSM [3, 4]. M2VSM uses meta keywords, such as adjectives and adverbs, as additional values of indexing terms, and the similarity between documents is calculated by considering the matching of meta keywords for each index term.

This paper proposes a text mining system developed on the basis of M2VSM. It is designed for analyzing large volumes of documents, from preprocessing, such as the selection of index terms and meta keywords, to document clustering. It is developed with MUSASHI, one of the most popular open source data mining tools. With the system, users can perform a series of text mining processes online, including preprocessing, feature selection, clustering, and visualization of results. Experimental results show that M2VSM can generate clusters that match those generated by test subjects in both rough and detailed clustering. It is also shown that the developed system can analyze 5,000 documents within 400 seconds, which means it is suitable for practical use in terms of processing speed.

II. M2VSM

A. Vector Space Model

The VSM has been widely used in the traditional information retrieval field.
The VSM creates a multi-dimensional space in which both documents and queries are represented by vectors. For a fixed collection of documents, an $N_w$-dimensional vector is generated for each document and query from sets of terms with associated weights, where $N_w$ is the number of indexing terms in the document collection. The similarity between documents, including the query, is then calculated by the cosine measure. In VSM, the weight $w$ associated with term $t$ in document $D$ is often calculated by the TFIDF (Term Frequency Inverse Document Frequency) measure [11], which is given by Eq. (1),

$$\mathrm{TFIDF}(t, D) = \frac{m}{M} \log \frac{N_D}{DF(t)}, \quad (1)$$

where $m$ represents the number of occurrences (frequency) of term $t$ in document $D$, $M$ represents the total frequency of indexing terms in $D$, $N_D$ is the total number of documents, and $DF(t)$ is the number of documents containing $t$. The similarity $sim(D_i, D_j)$ between documents $D_i$ and $D_j$ is defined as the cosine of the document vectors (Eq. (2)).
$$sim(D_i, D_j) = \frac{\sum_{n=1}^{N_w} w_{in} w_{jn}}{|D_i|\,|D_j|}, \quad (2)$$

where $w_{in}$ is the weight of the $n$-th indexing term $t_n$ in $D_i$.

B. Outline of M2VSM

As mentioned in Section I, when the conventional VSM is applied to document clustering, it is difficult to adjust the granularity of clusters in terms of topic. In particular, VSM is not good at dividing documents by detailed topic. Therefore, when it is applied to a database in a specific field such as medicine, the documents can crowd together in the vector space [3, 4]. Furthermore, even when hierarchical clustering such as AHC [1] is employed, the obtained hierarchy does not always correspond to a topical hierarchy like that of Web directory services. One cause of this problem is the existence of indexing terms that appear in many documents because they have general meanings in the field. Therefore, increasing the number of indexing terms not only fails to resolve the problem but, at worst, also causes the curse of dimensionality.

M2VSM assumes that if the same indexing term has different meta keywords (adjectives, adverbs, etc.) as its modifiers in different documents, the documents refer to different topics. In other words, index terms on their own represent a topic in a general sense, whereas index terms combined with meta keywords represent a topic in more detail. That is, meta keywords give additional value to indexing terms. Given the collection of meta keywords $S_M$, we define the similarity by Eqs. (3)-(5),

$$sim(D_i, D_j) = \frac{\sum_{n=1}^{N_w} \alpha_n w_{in} w_{jn}}{|D_i|_\alpha\,|D_j|_\alpha}, \quad (3)$$

$$|D_i|_\alpha = \sqrt{\sum_{n=1}^{N_w} \alpha_n w_{in}^2}, \quad (4)$$

$$\alpha_n = \alpha^{k_n}, \quad k_n = |ME_{in} \cap ME_{jn}|, \quad (5)$$

where $ME_{in}$ (a subset of $S_M$) represents the set of meta keywords of indexing term $t_n$ in document $D_i$, and $\alpha_n$ reflects the degree to which meta keywords of $t_n$ co-occur in $D_i$ and $D_j$ in the document similarity calculation. The parameter $\alpha$ (> 1) adjusts the effect of meta keywords and is set to 3 in the experiments of this paper. In the previous study [3, 4], the range of $\alpha_n$ was defined as [0, 1], which means that a meta keyword appearing in only one of the documents (i.e., either $D_i$ or $D_j$) decreases the similarity.
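As a minimal sketch of how Eqs. (1)-(5) could be computed, the following Python functions take plain lists of term weights and, for M2VSM, one set of meta keywords per indexing term. The function names, the list-based data layout, and the use of the natural logarithm are assumptions made for illustration; the paper does not specify them.

```python
import math


def tfidf(m, M, n_docs, df):
    """Eq. (1): (m / M) * log(N_D / DF(t)). Natural log is an assumption."""
    return (m / M) * math.log(n_docs / df)


def cosine_similarity(w_i, w_j):
    """Eq. (2): cosine of two weight vectors of equal length N_w."""
    dot = sum(a * b for a, b in zip(w_i, w_j))
    norm_i = math.sqrt(sum(a * a for a in w_i))
    norm_j = math.sqrt(sum(b * b for b in w_j))
    if norm_i == 0.0 or norm_j == 0.0:
        return 0.0
    return dot / (norm_i * norm_j)


def m2vsm_similarity(w_i, w_j, me_i, me_j, alpha=3.0):
    """Eqs. (3)-(5): cosine-like similarity weighted by shared meta keywords.

    me_i[n] and me_j[n] are the sets of meta keywords attached to the
    n-th indexing term in each document (hypothetical layout).
    """
    # Eq. (5): alpha_n = alpha ** k_n, k_n = number of shared meta keywords.
    alphas = [alpha ** len(a & b) for a, b in zip(me_i, me_j)]
    # Eq. (3) numerator and Eq. (4) norms.
    dot = sum(a * x * y for a, x, y in zip(alphas, w_i, w_j))
    norm_i = math.sqrt(sum(a * x * x for a, x in zip(alphas, w_i)))
    norm_j = math.sqrt(sum(a * y * y for a, y in zip(alphas, w_j)))
    if norm_i == 0.0 or norm_j == 0.0:
        return 0.0
    return dot / (norm_i * norm_j)
```

When no meta keywords are shared, every $\alpha_n$ equals 1 and the result coincides with the ordinary cosine similarity; a shared meta keyword on a term that both documents contain raises the weight of that term in the comparison.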
Compared with the previous study, $\alpha_n$ ($\geq 1$) reflects the influence of meta keywords more positively in the similarity calculation.

C. Selection of Meta Keyword

In Sec. II-B, adjectives and adverbs are referred to as meta keywords. More specifically, we select meta keywords from adjectives, adverbs, adnominal nouns, and adjective verbs. These parts of speech are used for describing the characteristics or state of a target object, sentiment, emotion, etc., which makes them suitable as meta keywords. This paper focuses on processing documents written in Japanese. Basically, a meta keyword of an index term t is defined as an adjective, adverb, adnominal noun, or adjective verb that has a modification relation with t. In this paper, nouns are used as index terms unless they are used as adnominal nouns. Note that the text mining system in Sec. III allows the parts of speech for index terms and meta keywords to be specified interactively. To identify index terms and meta keywords in Japanese documents, this paper employs the Japanese dependency parser CaboCha [6].

III. TEXT MINING SYSTEM BASED ON M2VSM

Fig. 1 shows the architecture of the developed text mining system based on M2VSM. The current version of the system can handle only Japanese documents. It consists of four processing components: preprocessing, index term / meta keyword selection, clustering, and visualization. Given a set of documents to be analyzed, the preprocessing component performs morphological analysis, syntactic and dependency parsing, removal of words whose part of speech is not used for indexing terms or meta keywords, and rejoining of words that are excessively segmented. The result is stored in a database in order to speed up the subsequent processing.

[Figure 1: diagram with the elements Document Set Selection, Method Selection, Preprocessing, Indexing / Meta-keyword Selection, Clustering, Data, Results, Visualization, and Cluster Selection]

Fig. 1.
System architecture of text mining system based on M2VSM

In the next step, a set of index terms and a set of meta keywords, which are used for similarity calculation by M2VSM, are selected. This step is performed interactively with the help of a user. In the third step, document clusters are generated from the target document set based on the document-document similarity calculated by M2VSM. The system employs single-pass clustering [8, 9] in order to process a large number of documents within a reasonable time. When performing clustering, one of three similarity measures can be applied: the single-linkage, complete-linkage, and average-linkage methods. It is also possible to calculate the similarity based on ordinary VSM. The clustering result is presented to a user either in table format or by information visualization [5, 10].
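The paper does not give the details of its single-pass clustering, so the following Python sketch shows only one common reading of the technique: documents are visited once in order, each is compared with the existing clusters through the chosen linkage, and it either joins the most similar cluster or starts a new one. The function name, the `similarity` callback, and the tie-breaking rule are assumptions for illustration.

```python
from statistics import mean

# Linkage = how pairwise similarities to cluster members are combined.
LINKAGE = {"single": max, "complete": min, "average": mean}


def single_pass_clustering(docs, similarity, threshold, linkage="average"):
    """Assign each document, in one pass, to the most similar cluster.

    docs       -- list of document representations
    similarity -- function (doc_a, doc_b) -> float, e.g. an M2VSM similarity
    threshold  -- minimum document-to-cluster similarity needed to join
    Returns a list of clusters, each a list of indices into docs.
    """
    combine = LINKAGE[linkage]
    clusters = []
    for idx, doc in enumerate(docs):
        best_cluster, best_score = None, threshold
        for cluster in clusters:
            score = combine([similarity(doc, docs[m]) for m in cluster])
            if score >= best_score:
                best_cluster, best_score = cluster, score
        if best_cluster is None:
            clusters.append([idx])  # no cluster is similar enough
        else:
            best_cluster.append(idx)
    return clusters
```

A low threshold yields a few broad clusters and a high threshold yields many narrow ones, which corresponds to the rough and detailed clustering discussed in the paper. Because each document is processed once, the result depends on the order of the documents.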
The system is implemented using MUSASHI [2, 7, 12], a well-known open source data mining tool. MUSASHI provides a set of commands for processing vast amounts of data, as shown in Table 1. Using MUSASHI is expected to make it possible to develop a stable and efficient system in a relatively short development time.

Table 1. Part of command set of MUSASHI

Command          Brief description
xtagg            Aggregation of records
xtbar            Generation of bar graph (SVG format)
xtcat            Merge of multiple XML tables
xtcomb           Calculation of combination
xtcount          Counting the number of rows
xtcut            Selection of items
xtcorrelation    Calculation of correlation coefficient

A. Selection of index terms / meta keywords

In the developed system, a user can interactively select the index terms and meta keywords to be used for the analysis. First, the user selects index terms, and then selects meta keywords from the remaining words. When selecting index terms, words are filtered based on the parts of speech specified by the user. The result is further filtered by specifying minimum and maximum df values (DF(t) in Eq. (1)). To help the user specify appropriate df values, the histogram of df values is presented to the user, as shown in Fig. 2. Finally, the user can examine each of the words obtained by those filtering processes and remove unwanted words.

[Figure 2: histogram annotated "Filtering by DF values"; axis labels: DF, # of index terms]

Fig. 2. Histogram of DF values

The selection of meta keywords is performed in a similar way, except that typical index terms are presented for each candidate meta keyword in the last step. A user can select meta keywords by examining their relations with the corresponding index terms.

B. Output of clustering results

The clustering result is presented to a user in two formats: a table format and visualization by Keyword Map [5]. The table format is generated as HTML files, which a user can view with an ordinary HTML browser. There are three types of tables: a table comparing the clustering results obtained with different thresholds, a table showing the summary of a clustering result, and a table showing the detail of a cluster. Because one of the advantages of M2VSM is that it can perform both rough and detailed clustering, as discussed in Sec. II-B, the developed system can perform several clustering processes with different thresholds in the same trial. The first table shows the summary of these clustering results. For each clustering result, it presents the threshold used for clustering, the number of obtained clusters, and the number of documents in each cluster. When the user selects one or more interesting clustering results, a summary of the results is shown in the second table. It contains, for each cluster, the number of documents, the numbers of index terms and meta keywords, and the frequencies of index terms and meta keywords.

When the user selects one or more interesting clusters, detailed information about the clusters is shown in the third table. It contains typical index terms together with the corresponding meta keywords, and typical documents for the selected clusters. Typical index terms and meta keywords are selected based on their frequencies, and up to 5 documents close to the centroid of a cluster are selected as typical documents of the cluster. Other detailed information about a cluster, such as the relationships between index terms, between index terms and meta keywords, between meta keywords, and between cluster centroids and index terms, is also displayed by Keyword Map, as shown in Fig. 3. Keyword Map treats each index term, meta keyword, and cluster centroid as a node, which is arranged according to its relationships with other nodes so that related nodes form a cluster on the map. Keyword Map employs the spring model for drawing a map.

Fig. 3. Analyzed result presented by Keyword Map

IV. EXPERIMENTS

A. Performance of M2VSM

Experiments are performed with three document sets written in Japanese. The purpose of the experiments is to show the effectiveness of the proposed M2VSM against conventional VSM in document clustering at two levels: clustering by general topic (rough clustering) and by detailed topic (detailed
clustering). For that purpose, the clustering results of M2VSM, VSM, PVSM (Phrase-based VSM), and test subjects are compared. PVSM is a simple extension of VSM in which a phrase (a combination of a noun and its meta keywords) is used as an index term instead of an independent word. PVSM is expected to generate clusters corresponding to more detailed topics than normal VSM.

The documents used for the experiments are editorial articles of 7 Japanese newspaper companies: Asahi Shimbun, Yomiuri Shimbun, Nikkei Shimbun, Kobe Shimbun, Chugoku Shimbun, Hokkaido Shimbun, and Kahoku Shimpo. The total number of articles, which were collected from June 1, 2005 to December 29, 2005, is 2,298. To reduce the burden on test subjects, we first applied M2VSM, VSM, and PVSM to the collected document sets and found small subsets of documents that belong to the same cluster under any of the 3 methods. By adding some noise articles to those subsets, we obtained the 3 document sets A, B, and C, each of which contains 20 documents. That is, each document set forms a single cluster under a general topic, but is divided into several clusters under specific topics. In this process, the single-linkage method is applied and the same threshold is used for the 3 methods.

The topics of the document sets are as follows. Here, RC (rough cluster) means the topic of the entire document set, and DC (detailed clusters) means the topics of the clusters when a document set is divided in detail.
- Set A: (RC) international issues, (DC) six-party talks, postwar period, Iran
- Set B: (RC) North Korea, (DC) Japan-North Korea talks, six-party talks
- Set C: (RC) IT, (DC) Rakuten-TBS problem, media and the Internet

Ten test subjects are asked to cluster each of the 3 document sets at two levels. First, they are asked to roughly divide the documents in terms of topic. Then, the obtained clusters are further divided into clusters in terms of more detailed topic. There is no constraint on the number of clusters at each level.
The clustering results of M2VSM, VSM, and PVSM and those of the test subjects are compared with the following measure,

$$Match(method, subject_i, D) = \frac{d_p(method, subject_i)}{{}_{|D|}C_2}, \quad (6)$$

where method indicates either M2VSM, VSM, or PVSM, and $subject_i$ indicates a subject (i = 1, ..., 10). $D$ is a document set (A, B, or C), and $d_p(method, subject_i)$ is the number of document pairs that are clustered in the same way by both the method and the subject. For example, consider the case where a document set contains 3 documents {1, 2, 3}, a method divides it as {1, 2} and {3}, and a subject divides it as {1, 2, 3}. In this case, the total number of document pairs (i.e., the denominator in Eq. (6)) is 3 (1-2, 1-3, and 2-3). Only the pair 1-2 is clustered in the same way (i.e., belongs to the same cluster) by both of them, so the matching score is 1/3 = 0.33. When the clustering results of a method and a subject are completely the same, Eq. (6) is equal to 1.

Each method is compared with the 10 test subjects using Eq. (6). Tables 2, 3, and 4 summarize the comparison results for document sets A, B, and C, respectively. In these tables, the left column (RC) for each method shows the result for rough clustering, and the right column (DC) shows the result for detailed clustering. That is, the single-linkage method is applied to the document set, and rough clustering is performed by cutting the obtained dendrogram with a low similarity threshold, whereas detailed clustering is performed with a high similarity threshold. Thresholds are determined for each method so that the average score (Eq. (6)) over the test subjects is as high as possible. In the tables, NUM shows the number of clusters generated by each method, TH is the threshold used, and Avg, Max, and Min are the average, maximum, and minimum matching scores over the 10 test subjects, respectively (the number in parentheses is the rank among the three methods).

Table 2. Experimental results for set A (rows: NUM, TH, Avg, Max, Min)

Table 3. Experimental results for set B (rows: NUM, TH, Avg, Max, Min)

It can be seen from the tables that M2VSM and VSM obtain the same results for all data sets in the case of rough clustering.
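As an illustration of the matching score in Eq. (6), the Python sketch below counts the document pairs on which two clusterings agree. It assumes that "clustered in the same way" covers both pairs placed together by both clusterings and pairs separated by both, which is consistent with the score being 1 for identical results; the function name and the list-of-lists input format are hypothetical.

```python
from itertools import combinations


def match_score(method_clusters, subject_clusters):
    """Eq. (6): fraction of document pairs treated alike by two clusterings.

    Each argument is a list of clusters; each cluster is a list of
    document ids. Both clusterings must cover the same documents.
    """
    def labels(clusters):
        return {doc: k for k, cluster in enumerate(clusters) for doc in cluster}

    a, b = labels(method_clusters), labels(subject_clusters)
    if set(a) != set(b):
        raise ValueError("clusterings must cover the same documents")
    pairs = list(combinations(sorted(a), 2))
    if not pairs:
        raise ValueError("at least two documents are required")
    # A pair agrees when both clusterings put it together or both split it.
    agree = sum((a[x] == a[y]) == (b[x] == b[y]) for x, y in pairs)
    return agree / len(pairs)


# Worked example from the text: {1, 2}, {3} versus {1, 2, 3} gives 1/3.
example = match_score([[1, 2], [3]], [[1, 2, 3]])
```

For a document set of 20 documents the denominator is 190 pairs, so the score of one method against one subject is the number of agreeing pairs divided by 190.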
However, when detailed clustering is performed, the performance of VSM is lower than that of M2VSM. PVSM tends to outperform VSM when detailed clustering is performed, but its performance tends to be worse than that of the other 2 methods in rough clustering. Most importantly, the proposed M2VSM obtains the best results for all 3 data sets, in both rough and detailed clustering. These results show M2VSM is
capable of adjusting the granularity of clusters in terms of topic.

Table 4. Experimental results for set C (rows: NUM, TH, Avg, Max, Min)

B. Evaluation of M2VSM-based Text Mining System

The performance of the developed text mining system is evaluated in terms of processing time. The documents used for the experiments are editorial articles of the same 7 Japanese newspaper companies. The total number of articles is 5,672, which were collected from May 1, 2005 to Jan 31, [...]. Table 5 shows the parameter values used and the specification of the machine on which the system was run. The processing time throughout the analysis process, i.e., from preprocessing to document clustering, is measured while varying the number of documents from 200 to 5,000. Note that the time spent on the user's interaction with the system is omitted from the processing time. Fig. 3 shows the relationship between processing time and the number of documents. It can be seen that the developed system can process 5,000 documents within 400 seconds.

Table 5. Parameters used for experiment

Parameter                   Value
# of index terms            2,[...]
# of meta keywords          [...]
Clustering method           Average linkage
Threshold for clustering    0.7
CPU                         Pentium [...] GHz
Cache size                  512 KB
Memory                      755 MB
Swap                        1.5 GB

[Figure: processing time, Time (s), plotted against # of documents]

Fig. 3. Relationship between the number of documents and processing time

V. CONCLUSIONS

M2VSM, a modified VSM based on meta keywords, is proposed for clustering documents with various granularities in terms of topic. M2VSM uses adjectives, adverbs, adnominal nouns, and adjective verbs as meta keywords for index terms, and calculates document similarity while considering the effect of meta keywords. Experiments are performed with sets of editorial articles from Japanese newspapers, and the results show that the obtained clustering results correspond to those of test subjects in both rough and detailed clustering. A text mining system is developed based on M2VSM, and the experimental result shows that it has sufficient processing speed for practical use. In future work, we are going to provide the system to experts in a specific domain, such as management engineering.
Although only documents written in Japanese are considered in this paper, M2VSM itself can be applied to documents written in other languages, such as English. Because preprocessing, i.e., meta keyword extraction, differs from language to language, it should be studied for each language. Our future work includes the application of M2VSM to English documents. In the experiments reported in this paper, clustering at only two levels, rough and detailed, is performed. Applying M2VSM to the generation of a multi-level (more than 2 levels) topical structure, such as that of Web directory services, is also a challenging task.

REFERENCES

[1] S. Chakrabarti, "Chapter 4: Similarity and Clustering," Mining the Web, Morgan Kaufmann.
[2] Y. Hamuro, N. Katoh, K. Yada, "MUSASHI: Flexible and Efficient Data Preprocessing Tool for KDD based on XML," Proceedings of the First International Workshop on Data Cleaning and Preprocessing, pp. 38-49.
[3] T. Ishibashi and Y. Takama, "Proposal of M2VSM and Its Comparison with Conventional VSM," AM2004, Vol. ICS-128, pp. 1-6.
[4] T. Ishibashi and Y. Takama, "Proposal of M2VSM for Information Retrieval in the Specific Field," SCIS&ISIS2004, THP-3-3.
[5] T. Kajinami and Y. Takama, "Interactive Keyword Map Equipped with Keywords Arrangement Support Functions for Emphasizing User's Intention," Trans. Information Processing Society of Japan, Vol. 48, No. 3, 2007 (written in Japanese).
[6] T. Kudo and Y. Matsumoto, "Japanese dependency analysis using cascaded chunking," Proc. of 6th Conference on Natural Language Learning, Vol. 20, pp. 1-7.
[7] MUSASHI: Mining Utilities and System Architecture for Scalable processing of HIstorical data.
[8] R. Papka and J. Allan, "On-line New Event Detection Using Single-pass Clustering," UMASS Computer Science Technical Report, UM-CS.
[9] M. Spitters and W. Kraaij, "TNO at TDT2001: Language Model-Based Topic Detection," Topic Detection and Tracking (TDT) Workshop 2001.
[10] Y. Takama, T. Kajinami, and A. Matsumura, "Application of Keyword Map-based Relevance Feedback to Interactive Blog Search," AMT 2005.
[11] T. Joachims, "A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization," Proceedings of the 14th International Conference on Machine Learning.
[12] K. Yada, Y. Hamuro, N. Katoh, T. Washio, I. Fusamoto, "Data Mining Oriented CRM System Based on MUSASHI: C-MUSASHI," Proceedings of Second International Workshop on Active Mining, pp. 52-61.
1 THE PUBLISHING HOUSE PROCEEDINGS OF THE ROMANIAN ACADEMY, Seres A, OF THE ROMANIAN ACADEMY Volume 4, Number 2/2003, pp.000-000 A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION Tudor BARBU Insttute
More informationThe Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique
//00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy
More informationSteps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices
Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between
More informationClassifier Selection Based on Data Complexity Measures *
Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.
More informationA Method of Hot Topic Detection in Blogs Using N-gram Model
84 JOURNAL OF SOFTWARE, VOL. 8, NO., JANUARY 203 A Method of Hot Topc Detecton n Blogs Usng N-gram Model Xaodong Wang College of Computer and Informaton Technology, Henan Normal Unversty, Xnxang, Chna
More informationDeep Classification in Large-scale Text Hierarchies
Deep Classfcaton n Large-scale Text Herarches Gu-Rong Xue Dkan Xng Qang Yang 2 Yong Yu Dept. of Computer Scence and Engneerng Shangha Jao-Tong Unversty {grxue, dkxng, yyu}@apex.sjtu.edu.cn 2 Hong Kong
More informationSequential search. Building Java Programs Chapter 13. Sequential search. Sequential search
Sequental search Buldng Java Programs Chapter 13 Searchng and Sortng sequental search: Locates a target value n an array/lst by examnng each element from start to fnsh. How many elements wll t need to
More informationVirtual Machine Migration based on Trust Measurement of Computer Node
Appled Mechancs and Materals Onlne: 2014-04-04 ISSN: 1662-7482, Vols. 536-537, pp 678-682 do:10.4028/www.scentfc.net/amm.536-537.678 2014 Trans Tech Publcatons, Swtzerland Vrtual Machne Mgraton based on
More informationCourse Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms
Course Introducton Course Topcs Exams, abs, Proects A quc loo at a few algorthms 1 Advanced Data Structures and Algorthms Descrpton: We are gong to dscuss algorthm complexty analyss, algorthm desgn technques
More informationCombining Multiple Resources, Evidence and Criteria for Genomic Information Retrieval
Combnng Multple Resources, Evdence and Crtera for Genomc Informaton Retreval Luo S 1, Je Lu 2 and Jame Callan 2 1 Department of Computer Scence, Purdue Unversty, West Lafayette, IN 47907, USA ls@cs.purdue.edu
More informationFuzzy C-Means Initialized by Fixed Threshold Clustering for Improving Image Retrieval
Fuzzy -Means Intalzed by Fxed Threshold lusterng for Improvng Image Retreval NAWARA HANSIRI, SIRIPORN SUPRATID,HOM KIMPAN 3 Faculty of Informaton Technology Rangst Unversty Muang-Ake, Paholyotn Road, Patumtan,
More informationMULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION
MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and
More informationLearning-Based Top-N Selection Query Evaluation over Relational Databases
Learnng-Based Top-N Selecton Query Evaluaton over Relatonal Databases Lang Zhu *, Wey Meng ** * School of Mathematcs and Computer Scence, Hebe Unversty, Baodng, Hebe 071002, Chna, zhu@mal.hbu.edu.cn **
More informationOn Some Entertaining Applications of the Concept of Set in Computer Science Course
On Some Entertanng Applcatons of the Concept of Set n Computer Scence Course Krasmr Yordzhev *, Hrstna Kostadnova ** * Assocate Professor Krasmr Yordzhev, Ph.D., Faculty of Mathematcs and Natural Scences,
More informationBioTechnology. An Indian Journal FULL PAPER. Trade Science Inc.
[Type text] [Type text] [Type text] ISSN : 0974-74 Volume 0 Issue BoTechnology 04 An Indan Journal FULL PAPER BTAIJ 0() 04 [684-689] Revew on Chna s sports ndustry fnancng market based on market -orented
More informationA Webpage Similarity Measure for Web Sessions Clustering Using Sequence Alignment
A Webpage Smlarty Measure for Web Sessons Clusterng Usng Sequence Algnment Mozhgan Azmpour-Kv School of Engneerng and Scence Sharf Unversty of Technology, Internatonal Campus Ksh Island, Iran mogan_az@ksh.sharf.edu
More informationDecision Strategies for Rating Objects in Knowledge-Shared Research Networks
Decson Strateges for Ratng Objects n Knowledge-Shared Research etwors ALEXADRA GRACHAROVA *, HAS-JOACHM ER **, HASSA OUR ELD ** OM SUUROE ***, HARR ARAKSE *** * nsttute of Control and System Research,
More informationClustering Algorithm of Similarity Segmentation based on Point Sorting
Internatonal onference on Logstcs Engneerng, Management and omputer Scence (LEMS 2015) lusterng Algorthm of Smlarty Segmentaton based on Pont Sortng Hanbng L, Yan Wang*, Lan Huang, Mngda L, Yng Sun, Hanyuan
More informationA Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems
A Unfed Framework for Semantcs and Feature Based Relevance Feedback n Image Retreval Systems Ye Lu *, Chunhu Hu 2, Xngquan Zhu 3*, HongJang Zhang 2, Qang Yang * School of Computng Scence Smon Fraser Unversty
More informationPruning Training Corpus to Speedup Text Classification 1
Prunng Tranng Corpus to Speedup Text Classfcaton Jhong Guan and Shugeng Zhou School of Computer Scence, Wuhan Unversty, Wuhan, 430079, Chna hguan@wtusm.edu.cn State Key Lab of Software Engneerng, Wuhan
More informationBrave New World Pseudocode Reference
Brave New World Pseudocode Reference Pseudocode s a way to descrbe how to accomplsh tasks usng basc steps lke those a computer mght perform. In ths week s lab, you'll see how a form of pseudocode can be
More informationR s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes
SPH3UW Unt 7.3 Sphercal Concave Mrrors Page 1 of 1 Notes Physcs Tool box Concave Mrror If the reflectng surface takes place on the nner surface of the sphercal shape so that the centre of the mrror bulges
More informationProgramming in Fortran 90 : 2017/2018
Programmng n Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Exercse 1 : Evaluaton of functon dependng on nput Wrte a program who evaluate the functon f (x,y) for any two user specfed values
More informationTECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z.
TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS Muradalyev AZ Azerbajan Scentfc-Research and Desgn-Prospectng Insttute of Energetc AZ1012, Ave HZardab-94 E-mal:aydn_murad@yahoocom Importance of
More informationWishing you all a Total Quality New Year!
Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma
More informationModule Management Tool in Software Development Organizations
Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,
More informationNUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS
ARPN Journal of Engneerng and Appled Scences 006-017 Asan Research Publshng Network (ARPN). All rghts reserved. NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS Igor Grgoryev, Svetlana
More informationA Simple Methodology for Database Clustering. Hao Tang 12 Guangdong University of Technology, Guangdong, , China
for Database Clusterng Guangdong Unversty of Technology, Guangdong, 0503, Chna E-mal: 6085@qq.com Me Zhang Guangdong Unversty of Technology, Guangdong, 0503, Chna E-mal:64605455@qq.com Database clusterng
More informationCross-Language Information Retrieval
Feature Artcle: Cross-Language Informaton Retreval 19 Cross-Language Informaton Retreval Jan-Yun Ne 1 Abstract A research group n Unversty of Montreal has worked on the problem of cross-language nformaton
More informationCMPS 10 Introduction to Computer Science Lecture Notes
CPS 0 Introducton to Computer Scence Lecture Notes Chapter : Algorthm Desgn How should we present algorthms? Natural languages lke Englsh, Spansh, or French whch are rch n nterpretaton and meanng are not
More informationJournal of Chemical and Pharmaceutical Research, 2014, 6(6): Research Article. A selective ensemble classification method on microarray data
Avalable onlne www.ocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(6):2860-2866 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 A selectve ensemble classfcaton method on mcroarray
More informationLinkSelector: A Web Mining Approach to. Hyperlink Selection for Web Portals
nkselector: A Web Mnng Approach to Hyperlnk Selecton for Web Portals Xao Fang and Olva R. u Sheng Department of Management Informaton Systems Unversty of Arzona, AZ 8572 {xfang,sheng}@bpa.arzona.edu Submtted
More informationUser Tweets based Genre Prediction and Movie Recommendation using LSI and SVD
User Tweets based Genre Predcton and Move Recommendaton usng LSI and SVD Saksh Bansal, Chetna Gupta Department of CSE/IT Jaypee Insttute of Informaton Technology,sec-62 Noda, Inda sakshbansal76@gmal.com,
More informationANALYSIS OF ADAPTIF LOCAL REGION IMPLEMENTATION ON LOCAL THRESHOLDING METHOD
Nusantara Journal of Computers and ts Applcatons ANALYSIS F ADAPTIF LCAL REGIN IMPLEMENTATIN N LCAL THRESHLDING METHD I Gust Agung Socrates Ad Guna 1), Hendra Maulana 2), Agus Zanal Arfn 3) and Dn Adn
More informationOutline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1
4/14/011 Outlne Dscrmnatve classfers for mage recognton Wednesday, Aprl 13 Krsten Grauman UT-Austn Last tme: wndow-based generc obect detecton basc ppelne face detecton wth boostng as case study Today:
More informationON SOME ENTERTAINING APPLICATIONS OF THE CONCEPT OF SET IN COMPUTER SCIENCE COURSE
Yordzhev K., Kostadnova H. Інформаційні технології в освіті ON SOME ENTERTAINING APPLICATIONS OF THE CONCEPT OF SET IN COMPUTER SCIENCE COURSE Yordzhev K., Kostadnova H. Some aspects of programmng educaton
More informationCompiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz
Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster
More informationSURFACE PROFILE EVALUATION BY FRACTAL DIMENSION AND STATISTIC TOOLS USING MATLAB
SURFACE PROFILE EVALUATION BY FRACTAL DIMENSION AND STATISTIC TOOLS USING MATLAB V. Hotař, A. Hotař Techncal Unversty of Lberec, Department of Glass Producng Machnes and Robotcs, Department of Materal
More informationImage Representation & Visualization Basic Imaging Algorithms Shape Representation and Analysis. outline
mage Vsualzaton mage Vsualzaton mage Representaton & Vsualzaton Basc magng Algorthms Shape Representaton and Analyss outlne mage Representaton & Vsualzaton Basc magng Algorthms Shape Representaton and
More informationQuerying by sketch geographical databases. Yu Han 1, a *
4th Internatonal Conference on Sensors, Measurement and Intellgent Materals (ICSMIM 2015) Queryng by sketch geographcal databases Yu Han 1, a * 1 Department of Basc Courses, Shenyang Insttute of Artllery,
More informationRules for Using Multi-Attribute Utility Theory for Estimating a User s Interests
Rules for Usng Mult-Attrbute Utlty Theory for Estmatng a User s Interests Ralph Schäfer 1 DFKI GmbH, Stuhlsatzenhausweg 3, 66123 Saarbrücken Ralph.Schaefer@dfk.de Abstract. In ths paper, we show that Mult-Attrbute
More informationAn Efficient Genetic Algorithm with Fuzzy c-means Clustering for Traveling Salesman Problem
An Effcent Genetc Algorthm wth Fuzzy c-means Clusterng for Travelng Salesman Problem Jong-Won Yoon and Sung-Bae Cho Dept. of Computer Scence Yonse Unversty Seoul, Korea jwyoon@sclab.yonse.ac.r, sbcho@cs.yonse.ac.r
More informationCorner-Based Image Alignment using Pyramid Structure with Gradient Vector Similarity
Journal of Sgnal and Informaton Processng, 013, 4, 114-119 do:10.436/jsp.013.43b00 Publshed Onlne August 013 (http://www.scrp.org/journal/jsp) Corner-Based Image Algnment usng Pyramd Structure wth Gradent
More informationAlignment Results of SOBOM for OAEI 2010
Algnment Results of SOBOM for OAEI 2010 Pegang Xu, Yadong Wang, Lang Cheng, Tany Zang School of Computer Scence and Technology Harbn Insttute of Technology, Harbn, Chna pegang.xu@gmal.com, ydwang@ht.edu.cn,
More informationOutline. Type of Machine Learning. Examples of Application. Unsupervised Learning
Outlne Artfcal Intellgence and ts applcatons Lecture 8 Unsupervsed Learnng Professor Danel Yeung danyeung@eee.org Dr. Patrck Chan patrckchan@eee.org South Chna Unversty of Technology, Chna Introducton
More informationArabic Text Classification Using N-Gram Frequency Statistics A Comparative Study
Arabc Text Classfcaton Usng N-Gram Frequency Statstcs A Comparatve Study Lala Khresat Dept. of Computer Scence, Math and Physcs Farlegh Dcknson Unversty 285 Madson Ave, Madson NJ 07940 Khresat@fdu.edu
More informationA Clustering Algorithm for Key Frame Extraction Based on Density Peak
Journal of Computer and Communcatons, 2018, 6, 118-128 http://www.scrp.org/ournal/cc ISSN Onlne: 2327-5227 ISSN Prnt: 2327-5219 A Clusterng Algorthm for Key Frame Extracton Based on Densty Peak Hong Zhao
More informationComplex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following.
Complex Numbers The last topc n ths secton s not really related to most of what we ve done n ths chapter, although t s somewhat related to the radcals secton as we wll see. We also won t need the materal
More informationVisual Thesaurus for Color Image Retrieval using Self-Organizing Maps
Vsual Thesaurus for Color Image Retreval usng Self-Organzng Maps Chrstopher C. Yang and Mlo K. Yp Department of System Engneerng and Engneerng Management The Chnese Unversty of Hong Kong, Hong Kong ABSTRACT
More informationClassic Term Weighting Technique for Mining Web Content Outliers
Internatonal Conference on Computatonal Technques and Artfcal Intellgence (ICCTAI'2012) Penang, Malaysa Classc Term Weghtng Technque for Mnng Web Content Outlers W.R. Wan Zulkfel, N. Mustapha, and A. Mustapha
More informationKeywords - Wep page classification; bag of words model; topic model; hierarchical classification; Support Vector Machines
(IJCSIS) Internatonal Journal of Computer Scence and Informaton Securty, Herarchcal Web Page Classfcaton Based on a Topc Model and Neghborng Pages Integraton Wongkot Srura Phayung Meesad Choochart Haruechayasak
More informationIntrinsic Plagiarism Detection Using Character n-gram Profiles
Intrnsc Plagarsm Detecton Usng Character n-gram Profles Efstathos Stamatatos Unversty of the Aegean 83200 - Karlovass, Samos, Greece stamatatos@aegean.gr Abstract: The task of ntrnsc plagarsm detecton
More informationAnalysis of Continuous Beams in General
Analyss of Contnuous Beams n General Contnuous beams consdered here are prsmatc, rgdly connected to each beam segment and supported at varous ponts along the beam. onts are selected at ponts of support,
More informationReducing Frame Rate for Object Tracking
Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg
More informationSolving two-person zero-sum game by Matlab
Appled Mechancs and Materals Onlne: 2011-02-02 ISSN: 1662-7482, Vols. 50-51, pp 262-265 do:10.4028/www.scentfc.net/amm.50-51.262 2011 Trans Tech Publcatons, Swtzerland Solvng two-person zero-sum game by
More informationConcept Forest: A New Ontology-assisted Text Document Similarity Measurement Method
Concept Forest: A New Ontology-asssted Text Document Smlarty Measurement Method James Z. Wang Wllam Taylor School of Computng Clemson Unversty, Box 340974 Clemson, SC 29634-0974, USA +1-864-656-7678 {jzwang,
More informationObject-Based Techniques for Image Retrieval
54 Zhang, Gao, & Luo Chapter VII Object-Based Technques for Image Retreval Y. J. Zhang, Tsnghua Unversty, Chna Y. Y. Gao, Tsnghua Unversty, Chna Y. Luo, Tsnghua Unversty, Chna ABSTRACT To overcome the
More informationMachine Learning. Topic 6: Clustering
Machne Learnng Topc 6: lusterng lusterng Groupng data nto (hopefully useful) sets. Thngs on the left Thngs on the rght Applcatons of lusterng Hypothess Generaton lusters mght suggest natural groups. Hypothess
More information