Web Document Classification Based on Fuzzy Association


Choochart Haruechaiyasak, Mei-Ling Shyu
Department of Electrical and Computer Engineering, University of Miami, Coral Gables, FL 33124, USA

Shu-Ching Chen
Distributed Multimedia Information System Laboratory, School of Computer Science, Florida International University, Miami, FL 33199, USA

Xiuqi Li
NSF/FAU Multimedia Laboratory, Florida Atlantic University, Boca Raton, FL 33431, USA
xl@cse.fau.edu

Abstract

In this paper, a method of automatically classifying Web documents into a set of categories using the fuzzy association concept is proposed. Using the same word or vocabulary to describe different entities creates ambiguity, especially in the Web environment where the user population is large. To solve this problem, fuzzy association is used to capture the relationships among different index terms or keywords in the documents, i.e., each pair of words has an associated value to distinguish itself from the others. Therefore, the ambiguity in word usage is avoided. Experiments using data sets collected from two Web portals, Yahoo! and Open Directory Project (dmoz.org), are conducted. We compare our approach to the vector space model with the cosine coefficient. The results show that our approach yields higher accuracy than the vector space model.

Keywords: Information Processing on the Web, Data Mining, Document Classification, Fuzzy Association.

1. Introduction

The World Wide Web (WWW) can be viewed as a distributed database system, but it differs from one in two respects. First, the WWW contains a much larger amount of data than a typical database system. The WWW is often referred to as the world's largest distributed database system, with the amount of data growing at an exponential rate [17]. These data can be of heterogeneous types such as text, image, audio, and video. Second, the WWW involves a huge user population that is not restricted to a certain demographic group or geographic area. The result is a wide variation in information content and quality.
In addition, unlike a typical database system where the majority of users only retrieve information through queries, the WWW allows its users to provide and share information publicly on the system. With the large amount of information available on the Web, searching for specific information or discovering any useful information becomes a difficult and challenging task. To alleviate this problem, many data mining techniques have been applied to the Web context. This is referred to as Web mining [3]. Web mining is defined as the discovery and analysis of useful information from the WWW. Web mining techniques include analysis of user access patterns [10][14], Web document clustering [1][15], and classification [2][4][5][16].

Document classification or text categorization (as used in the information retrieval context) is the process of assigning a document to a predefined set of categories based on the document content. Document classification can be applied as an information filtering tool and can also be used to improve the retrieval results from a query process. To help users search and browse for specific information on the Web, many of the well-known Web portals such as Yahoo! [21] have organized the information, in the form of Web documents, into predefined categories such as Arts & Humanities, Computers & Internet, and Entertainment. However, this

approach of organizing Web documents requires human effort and hence is very subjective and does not scale well.

In this paper, a method of automatically classifying Web documents into a set of categories using the fuzzy association concept is proposed. Fuzzy association uses the concept of fuzzy set theory [18] to model the vagueness in the information retrieval process. Examples of research work involving the fuzzy association technique include [6], [7], [8], and [9]. The basic concept of fuzzy association involves the construction of a pseudothesaurus of keywords or index terms from a set of documents [7]. By constructing a pseudothesaurus, the relationship among different index terms or keywords in the documents is captured, i.e., each pair of words has an associated value to distinguish itself from other pairs of words. Therefore, the ambiguity in word usage is minimized.

Much research has been done in the area of document classification or text categorization. Some of these studies perform experiments using only a document set from a specific topic. For example, in [5], the document collection Reuters, which is business related, is used in the experiments. Other research work such as [2], [4], and [16] focuses on Web documents. However, all of these studies use only a set of documents obtained from a single Web directory. For example, [2] and [16] use the Yahoo! directory as their data set, and [4] uses LookSmart's directory. As mentioned earlier, the process of organizing the Web directories is based on human effort and can be very subjective. Therefore, in this paper, we apply our approach and perform our experiments using data sets collected from two different Web portals: Yahoo! [21] and Open Directory Project [19].

In general, when dealing with data in a high-dimensional space, the performance, in terms of storage space and execution time, can be greatly affected by the high dimension. This problem is generally known as the curse of dimensionality.
This problem also holds for a document data set, since a document collection can contain millions of different index terms or keywords. A classical document clustering approach, the vector space model [13], which represents each document using an n-dimensional vector (where n is the number of keywords), also suffers from this problem. By using the fuzzy association technique in our approach, the dimension of the keyword representation for the categories can be reduced without much performance degradation. Also, the selection of different keywords in representing each category does not affect the performance as much as it does in the vector space approach.

The rest of the paper is organized as follows. In the next section, the concept of fuzzy association as applied in information retrieval systems is introduced, and our proposed fuzzy classification model is described. In Section 3, the experimental results and discussions are given. The paper is concluded in Section 4.

2. Fuzzy association for document classification

In this section, we first review the concept of fuzzy association as applied in information retrieval systems. Then we describe our classification model based on the fuzzy association concept in detail.

2.1 Fuzzy association in information retrieval

Fuzzy set theory [18] deals with the representation of classes whose boundaries are not well defined. The key idea is to associate a membership function with the elements of the class. This function takes values on the interval [0, 1], with 0 corresponding to no membership in the class and 1 corresponding to full membership. Membership values between 0 and 1 indicate marginal elements of the class. Thus, membership in a fuzzy set is intrinsically gradual instead of abrupt or crisp (as in conventional Boolean logic).

The fuzzy associative information retrieval (IR) mechanism is formalized within fuzzy set theory and is based on the definition of fuzzy association. It captures the association between keywords to improve the retrieval results from traditional IR systems.
By providing the association between the keywords, some additional documents that are not directly indexed by the keywords in the query can also be retrieved.

Definition 1. A fuzzy association between two finite sets $X = \{x_1, \ldots, x_u\}$ and $Y = \{y_1, \ldots, y_v\}$ is formally defined as a binary fuzzy relation $f: X \times Y \to [0, 1]$, where $u$ and $v$ are the numbers of elements in $X$ and $Y$, respectively.

The construction of the association between index terms or keywords is generally known as the generation of the fuzzy pseudothesaurus. In [7], a formal definition and process of generating a fuzzy pseudothesaurus based on co-occurrences of keywords is given. It can be summarized as follows.

Definition 2. Given a set of index terms, $T = \{t_1, \ldots, t_u\}$, and a set of documents, $D = \{d_1, \ldots, d_v\}$, each $t_i$ is represented by a fuzzy set $h(t_i)$ of documents; $h(t_i) = \{F(t_i, d_k) \mid d_k \in D\}$, where $F(t_i, d_k)$ is the significance (or membership) degree of $t_i$ in $d_k$.

Definition 3. The fuzzy related terms (RT) relation is based on the evaluation of the co-occurrences of $t_i$ and $t_j$ in the set $D$ and can be defined as follows.

$$RT(t_i, t_j) = \frac{\sum_{k} \min\big(F(t_i, d_k),\, F(t_j, d_k)\big)}{\sum_{k} \max\big(F(t_i, d_k),\, F(t_j, d_k)\big)}$$

In [9], a simplification of the fuzzy RT relation based on the co-occurrence of keywords is given as follows.

$$r_{i,j} = \frac{n_{i,j}}{n_i + n_j - n_{i,j}} \qquad \text{(Eq. 1)}$$

where $r_{i,j}$ represents the fuzzy RT relation between keywords $k_i$ and $k_j$, $n_{i,j}$ is the number of documents containing both the $i$th and $j$th keywords, $n_i$ is the number of documents including the $i$th keyword, and $n_j$ is the number of documents including the $j$th keyword. Next, the calculation of the fuzzy RT relation between keywords is applied in our classification model.

2.2 Fuzzy classification model

The process of classifying Web documents is explained in detail as follows. Given $C = \{C_1, C_2, \ldots, C_m\}$, a set of categories, where $m$ is the total number of categories, the first step is to collect the training sets of Web documents, $TD = \{TD_1, TD_2, \ldots, TD_m\}$, from each category in $C$. This step involves crawling through the hypertext links encapsulated in each document. Once the document collections are obtained, they are cleaned through the stemming and stopword removal processes. Next, the most frequently occurring keywords from the document set of each category are extracted and put into separate keyword sets, $K = \{K_1, K_2, \ldots, K_m\}$. These $m$ sets of keywords are then combined into a set of all keywords, $A = \{k_1, k_2, \ldots, k_n\}$, where $n$ is the total number of all distinct keywords, representing the vector dimension. Note that some of the keywords can appear in more than one category, but we only consider one instance of each.

Then we generate the keyword correlation matrix $M$ using the fuzzy RT relation (given in Eq. 1). The keyword correlation matrix is an $n \times n$ symmetric matrix whose element $m_{i,j}$ takes a value on the interval [0, 1], with 0 indicating no relationship and 1 indicating full relationship between the keywords $k_i$ and $k_j$. Therefore, $m_{i,j}$ is equal to 1 for all $i = j$, since a keyword has the strongest relationship to itself.
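As an illustration of Definition 3 and Eq. 1, the following minimal Python sketch builds a keyword correlation matrix from a toy collection. The function names (`fuzzy_rt`, `keyword_correlation_matrix`), the toy documents, and the choice of binary membership degrees (1 if the keyword occurs in the document, 0 otherwise) are assumptions made for this example only and are not taken from the paper's implementation. Under that binary assumption the sum of minima equals $n_{i,j}$ and the sum of maxima equals $n_i + n_j - n_{i,j}$, so both forms give the same value.

```python
def fuzzy_rt(F, ti, tj):
    """Definition 3: sum of min memberships over sum of max memberships."""
    docs = set(F[ti]) | set(F[tj])
    num = sum(min(F[ti].get(d, 0.0), F[tj].get(d, 0.0)) for d in docs)
    den = sum(max(F[ti].get(d, 0.0), F[tj].get(d, 0.0)) for d in docs)
    return num / den if den else 0.0


def keyword_correlation_matrix(documents, keywords):
    """Eq. 1 for every keyword pair; returns an n x n list of lists."""
    # postings[k] = indices of the documents that contain keyword k
    postings = {k: {idx for idx, doc in enumerate(documents) if k in doc}
                for k in keywords}
    n = len(keywords)
    # The diagonal is 1 by definition: a keyword is fully related to itself.
    M = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            both = len(postings[keywords[i]] & postings[keywords[j]])
            either = len(postings[keywords[i]]) + len(postings[keywords[j]]) - both
            M[i][j] = M[j][i] = both / either if either else 0.0
    return M


# Hypothetical toy collection: each document is a set of (stemmed) keywords.
documents = [{"fuzzy", "logic", "set"}, {"fuzzy", "logic"},
             {"web", "search"}, {"web", "fuzzy"}]
keywords = ["fuzzy", "logic", "web"]
M = keyword_correlation_matrix(documents, keywords)

# Binary membership degrees F(t, d), assumed here so Definition 3 reduces to Eq. 1.
F = {k: {idx: 1.0 for idx, doc in enumerate(documents) if k in doc}
     for k in keywords}

for i, ki in enumerate(keywords):
    for j, kj in enumerate(keywords):
        assert abs(fuzzy_rt(F, ki, kj) - M[i][j]) < 1e-12
print(M)
```

In this toy collection "fuzzy" and "logic" co-occur in two documents out of the three that contain either, so their relation is 2/3, while "logic" and "web" never co-occur and get 0.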
To classify the documents in the test data set into different categories, each category must first be represented with a set of keywords. The best way to represent each category is to select only the exclusive keywords, i.e., for category $C_j$, we consider the keywords in $K_j$ which do not belong to any other keyword set $K_i$, where $i = 1, \ldots, m$ and $i \neq j$. We refer to these as the category keyword sets, $CK = \{CK_1, CK_2, \ldots, CK_m\}$. Next, the test documents in the test data set are cleaned and the keywords are extracted by looking them up in $A$, the list of all keywords. This process gives us the representation of the test documents, $D = \{d_1, d_2, \ldots, d_p\}$, where $p$ is the total number of documents to be classified. After that, the membership degree of each document in each of the category sets is calculated using the following equation.

$$\mu_{i,j} = \max_{a \in d_i} \Big[\, 1 - \prod_{b \in CK_j} (1 - r_{a,b}) \Big] \qquad \text{(Eq. 2)}$$

where $\mu_{i,j}$ is the membership degree of $d_i$ belonging to $C_j$, and $r_{a,b}$ is the fuzzy relation between keyword $a \in d_i$ and keyword $b \in CK_j$. A document $d_i$ is classified into the category $C_j$ for which the membership degree $\mu_{i,j}$ is the maximum. The keyword $a$ in $d_i$ is associated with category $C_j$ if the keywords $b$ in $CK_j$ (for category $C_j$) are related to the keyword $a$. Whenever there is at least one keyword in $CK_j$ which is strongly related to the keyword $a$ in $d_i$ (i.e., $r_{a,b} \approx 1$), Eq. 2 yields $\mu_{i,j} \approx 1$, and the keyword $a$ is a good fuzzy index for the category $C_j$. In the case when all keywords in $CK_j$ are either loosely related or unrelated to $a$, the keyword $a$ is not a good fuzzy index for $C_j$ (i.e., $\mu_{i,j} \approx 0$).

3. Experiments and results

This section describes the data sets used in our experiments and their characteristics. We also briefly review the vector space model with the cosine coefficient as a comparison approach. Then, the experimental results and discussions are presented.

3.1 Experimental data sets

Experiments using the predefined categories and the document sets collected from two Web portals, Yahoo! [21] and Open Directory Project (ODP) [19], are conducted.
A brief description and history of these two Web portals are provided in [20]. In our experiments, we only consider documents in English and ignore all non-English documents. Therefore, the categories World and Regional are excluded from our experimental

data sets. Table 1 shows the selected categories from these two Web portals. Based on these predefined categories, we collect approximately 18,000 documents from each of the Web directories as the training and test data sets. To avoid the problem of over-fitting the data when performing the experiments, we randomly select two-thirds of the documents as the training data set and one-third as the test data set.

Table 1. Predefined category sets from two Web portals

Yahoo!                          ODP
Category               Abbr.    Category         Abbr.
Arts & Humanities      art      Arts             art
Business & Economy     bus      Business         bus
Computers & Internet   com      Computers        com
Education              edu      Games            game
Entertainment          et       Health           health
Government             gov      Home             home
Health                 health   Kids and Teens   kid
News & Media           news     News             news
Recreation & Sports    rec      Recreation       rec
Science                sci      Science          sci
Social Science         sosci    Shopping         shop
Society & Culture      soc      Society          soc
TOTAL                  12       Sports           sport
                                TOTAL            13

Considering only the training data sets from these two different Web sites, we extract and select the most frequently occurring keywords from each category as follows. For the Yahoo! data set, the 350 most frequent keywords are selected from each of the 12 categories. Some of the keywords appear in more than one category, but we only consider one instance of each. The total number of all distinct keywords is [...]. For the ODP data set, we also select the 350 most frequent keywords from each of the 13 categories. The total number of distinct keywords is [...].

3.2 Vector space model

The vector space model is one of the classical clustering methods, first proposed in [13]. This method has been successfully applied to many IR systems, including the well-known SMART system [12]. The vector space model maps the attributes (keywords in this context) into an n-dimensional space, where n is the number of attributes. Therefore, each document can be represented by an n-dimensional vector called a document vector. For the classification problem, we have a predefined set of categories, each of which can also be represented by an n-dimensional vector called a category vector.
To classify a document into one of the categories, the document vector is compared with all category vectors using a similarity metric. The document is classified into the category for which the similarity measure is the highest. Several approaches for calculating the similarity measure between documents have been proposed [11]. Two types of measures have been widely used. The first is a distance metric (representing dissimilarity) such as the Euclidean distance. The second is a similarity measure such as the cosine or Dice coefficient. In this paper, as a comparison approach, the cosine coefficient is used to calculate the similarity between a document and a category. The calculation of the cosine coefficient is given below.

$$\mathrm{COSINE}(\vec{f}_i, \vec{g}_j) = \frac{\sum_{k=1}^{n} f_{i,k}\, g_{j,k}}{\sqrt{\sum_{k=1}^{n} f_{i,k}^{2} \; \sum_{k=1}^{n} g_{j,k}^{2}}} \qquad \text{(Eq. 3)}$$

where $\vec{f}_i \in F$, $F$ is a set of n-dimensional document vectors, $\vec{g}_j \in G$, $G$ is a set of n-dimensional category vectors, and $n$ represents the total number of distinct keywords.

3.3 Results and discussions

To compare the performance of our method (denoted as Fuzzy) with the vector space model approach (denoted as Vector), we use the test data sets and measure the classification accuracy while varying the vector lengths of the category vectors. To see the effect of using different sets of keywords in representing the category vectors, we provide two ways of selecting the keywords: selecting from the most frequently occurring keywords (denoted as topmost), and selecting from the least frequently occurring keywords (denoted as bottommost).

Figure 1 shows the experimental results for the Yahoo! data set. As can be seen from this figure, in all cases the classification accuracy increases when the number of keywords used to represent the category vectors is increased. Our approach yields a higher accuracy than the vector space model. For example, when the vector length is 10, our approach yields accuracies of 74.9% for the topmost sets and 41.1% for the bottommost sets, whereas the vector space model yields accuracies of 57.[...]% for the topmost sets and 12.2% for the bottommost sets. In Figure 2, the performance for each of the 12 Yahoo! categories is presented. As expected, our approach yields higher accuracies for all categories.
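To make the comparison between the two decision rules concrete, here is a minimal Python sketch of the fuzzy membership degree (Eq. 2) next to the cosine coefficient (Eq. 3). It is only an illustration under stated assumptions: the helper names (`relation`, `fuzzy_membership`, `cosine`, `classify`), the relation values in `r`, the two toy categories, and the sample document are all hypothetical, and Eq. 2 is implemented in the form "maximum over document keywords of one minus the product of (1 - r)" as read from the surrounding description.

```python
import math


def relation(r, a, b):
    """Look up r_{a,b}; a keyword is taken to be fully related to itself."""
    if a == b:
        return 1.0
    return r.get(frozenset((a, b)), 0.0)


def fuzzy_membership(doc_keywords, category_keywords, r):
    """Eq. 2 (assumed form): max over a in d of 1 - prod over b in CK of (1 - r_ab)."""
    best = 0.0
    for a in doc_keywords:
        unrelated = 1.0
        for b in category_keywords:
            unrelated *= 1.0 - relation(r, a, b)
        best = max(best, 1.0 - unrelated)
    return best


def cosine(f, g):
    """Eq. 3: cosine coefficient between a document vector and a category vector."""
    dot = sum(x * y for x, y in zip(f, g))
    norm = math.sqrt(sum(x * x for x in f) * sum(y * y for y in g))
    return dot / norm if norm else 0.0


def classify(scores):
    """Return the category with the highest score (first one wins on ties)."""
    return max(scores, key=scores.get)


# Hypothetical fuzzy relations between keyword pairs (values are made up).
r = {
    frozenset(("cinema", "film")): 0.9,
    frozenset(("film", "music")): 0.5,
    frozenset(("stock", "market")): 0.8,
}
category_keywords = {"art": ["film", "music"], "bus": ["stock", "market"]}

# The document shares no keyword with either category keyword set,
# but "cinema" is strongly related to "film".
doc = ["cinema"]
fuzzy_scores = {c: fuzzy_membership(doc, ck, r)
                for c, ck in category_keywords.items()}
print(fuzzy_scores, classify(fuzzy_scores))

# Vector space baseline over the vocabulary [film, music, stock, market].
category_vectors = {"art": [1, 1, 0, 0], "bus": [0, 0, 1, 1]}
doc_vector = [1, 1, 1, 0]
vector_scores = {c: cosine(doc_vector, g) for c, g in category_vectors.items()}
print(vector_scores, classify(vector_scores))
```

In this toy setting the fuzzy rule can still assign a document whose only keyword is absent from every category keyword set, because the relation to "film" carries over. A cosine score over the category vocabulary would be zero for that same document.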

We perform the same experiments on the ODP data set. The results are shown in Figure 3 and Figure 4, respectively. The results are similar to those obtained from the Yahoo! data set, with one additional observation. By using the bottommost keywords in our approach, the average accuracy is 78.1%, and by using the topmost keywords in the vector space model, the average accuracy is 67.1%. That is, with either the topmost or the bottommost representation, our approach performs better than the vector space model.

Figure 1. Classification performance by varying the vector length (Yahoo!). Horizontal axis: Vector Length. Series: Fuzzy (topmost), Fuzzy (bottommost), Vector (topmost), Vector (bottommost).

Figure 2. Classification performance by categories (Yahoo!). Horizontal axis: Category. Series: Fuzzy (topmost), Vector (topmost).

Figure 3. Classification performance by varying the vector length (ODP). Horizontal axis: Vector Length. Series: Fuzzy (topmost), Fuzzy (bottommost), Vector (topmost), Vector (bottommost).

Figure 4. Classification performance by categories (ODP). Horizontal axis: Category. Series: Fuzzy (topmost), Vector (topmost).

Table 2 shows the summarized results for both the Yahoo! and ODP data sets. The results are calculated by averaging the accuracy values over all the vector lengths. By using the topmost representation for the category vector, our approach yields average classification accuracies that are 13.7% and 17.7% higher than those of the vector space model for the Yahoo! and ODP data sets, respectively. Another observation is that, for our approach, the selection of different keywords in representing the category does not affect the performance as much as it does in the vector space model. For example, for the Yahoo! data set, by using the bottommost keywords instead of the topmost keywords, the accuracy drops 21.4% in our approach, whereas the accuracy drops 39.0% in the vector space model approach.

Table 2. Average classification accuracy

Data set   Fuzzy (topmost)   Fuzzy (bottommost)   Vector (topmost)   Vector (bottommost)
Yahoo!     81.5%             60.1%                67.8%              28.8%
ODP        84.8%             78.1%                67.1%              46.1%
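The differences quoted in the discussion follow directly from Table 2. The short Python sketch below recomputes them from the table values. The dictionary layout and key names are assumptions for this example, and the ODP drops it prints are derived here from the table rather than quoted in the text.

```python
# Average classification accuracy (%) copied from Table 2.
table2 = {
    "Yahoo!": {"fuzzy_top": 81.5, "fuzzy_bottom": 60.1,
               "vector_top": 67.8, "vector_bottom": 28.8},
    "ODP": {"fuzzy_top": 84.8, "fuzzy_bottom": 78.1,
            "vector_top": 67.1, "vector_bottom": 46.1},
}

summary = {}
for name, row in table2.items():
    summary[name] = {
        # Gain of the fuzzy approach over the vector space model (topmost keywords).
        "gain_top": round(row["fuzzy_top"] - row["vector_top"], 1),
        # Accuracy lost when switching from topmost to bottommost keywords.
        "fuzzy_drop": round(row["fuzzy_top"] - row["fuzzy_bottom"], 1),
        "vector_drop": round(row["vector_top"] - row["vector_bottom"], 1),
    }

for name, values in summary.items():
    print(name, values)
```

The values are differences in percentage points, rounded to one decimal place to avoid floating-point noise.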

4. Conclusion

In this paper, an alternative approach for automatically classifying Web documents into predefined categories using the fuzzy association concept is proposed. Recognizing the ambiguity in word usage in English, the fuzzy association method avoids this problem by capturing the relationship or association among different index terms or keywords in the documents. The result is that each pair of words has an associated value to distinguish itself from other pairs of words. Experiments using data sets obtained from two different Web directories, Yahoo! and ODP, are conducted. The two Web portals are independent and have different characteristics. We compare our fuzzy association approach to the vector space model approach. To see the effect of different keyword selections for category vectors, two alternatives are used with varying vector lengths: selecting from the most frequently occurring keywords (topmost) and selecting from the least frequently occurring keywords (bottommost). The results show that, on average, our approach yields higher classification accuracies than the vector space model for both the topmost and bottommost cases. In addition, with our approach, using fewer keywords for category representation does not degrade the accuracy as much as it does with the vector space model.

5. Acknowledgments

For Shu-Ching Chen, this research was supported in part by NSF CDA

References

[1] A.Z. Broder, S.C. Glassman, and M.S. Manasse, "Syntactic Clustering of the Web," Proceedings of the 6th International World Wide Web Conference, April 1997, pp.
[2] C. Chekuri, M. Goldwasser, P. Raghavan, and E. Upfal, "Web Search Using Automatic Classification," Proceedings of the 6th International World Wide Web Conference, April
[3] R. Cooley, B. Mobasher, and J. Srivastava, "Web Mining: Information and Pattern Discovery on the World Wide Web," Proceedings of the 9th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'97), November 1997, pp.
[4] S.T. Dumais and H. Chen, "Hierarchical Classification of Web Content," Proceedings of the 23rd International ACM Conference on Research and Development in Information Retrieval (SIGIR'00), August 2000, pp.
[5] D. Koller and M. Sahami, "Hierarchically Classifying Documents Using Very Few Words," Proceedings of the 14th International Conference on Machine Learning (ICML'97), July 1997, pp.
[6] S. Miyamoto, "Two Approaches for Information Retrieval Through Fuzzy Associations," IEEE Transactions on Systems, Man, and Cybernetics, vol. 19, no. 1, January/February 1989, pp.
[7] S. Miyamoto, T. Miyake, and K. Nakayama, "Generation of a Pseudothesaurus for Information Retrieval Based on Cooccurrences and Fuzzy Set Operations," IEEE Transactions on Systems, Man, and Cybernetics, vol. 13, no. 1, 1983, pp.
[8] S. Miyamoto and K. Nakayama, "Fuzzy Information Retrieval Based on a Fuzzy Pseudothesaurus," IEEE Transactions on Systems, Man, and Cybernetics, vol. 16, no. 2, March/April 1986, pp.
[9] Y. Ogawa, T. Morita, and K. Kobayashi, "A Fuzzy Document Retrieval System Using the Keyword Connection Matrix and a Learning Method," Fuzzy Sets and Systems, vol. 39, 1991, pp.
[10] J. Pitkow and P. Pirolli, "Mining Longest Repeating Subsequences to Predict World Wide Web Surfing," Proceedings of the 2nd USENIX Symposium on Internet Technologies and Systems (USITS'99), October 1999, pp.
[11] E. Rasmussen, "Chapter 16: Clustering Algorithms," in W.B. Frakes and R. Baeza-Yates, editors, Information Retrieval: Data Structures & Algorithms, Prentice Hall, 1992, pp.
[12] G. Salton, editor, The SMART Retrieval System: Experiments in Automatic Document Processing, Prentice-Hall Series in Automatic Computation, Englewood Cliffs, New Jersey, 1971, Chapters
[13] G. Salton, A. Wong, and C.S. Yang, "A Vector-Space Model for Information Retrieval," Communications of the ACM, vol. 18, no. 11, 1975, pp.
[14] M.-L. Shyu, S.-C. Chen, and C. Haruechaiyasak, "Mining User Access Behavior on the WWW," IEEE International Conference on Systems, Man, and Cybernetics, October 2001, pp.
[15] M.-L. Shyu, S.-C. Chen, C. Haruechaiyasak, C.-M. Shu, and S.-T. Li, "Disjoint Web Document Clustering and Management in Electronic Commerce," Proceedings of the Seventh International Conference on Distributed Multimedia Systems (DMS'01), September
[16] S. Tiun, R. Abdullah, and T.E. Kong, "Automatic Topic Identification Using Ontology Hierarchy," Proceedings of the Second International Conference on Computational Linguistics and Intelligent Text Processing (CICLing'01), February 2001, pp.
[17] J. Wang, "A Survey of Web Caching Schemes for the Internet," ACM Computer Communication Review, October 1999, pp.
[18] L.A. Zadeh, "Fuzzy Sets," in D. Dubois, H. Prade, and R.R. Yager, editors, Readings in Fuzzy Sets for Intelligent Systems, Morgan Kaufmann Publishers,
[19] Open Directory Project (ODP).
[20] The Major Search Engines.
[21] Yahoo! Web Search Directory.

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

Impact of a New Attribute Extraction Algorithm on Web Page Classification

Impact of a New Attribute Extraction Algorithm on Web Page Classification Impact of a New Attrbute Extracton Algorthm on Web Page Classfcaton Gösel Brc, Banu Dr, Yldz Techncal Unversty, Computer Engneerng Department Abstract Ths paper ntroduces a new algorthm for dmensonalty

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Type-2 Fuzzy Non-uniform Rational B-spline Model with Type-2 Fuzzy Data

Type-2 Fuzzy Non-uniform Rational B-spline Model with Type-2 Fuzzy Data Malaysan Journal of Mathematcal Scences 11(S) Aprl : 35 46 (2017) Specal Issue: The 2nd Internatonal Conference and Workshop on Mathematcal Analyss (ICWOMA 2016) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

Query Clustering Using a Hybrid Query Similarity Measure

Query Clustering Using a Hybrid Query Similarity Measure Query clusterng usng a hybrd query smlarty measure Fu. L., Goh, D.H., & Foo, S. (2004). WSEAS Transacton on Computers, 3(3), 700-705. Query Clusterng Usng a Hybrd Query Smlarty Measure Ln Fu, Don Hoe-Lan

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Keywords - Wep page classification; bag of words model; topic model; hierarchical classification; Support Vector Machines

Keywords - Wep page classification; bag of words model; topic model; hierarchical classification; Support Vector Machines (IJCSIS) Internatonal Journal of Computer Scence and Informaton Securty, Herarchcal Web Page Classfcaton Based on a Topc Model and Neghborng Pages Integraton Wongkot Srura Phayung Meesad Choochart Haruechayasak

More information

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng

More information

UB at GeoCLEF Department of Geography Abstract

UB at GeoCLEF Department of Geography   Abstract UB at GeoCLEF 2006 Mguel E. Ruz (1), Stuart Shapro (2), June Abbas (1), Slva B. Southwck (1) and Davd Mark (3) State Unversty of New York at Buffalo (1) Department of Lbrary and Informaton Studes (2) Department

More information

Arabic Text Classification Using N-Gram Frequency Statistics A Comparative Study

Arabic Text Classification Using N-Gram Frequency Statistics A Comparative Study Arabc Text Classfcaton Usng N-Gram Frequency Statstcs A Comparatve Study Lala Khresat Dept. of Computer Scence, Math and Physcs Farlegh Dcknson Unversty 285 Madson Ave, Madson NJ 07940 Khresat@fdu.edu

More information

Deep Classification in Large-scale Text Hierarchies

Deep Classification in Large-scale Text Hierarchies Deep Classfcaton n Large-scale Text Herarches Gu-Rong Xue Dkan Xng Qang Yang 2 Yong Yu Dept. of Computer Scence and Engneerng Shangha Jao-Tong Unversty {grxue, dkxng, yyu}@apex.sjtu.edu.cn 2 Hong Kong

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

Available online at Available online at Advanced in Control Engineering and Information Science

Available online at   Available online at   Advanced in Control Engineering and Information Science Avalable onlne at wwwscencedrectcom Avalable onlne at wwwscencedrectcom Proceda Proceda Engneerng Engneerng 00 (2011) 15000 000 (2011) 1642 1646 Proceda Engneerng wwwelsevercom/locate/proceda Advanced

More information

An Image Fusion Approach Based on Segmentation Region

An Image Fusion Approach Based on Segmentation Region Rong Wang, L-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu An Image Fuson Approach Based On Segmentaton Regon An Image Fuson Approach Based on Segmentaton Regon Rong Wang, L-Qun Gao, Shu Yang 3, Yu-Hua

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1)

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1) Secton 1.2 Subsets and the Boolean operatons on sets If every element of the set A s an element of the set B, we say that A s a subset of B, or that A s contaned n B, or that B contans A, and we wrte A

More information

The Research of Support Vector Machine in Agricultural Data Classification

The Research of Support Vector Machine in Agricultural Data Classification The Research of Support Vector Machne n Agrcultural Data Classfcaton Le Sh, Qguo Duan, Xnmng Ma, Me Weng College of Informaton and Management Scence, HeNan Agrcultural Unversty, Zhengzhou 45000 Chna Zhengzhou

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information
