Measuring Semantic Similarity between Words Using Page Counts and Snippets
Manasa.Ch, Computer Science & Engineering, SR Engineering College, Warangal, Andhra Pradesh, India
V.Ramana, Assistant Professor, CSE, SR Engineering College, Warangal, Andhra Pradesh, India
S.P. Ananda Raj, Sr. Assistant Professor, CSE, SR Engineering College, Warangal, Andhra Pradesh, India

Abstract

Web mining involves activities such as document clustering and community mining performed on the web. Such tasks require measuring the semantic similarity between words, which makes web mining activities easier in many applications. However, accurately measuring the semantic similarity between any two words is a difficult task. In this paper a new approach is proposed to measure similarity between words. The approach is based on text snippets and page counts, both taken from the results of a search engine like Google. Lexical patterns are extracted from the text snippets, word co-occurrence measures are defined using the page counts, and the results of the two are combined. Moreover, we propose pattern extraction and pattern clustering algorithms in order to find the various relationships between any two given words. Support Vector Machines, a data mining technique, are used to optimize the results. The empirical results show that the proposed techniques yield results that are comparable with human ratings and support accurate web mining.

Key Words - Text snippets, word count, semantic similarity, web mining, lexical patterns

1. INTRODUCTION

Web mining has gained popularity as a huge amount of information is being made available over the web, and the automated processing of such information is the need of the hour. The applications of web mining include entity disambiguation, relation detection and community extraction.
Information retrieval and natural language processing are two important aspects of all web mining applications. Lexical dictionaries such as WordNet are widely used for natural language processing. However, WordNet is a general purpose lexical ontology. As part of web mining, documents have to be compared and analyzed programmatically. This is a tedious task because the meaning of words changes across domains and over time. The problem with lexical dictionaries is that they do not hold diverse information about words in various contexts. For instance, the word apple is related to computer science because there is a company named Apple which has been instrumental in bringing out many computer hardware and software technologies. However, this sense is ignored in some lexical dictionaries, which treat apple only as a fruit. As new words are created and new meanings become associated with existing words, lexical dictionaries prove inadequate for words whose new meanings and relationships with other words have not yet been added to them.

To overcome the drawbacks mentioned above, we propose a method that automatically finds the semantic similarity between words or entities based on the page counts and text snippets retrieved from web search engines like Google. A page count is an estimate of the number of pages that contain the query words. A snippet is a piece of text extracted by the web search engine based on
the query term given. The following is the text snippet obtained from the Google search engine with the query string apple.

Apple Inc. (NASDAQ: AAPL; formerly Apple Computer, Inc.) is an American multinational corporation that designs and sells consumer electronics, computer...

Fig. 1: Text snippet given by the Google search engine for the search word apple

Similarity measures have been associated with text snippets for query expansion [4], personal name disambiguation [9], and community mining [17]. The text snippets and page counts are obtained automatically from search engines and used in web mining. However, they have the following drawbacks:

- Page count analysis ignores the position of a word in a page.
- The page count of a polysemous word (a word with multiple senses) might contain a combination of all its senses.
- Because of the large number of documents in the result set, only the snippets for the top ranking results of a query can be processed.

We propose a method that overcomes the problems mentioned above. We use both snippets and page counts and propose lexical pattern extraction and pattern clustering algorithms to accurately measure the semantic similarity between words. The main contributions of this paper are the extraction of lexical patterns to identify the relation between words, and the use of an SVM to integrate a machine learning approach that optimizes the results.

2. RELATED WORK

In [15] a taxonomy of words is used to calculate the similarity between two words by finding the length of the shortest path connecting them. Resnik [9] used the concept of information content to define the similarity between two concepts. The similarity between two words is then taken as the maximum similarity between any concepts that the words belong to. Li et al. [3] combined information content with structural semantic information to obtain a similarity measure. This technique showed very high accuracy on the Charles [11] benchmark data set.
Lin [8] defined similarity as the information that is common to both concepts, while Cilibrasi and Vitanyi [12] proposed a distance metric defined using page counts retrieved from web search engines. Snippets were used by [4] to measure the semantic similarity between any two given words, with each snippet represented as a TF-IDF weighted term vector. Chen et al. [4] developed a double checking model based on the snippets returned by a web search engine. The concept of measuring semantic similarity is used in various web mining applications such as word sense disambiguation [6], language modeling [13], synonym extraction [5], and thesauri extraction [4].

3. PROPOSED METHOD

The proposed method finds the similarity between two words A and B and returns a value between 0.0 and 1.0. The value 0.0 indicates that there is no similarity between the words, while 1.0 indicates that the words are absolutely similar. The method makes use of the page counts and text snippets retrieved by a search engine like Google. For instance, the words gem and jewel are given to Google, and the resulting page counts and text snippets are used by our method to find the similarity between the words. The proposed method is visualized in Fig. 2.

As illustrated in Fig. 2, two words such as gem and jewel are given as input to the search engine. The search engine returns page counts and text snippets, which are extracted and given as input to our proposed techniques. The page counts are given to word co-occurrence measures such as WebJaccard, WebOverlap, WebDice, and WebPMI, and the results of these measures are given to the SVM. The text snippets, on the other hand, are given to the proposed algorithms, which generate pattern clusters that are in turn given to the SVM. The SVM therefore has two inputs: the word co-occurrence measures and the pattern clusters. The SVM is trained with these, and finally the semantic similarity is calculated for the two given words, such as gem and jewel.
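The four page-count measures named above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paper does not state the formulas, so the code assumes the usual set-overlap definitions applied to page counts. The cut-off `c` for very small joint counts and the index size `n` are assumed values, and all function and parameter names are hypothetical.

```python
import math

# h_a, h_b: page counts for the single-word queries "A" and "B".
# h_ab:     page count for the conjunctive query "A AND B".
# c:        assumed cut-off; joint counts at or below it are treated as noise.
# n:        assumed number of pages indexed by the search engine.

def web_jaccard(h_a, h_b, h_ab, c=5):
    """Jaccard coefficient computed from page counts."""
    if h_ab <= c:
        return 0.0
    return h_ab / (h_a + h_b - h_ab)

def web_overlap(h_a, h_b, h_ab, c=5):
    """Overlap (Simpson) coefficient computed from page counts."""
    if h_ab <= c:
        return 0.0
    return h_ab / min(h_a, h_b)

def web_dice(h_a, h_b, h_ab, c=5):
    """Dice coefficient computed from page counts."""
    if h_ab <= c:
        return 0.0
    return 2 * h_ab / (h_a + h_b)

def web_pmi(h_a, h_b, h_ab, n=10**10, c=5):
    """Pointwise mutual information computed from page counts."""
    if h_ab <= c:
        return 0.0
    return math.log2((h_ab / n) / ((h_a / n) * (h_b / n)))
```

For example, with hypothetical counts `h_a=100`, `h_b=200` and `h_ab=50`, `web_jaccard` gives 0.2 and `web_overlap` gives 0.5. Unlike the other three measures, the PMI value is not confined to the range 0 to 1.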
Fig. 2: Outline of proposed method

3.1 Page Count Based Co-Occurrence Measures

When two words A and B are given as input, the search engine returns page counts for them. Four well-known word co-occurrence measures, namely Jaccard, Overlap (Simpson), Dice, and pointwise mutual information (PMI), are used in the proposed design to find the similarity between the words.

3.2 Lexical Pattern Extraction

To overcome the drawbacks of using text snippets directly, we propose a lexical pattern extraction algorithm based on text snippets. The algorithm is meant for finding the semantic relations that exist between the given words. This technique has been used in various natural language processing tasks such as extracting hypernyms [1], [7], question answering [10], extracting meronyms [14], and paraphrase extraction. Lexical patterns are the patterns that satisfy the following criteria.

1. A subsequence must contain exactly one occurrence of each of A and B.
2. The maximum length of a subsequence is L words.
3. One or more words can be skipped in a subsequence. However, the number of words skipped consecutively should be less than g.
4. Only negation contractions in a context are expanded.

3.3 Lexical Pattern Clustering

The extracted lexical patterns are clustered based on their similarity to a given cluster. Each cluster contains patterns that express similar semantic relations. Algorithm 1 returns such clusters. The clusters are sorted in ascending order, which means that the most useful clusters are at the top.

Algorithm 1: Steps for pattern clustering

4. TRAINING WITH SVM

A two-class SVM is trained with both synonymous and non-synonymous word pairs generated from WordNet. The word pairs are extracted for 3000 words. The total number of words in the training data is [number missing in source]. Lexical patterns are then extracted subject to a specified threshold. The lexical patterns thus extracted are clustered and given to the SVM.
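The four criteria listed in Section 3.2 can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' algorithm: it enumerates subsequences by brute force, uses whitespace tokenization, and marks the two query words with the placeholders `X` and `Y`. The function names, the placeholder convention, and the default values of `max_len` (L) and `max_gap` (g) are all hypothetical.

```python
from itertools import combinations

# Irregular negation contractions; regular "...n't" forms are handled by suffix.
IRREGULAR = {"can't": ["can", "not"], "won't": ["will", "not"]}

def expand_negations(tokens):
    """Criterion 4: expand negation contractions, e.g. "didn't" -> "did not"."""
    expanded = []
    for tok in tokens:
        if tok in IRREGULAR:
            expanded.extend(IRREGULAR[tok])
        elif tok.endswith("n't") and len(tok) > 3:
            expanded.extend([tok[:-3], "not"])
        else:
            expanded.append(tok)
    return expanded

def extract_patterns(snippet, word_a, word_b, max_len=5, max_gap=2):
    """Return the set of lexical patterns found in one snippet."""
    raw = [t.strip(".,;:!?()\"") for t in snippet.lower().split()]
    tokens = expand_negations([t for t in raw if t])
    # Replace the two query words with placeholders so patterns generalize.
    a, b = word_a.lower(), word_b.lower()
    tokens = ["X" if t == a else "Y" if t == b else t for t in tokens]

    patterns = set()
    for size in range(2, max_len + 1):          # criterion 2: at most L words
        for idx in combinations(range(len(tokens)), size):
            chosen = [tokens[i] for i in idx]
            # Criterion 1: exactly one occurrence of each of A and B.
            if chosen.count("X") != 1 or chosen.count("Y") != 1:
                continue
            # Criterion 3: fewer than g words skipped consecutively.
            if any(j - i - 1 >= max_gap for i, j in zip(idx, idx[1:])):
                continue
            patterns.add(" ".join(chosen))
    return patterns
```

Calling `extract_patterns("A gem is a jewel.", "gem", "jewel", max_len=4)` yields patterns such as `X is a Y` and `X is Y`, while `X Y` is rejected because it would skip two words in a row. The brute-force enumeration is only practical for short snippets; a real system would use a more efficient subsequence miner.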
The SVM acts upon the results of both the word co-occurrence measures and the pattern clusters in order to calculate the semantic similarity between two given words.

5. EXPERIMENTAL RESULTS

The experimental results include the semantic similarities between pairs of given words, computed using the SVM together with the page counts and text snippets retrieved from search engines for those words.
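As a rough illustration of how the two kinds of evidence could be combined into a single input for the SVM, the sketch below builds one feature vector per word pair. The paper does not describe the feature layout, so everything here is an assumption: the four co-occurrence scores come first, followed by one feature per pattern cluster equal to the share of the pair's pattern occurrences that fall in that cluster. The function and argument names are hypothetical.

```python
def build_feature_vector(cooccurrence_scores, pattern_counts, clusters):
    """Assemble an SVM input vector for one word pair.

    cooccurrence_scores: dict with keys "jaccard", "overlap", "dice", "pmi".
    pattern_counts:      dict mapping a lexical pattern to how often it was
                         seen in the snippets for this word pair.
    clusters:            list of sets of lexical patterns.
    """
    # Page-count features, in a fixed order.
    features = [cooccurrence_scores[name]
                for name in ("jaccard", "overlap", "dice", "pmi")]

    # Snippet features: fraction of pattern occurrences per cluster.
    total = sum(pattern_counts.values())
    for cluster in clusters:
        hits = sum(pattern_counts.get(pattern, 0) for pattern in cluster)
        features.append(hits / total if total else 0.0)
    return features
```

A vector of this shape, built for each synonymous and non-synonymous training pair, could then be passed to any two-class SVM implementation.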
Fig. 3: Home page

We have to enter a keyword in the given text box to search in the search engine. For example, google and opera are the words searched for in Fig. 4 and Fig. 6.

Fig. 4: Entering the word google to search

Fig. 6: Entering the word opera to search

When we click on the search button, it displays the page counts and snippets as the result. For example, the page counts and snippets for google and opera are shown in Fig. 5 and Fig. 7.

Fig. 5: Page counts and snippets retrieved for the word google

Fig. 7: Page counts and snippets retrieved for the word opera

We have to enter two words to measure the semantic similarity between them. The measurement ranges from 0 to 1. For the given words google and opera the semantic similarity is 0.8.

Fig. 8: Semantic similarity between google and opera as 0.8

The semantic similarity can be measured in this way for various word pairs. The result is close to 1 when the words are semantically close, and close to 0 when they are not. The output is shown in the form of graphs and tables as follows.
Table 1: Semantic similarities for various word pairs

Graph 1: Semantic similarities for various word pairs

6. CONCLUSION

We used the results of a web search engine for two words and proposed a semantic similarity measure based on the page counts and text snippets returned by a web search engine like Google. The aim of this paper is to measure the semantic similarity between any two given words with the utmost accuracy. To achieve this, techniques such as pattern extraction and pattern clustering are introduced. These algorithms help in finding the various relationships between words. The SVM was trained with the relationships identified between the given words. The experiments were carried out with synonymous and non-synonymous word pairs collected from WordNet synsets. The experimental results have shown that the proposed method performs far better than the existing approaches employed to measure the semantic similarity between words.

7. REFERENCES

[1] C. Buckley, G. Salton, J. Allan, and A. Singhal. Automatic query expansion using SMART: TREC 3. In Proc. of 3rd Text REtrieval Conference, pages 69-80.
[2] D. Bollegala, Y. Matsuo, and M. Ishizuka. Disambiguating personal names on the web using automatically extracted key phrases. In Proc. of the 17th European Conference on Artificial Intelligence, pages 553-557.
[3] D. R. Cutting, J. O. Pedersen, D. Karger, and J. W. Tukey. Scatter/gather: A cluster-based approach to browsing large document collections. In Proceedings SIGIR '92, pages 318-329.
[4] D. Lin. An information-theoretic definition of similarity. In Proc. of the 15th ICML, pages 296-304, 1998.
[5] D. Lin. Automatic retrieval and clustering of similar words. In Proc. of the 17th COLING, pages 768-774, 1998.
WWW 2007 / Track: Semantic Web, Session: Similarity and Extraction

Table 7: Entity Disambiguation Results

                          Jaguar                          Java
Method          Precision  Recall   F         Precision  Recall   F
WebJaccard      0.5613     0.541    0.5288    0.5738     0.5564   0.5243
WebOverlap      0.6463     0.6314   0.6201    0.6228     0.5895   0.56
WebDice         0.5613     0.541    0.5288    0.5738     0.5564   0.5243
WebPMI          0.5607     0.478    0.5026    0.7747     0.595    0.6468
Sahami [36]     0.6061     0.6337   0.6019    0.751      0.4793   0.5761
CODC [6]        0.5312     0.6159   0.5452    0.7744     0.5895   0.6358
Proposed        0.6892     0.7144   0.672     0.8198     0.6446   0.691

[6] F. Keller and M. Lapata. Using the web to obtain frequencies for unseen bigrams. Computational Linguistics, 29(3):459-484.
[7] G. Miller and W. Charles. Contextual correlates of semantic similarity. Language and Cognitive Processes, 6(1):1-28.
[8] H. Han, H. Zha, and C. L. Giles. Name disambiguation in author citations using a k-way spectral clustering method. In Proceedings of the International Conference on Digital Libraries.
[9] J. Curran. Ensemble methods for automatic thesaurus extraction. In Proc. of EMNLP.
[10] J. Mori, Y. Matsuo, and M. Ishizuka. Extracting keyphrases to represent relations in social networks from web. In Proc. of 20th IJCAI.
[11] M. Fleischman and E. Hovy. Multi-document person name resolution. In Proceedings of 42nd Annual Meeting of the Association for Computational Linguistics (ACL), Reference Resolution Workshop.
[12] M. Hearst. Automatic acquisition of hyponyms from large text corpora. In Proc. of 14th COLING, pages 539-545.
[13] M. Lapata and F. Keller. Web-based models for natural language processing. ACM Transactions on Speech and Language Processing, 2(1):1-31.
[14] M. Mitra, A. Singhal, and C. Buckley. Improving automatic query expansion. In Proc. of 21st Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, pages 206-214.
[15] P. Cimiano, S. Handschuh, and S. Staab. Towards the self-annotating web. In Proc. of 13th WWW.
[16] R. Bekkerman and A. McCallum. Disambiguating web appearances of people in a social network. In Proceedings of the World Wide Web Conference (WWW), pages 463-470.
[17] Z. Bar-Yossef and M. Gurevich. Random sampling from a search engine's index. In Proceedings of 15th International World Wide Web Conference.

ABOUT THE AUTHORS

S.P.Anandaraj received the B.E (CSE) degree from Madras University, Chennai in the year 2004, and the M.Tech (CSE) with Gold Medal from Dr. MGR Educational and Research Institute University in the year 2007 (Distinction with Honors). He is now pursuing a Ph.D at St. Peter's University, Chennai. He has 8 years of teaching experience. His areas of interest are information security and sensor networks. He has published papers in international journals, international conferences and national conferences and attended nearly 15 national workshops, FDPs and seminars. He is a member of ISTE, CSI, IEEE, IACSIT and IAENG.

Manasa.Ch received the M.C.A degree from Kamala Institute of Technology and Science, Huzurabad, Karimnagar, A.P, India. She is currently doing an M.Tech in Computer Science and Engineering at SR Engineering College, Warangal, India. Her research interests include knowledge and data engineering. She has participated in the ISTE approved national conference on Mobile Communications and Data Engineering at VITS, Karimnagar, A.P, and in the Women Student Congress at NIT, Warangal, organized by the IEEE WIE student branch.

V.Ramana received the B.Tech (CSE) degree from JNTU, Hyderabad in the year 2006 and the M.Tech (AI) from the University of Hyderabad in the year 2010. He has 2 years of teaching experience. His areas of interest are artificial intelligence and machine learning. He has published papers in international journals, international conferences and national conferences and attended national workshops, FDPs and seminars. He is a member of CSI.
Domain-specific Concept-based Information Retrieval System L. Shen 1, Y. K. Lim 1, H. T. Loh 2 1 Design Technology Institute Ltd, National University of Singapore, Singapore 2 Department of Mechanical
More informationImprovement of Web Search Results using Genetic Algorithm on Word Sense Disambiguation
Volume 3, No.5, May 24 International Journal of Advances in Computer Science and Technology Pooja Bassin et al., International Journal of Advances in Computer Science and Technology, 3(5), May 24, 33-336
More informationIs Brad Pitt Related to Backstreet Boys? Exploring Related Entities
Is Brad Pitt Related to Backstreet Boys? Exploring Related Entities Nitish Aggarwal, Kartik Asooja, Paul Buitelaar, and Gabriela Vulcu Unit for Natural Language Processing Insight-centre, National University
More informationAdvances in Natural and Applied Sciences. Information Retrieval Using Collaborative Filtering and Item Based Recommendation
AENSI Journals Advances in Natural and Applied Sciences ISSN:1995-0772 EISSN: 1998-1090 Journal home page: www.aensiweb.com/anas Information Retrieval Using Collaborative Filtering and Item Based Recommendation
More informationUnderstanding the Query: THCIB and THUIS at NTCIR-10 Intent Task. Junjun Wang 2013/4/22
Understanding the Query: THCIB and THUIS at NTCIR-10 Intent Task Junjun Wang 2013/4/22 Outline Introduction Related Word System Overview Subtopic Candidate Mining Subtopic Ranking Results and Discussion
More informationA Combined Method of Text Summarization via Sentence Extraction
Proceedings of the 2007 WSEAS International Conference on Computer Engineering and Applications, Gold Coast, Australia, January 17-19, 2007 434 A Combined Method of Text Summarization via Sentence Extraction
More informationTest Model for Text Categorization and Text Summarization
Test Model for Text Categorization and Text Summarization Khushboo Thakkar Computer Science and Engineering G. H. Raisoni College of Engineering Nagpur, India Urmila Shrawankar Computer Science and Engineering
More informationISSN (Online) ISSN (Print)
Accurate Alignment of Search Result Records from Web Data Base 1Soumya Snigdha Mohapatra, 2 M.Kalyan Ram 1,2 Dept. of CSE, Aditya Engineering College, Surampalem, East Godavari, AP, India Abstract: Most
More informationA Novel Approach for Inferring and Analyzing User Search Goals
A Novel Approach for Inferring and Analyzing User Search Goals Y. Sai Krishna 1, N. Swapna Goud 2 1 MTech Student, Department of CSE, Anurag Group of Institutions, India 2 Associate Professor, Department
More informationSemantic-Based Information Retrieval for Java Learning Management System
AENSI Journals Australian Journal of Basic and Applied Sciences Journal home page: www.ajbasweb.com Semantic-Based Information Retrieval for Java Learning Management System Nurul Shahida Tukiman and Amirah
More informationR 2 D 2 at NTCIR-4 Web Retrieval Task
R 2 D 2 at NTCIR-4 Web Retrieval Task Teruhito Kanazawa KYA group Corporation 5 29 7 Koishikawa, Bunkyo-ku, Tokyo 112 0002, Japan tkana@kyagroup.com Tomonari Masada University of Tokyo 7 3 1 Hongo, Bunkyo-ku,
More informationIntroduction to Text Mining. Hongning Wang
Introduction to Text Mining Hongning Wang CS@UVa Who Am I? Hongning Wang Assistant professor in CS@UVa since August 2014 Research areas Information retrieval Data mining Machine learning CS@UVa CS6501:
More informationAutomatic Discovery of Association Orders between Name and Aliases from the Web using Anchor Texts-based Co-occurrences
Automatic Discovery of Association Orders between Name and Aliases from the Web using Anchor Texts-based Co-occurrences Rama Subbu Lashmi B Department of Computer Science and Engineering Sri Venateswara
More informationA Novel Approach for Restructuring Web Search Results by Feedback Sessions Using Fuzzy clustering
A Novel Approach for Restructuring Web Search Results by Feedback Sessions Using Fuzzy clustering R.Dhivya 1, R.Rajavignesh 2 (M.E CSE), Department of CSE, Arasu Engineering College, kumbakonam 1 Asst.
More informationOntology-Based Web Query Classification for Research Paper Searching
Ontology-Based Web Query Classification for Research Paper Searching MyoMyo ThanNaing University of Technology(Yatanarpon Cyber City) Mandalay,Myanmar Abstract- In web search engines, the retrieval of
More informationMeasuring The Degree Of Similarity Between Web Ontologies Based On Semantic Coherence
Measuring The Degree Of Similarity Between Web Ontologies Based On Semantic Coherence ABHIK BANERJEE, HAREENDRA MUNIMADUGU, SRINIVASA RAGHAVAN VEDANARAYANAN, LAWRENCE J. MAZLACK Applied Computational Intelligence
More informationAn Adaptive Agent for Web Exploration Based on Concept Hierarchies
An Adaptive Agent for Web Exploration Based on Concept Hierarchies Scott Parent, Bamshad Mobasher, Steve Lytinen School of Computer Science, Telecommunication and Information Systems DePaul University
More informationLinking Entities in Chinese Queries to Knowledge Graph
Linking Entities in Chinese Queries to Knowledge Graph Jun Li 1, Jinxian Pan 2, Chen Ye 1, Yong Huang 1, Danlu Wen 1, and Zhichun Wang 1(B) 1 Beijing Normal University, Beijing, China zcwang@bnu.edu.cn
More informationTHE METHOD OF AUTOMATED FORMATION OF THE SEMANTIC DATABASE MODEL OF THE DIALOG SYSTEM
International Journal of Civil Engineering and Technology (IJCIET) Volume 9, Issue 7, July 2018, pp. 1117 1122, Article ID: IJCIET_09_07_117 Available online at http://www.iaeme.com/ijciet/issues.asp?jtype=ijciet&vtype=9&itype=7
More informationWeb Query Translation with Representative Synonyms in Cross Language Information Retrieval
Web Query Translation with Representative Synonyms in Cross Language Information Retrieval August 25, 2005 Bo-Young Kang, Qing Li, Yun Jin, Sung Hyon Myaeng Information Retrieval and Natural Language Processing
More informationA Novel Techinque For Ranking of Documents Using Semantic Similarity
A Novel Techinque For Ranking of Documents Using Semantic Similarity Rajni Kumari Rajpal Department Of Computer Science & Engineering Raipur Institute Of Technology Raipur(C.G.), India Mr.Yogesh Rathore
More informationKeywords Web Usage, Clustering, Pattern Recognition
Volume 3, Issue 7, July 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Clustering Real
More informationA BFS-BASED SIMILAR CONFERENCE RETRIEVAL FRAMEWORK
A BFS-BASED SIMILAR CONFERENCE RETRIEVAL FRAMEWORK Qing Guo 1, 2 1 Nanyang Technological University, Singapore 2 SAP Innovation Center Network,Singapore ABSTRACT Literature review is part of scientific
More informationAutomatic Identification of User Goals in Web Search [WWW 05]
Automatic Identification of User Goals in Web Search [WWW 05] UichinLee @ UCLA ZhenyuLiu @ UCLA JunghooCho @ UCLA Presenter: Emiran Curtmola@ UC San Diego CSE 291 4/29/2008 Need to improve the quality
More informationEffect of log-based Query Term Expansion on Retrieval Effectiveness in Patent Searching
Effect of log-based Query Term Expansion on Retrieval Effectiveness in Patent Searching Wolfgang Tannebaum, Parvaz Madabi and Andreas Rauber Institute of Software Technology and Interactive Systems, Vienna
More informationEvaluating a Conceptual Indexing Method by Utilizing WordNet
Evaluating a Conceptual Indexing Method by Utilizing WordNet Mustapha Baziz, Mohand Boughanem, Nathalie Aussenac-Gilles IRIT/SIG Campus Univ. Toulouse III 118 Route de Narbonne F-31062 Toulouse Cedex 4
More informationPayal Gulati. House No. 1H-36, NIT, Faridabad E xp e r i e nc e
Payal Gulati House No. 1H-36, NIT, gulatipayal@yahoo.co.in Total Experience: 9.5 years E xp e r i e nc e Currently working as Assistant Professor (IT) in YMCA University of Science & Technology, since
More informationSAACO: Semantic Analysis based Ant Colony Optimization Algorithm for Efficient Text Document Clustering
SAACO: Semantic Analysis based Ant Colony Optimization Algorithm for Efficient Text Document Clustering 1 G. Loshma, 2 Nagaratna P Hedge 1 Jawaharlal Nehru Technological University, Hyderabad 2 Vasavi
More informationA hybrid method to categorize HTML documents
Data Mining VI 331 A hybrid method to categorize HTML documents M. Khordad, M. Shamsfard & F. Kazemeyni Electrical & Computer Engineering Department, Shahid Beheshti University, Iran Abstract In this paper
More informationLITERATURE SURVEY ON SEARCH TERM EXTRACTION TECHNIQUE FOR FACET DATA MINING IN CUSTOMER FACING WEBSITE
International Journal of Civil Engineering and Technology (IJCIET) Volume 8, Issue 1, January 2017, pp. 956 960 Article ID: IJCIET_08_01_113 Available online at http://www.iaeme.com/ijciet/issues.asp?jtype=ijciet&vtype=8&itype=1
More informationClassification of Text Documents Using B-Tree
Classification of Text Documents Using B-Tree B.S. Harish *, D.S. Guru, and S. Manjunath Department of Information Science and Engineering, SJCE, Mysore Department of Studies in Computer Science, University
More informationPapers for comprehensive viva-voce
Papers for comprehensive viva-voce Priya Radhakrishnan Advisor : Dr. Vasudeva Varma Search and Information Extraction Lab, International Institute of Information Technology, Gachibowli, Hyderabad, India
More informationJianyong Wang Department of Computer Science and Technology Tsinghua University
Jianyong Wang Department of Computer Science and Technology Tsinghua University jianyong@tsinghua.edu.cn Joint work with Wei Shen (Tsinghua), Ping Luo (HP), and Min Wang (HP) Outline Introduction to entity
More informationExploiting Symmetry in Relational Similarity for Ranking Relational Search Results
Exploiting Symmetry in Relational Similarity for Ranking Relational Search Results Tomokazu Goto, Nguyen Tuan Duc, Danushka Bollegala, and Mitsuru Ishizuka The University of Tokyo, Japan {goto,duc}@mi.ci.i.u-tokyo.ac.jp,
More informationCOMP90042 LECTURE 3 LEXICAL SEMANTICS COPYRIGHT 2018, THE UNIVERSITY OF MELBOURNE
COMP90042 LECTURE 3 LEXICAL SEMANTICS SENTIMENT ANALYSIS REVISITED 2 Bag of words, knn classifier. Training data: This is a good movie.! This is a great movie.! This is a terrible film. " This is a wonderful
More informationIMPROVING THE RELEVANCY OF DOCUMENT SEARCH USING THE MULTI-TERM ADJACENCY KEYWORD-ORDER MODEL
IMPROVING THE RELEVANCY OF DOCUMENT SEARCH USING THE MULTI-TERM ADJACENCY KEYWORD-ORDER MODEL Lim Bee Huang 1, Vimala Balakrishnan 2, Ram Gopal Raj 3 1,2 Department of Information System, 3 Department
More informationA Machine Learning Approach for Displaying Query Results in Search Engines
A Machine Learning Approach for Displaying Query Results in Search Engines Tunga Güngör 1,2 1 Boğaziçi University, Computer Engineering Department, Bebek, 34342 İstanbul, Turkey 2 Visiting Professor at
More informationA Framework for Securing Databases from Intrusion Threats
A Framework for Securing Databases from Intrusion Threats R. Prince Jeyaseelan James Department of Computer Applications, Valliammai Engineering College Affiliated to Anna University, Chennai, India Email:
More informationRavindranagar Colony, Habsiguda, : Department of Computer Science & Engineering, University College of Engineering(A),
Name Designation Father Name : Dr.V.B.Narsimha : Assistant Professor : V.Raghavulu Date of Birth : 20-06-1969 Residence Address : Flat No.101, SR Apartment, Ravindranagar Colony, Habsiguda, Hyderabad,
More informationRanking Assessment of Event Tweets for Credibility
Ranking Assessment of Event Tweets for Credibility Sravan Kumar G Student, Computer Science in CVR College of Engineering, JNTUH, Hyderabad, India Abstract: Online social network services have become a
More information