Scientific Literature Retrieval based on Terminological Paraphrases using Predicate Argument Tuple
Sung-Pil Choi 1, Sa-kwang Song 1, Hanmin Jung 1, Michaela Geierhos 2, Sung Hyon Myaeng 3

1 Korea Institute of Science and Technology Information, Daejeon, Korea
2 Munich University, Munich, Germany
3 Korea Advanced Institute of Science and Technology, Daejeon, Korea
{spchoi, esmallj, jhm}@kisti.re.kr, michaela.geierhos@cis.unimuenchen.de, myaeng@kaist.ac.kr

Abstract. The conceptual condensability of technical terms permits us to use them as effective queries to search scientific databases. However, authors often use alternative expressions, that is, Terminological Paraphrases (TPs), to convey the meanings of specific terms in the literature. In this paper, we propose an effective way to retrieve de facto relevant documents that contain only those TPs and therefore cannot be found by conventional models in an environment with only controlled vocabularies, by adapting the Predicate Argument Tuple (PAT). The experiment confirms that PAT-based document retrieval is an effective and promising method to search for such documents and to improve terminology-based scientific information access models.

1 Introduction

Terminology is defined as a set of linguistic elements, each of which represents, designates, and defines a technical concept in a particular scientific field. InfoTerm [1], an international information center for terminology, specifies two important roles of terminology: to conceptually represent the expertise of a particular domain, and to serve as a tool to access domain-specific information and knowledge. Although much effort has been devoted to inventing effective ways of query formulation and processing, most of the world's major scientific databases adopt simple keyword-based strategies rather than more advanced but complicated approaches (see footnote 1).
One reason is that scientific documents such as articles and patents include many technical terms that are discriminative and therefore highly informative. Accordingly, given that users and contents can share these technical terms, simple term-based methods can still achieve high levels of satisfaction.

Footnote 1: Google Scholar, PubMed, and Microsoft Academic Search.

Springer-Verlag Berlin Heidelberg 2012
The conceptual condensability of technical terms permits us to use them as effective queries to search scientific databases. However, authors often use alternative expressions to convey the meanings of specific terms in the literature. Normal keyword matching models can only find documents that contain the input query terms. In sum, with a single technical term, it is nontrivial to access documents that include only alternative expressions of the term, in other words, terminological paraphrases (TPs).

In this paper, we propose an effective way to retrieve documents that contain alternative expressions denoting the concepts of terminologies in the literature, by adapting the Predicate Argument Tuple (PAT). A PAT consists of multiple arguments and a predicate that represents the semantic relation between them; it therefore expresses both syntactic and semantic interrelations between words in a sentence. We exploit PATs as indices for searching textual segments (TPs) that are similar to an input sentence defining a particular term. To achieve this, we construct a novel document retrieval system based on PATs and investigate the retrieval of de facto relevant documents that contain only those TPs and cannot be found by conventional models in an environment with only controlled vocabularies (namely, single terms).

2 Related Work

To enhance the search functions of PubMed, the largest biomedical literature database in the world, Lu et al. (2009) introduced the Automatic Term Mapping (ATM) method, which automatically maps user queries into MeSH descriptors and enables query expansion (QE) with various types of thesaurus information [2]. There have been many studies applying QE to improve the performance of biomedical information retrieval with controlled vocabularies such as MeSH and UMLS [3-7].
3 PAT-based Scientific Literature Retrieval System

This section explains a new retrieval system that identifies the TPs of input query terms in scientific literature based on the definitions of the terms, and thereby retrieves de facto relevant documents efficiently. We start by introducing the detailed architecture of the proposed system.
Fig. 1. System Architecture and Process of PAT-based Retrieval System

Fig. 1 shows the architecture and procedure of our system. With an input query term, the term definition finder can obtain the definition of the term from various sources. Definitional PATs, which compose a term definition, are extracted from the definition by applying syntactic parsing, PAT extraction, and preprocessing. With a PAT query consisting of definitional PATs, the system searches and ranks relevant documents that have sentences similar to the definition of the input term.

To build the search database, our system extracts all the PATs, rather than words, from the original target texts as indices and constructs an inverted file based on them, as shown in Fig. 2.

Fig. 2. PAT-based Inverted File

Fig. 2 shows a small portion of the PAT-based inverted file. Although conventional information retrieval systems have very complex indexing structures, we construct a simple inverted file structure that contains only sentence identifiers as posting information.
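A PAT-based inverted file of the kind described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the `PAT` tuple layout, the sentence identifier format (`"d1.s1"`), and the helper names `build_inverted_file` and `candidate_sentences` are all hypothetical, and the PATs are written by hand rather than produced by a parser.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class PAT(NamedTuple):
    """Hypothetical PAT layout: a predicate plus its arguments."""
    predicate: str
    arguments: Tuple[str, ...]


def build_inverted_file(sentences: Dict[str, Iterable[PAT]]) -> Dict[PAT, List[str]]:
    """Map each PAT to the sorted sentence identifiers that contain it.

    Postings hold only sentence identifiers, as in the simple structure
    described in the text.
    """
    index = defaultdict(set)
    for sentence_id, pats in sentences.items():
        for pat in pats:
            index[pat].add(sentence_id)
    return {pat: sorted(ids) for pat, ids in index.items()}


def candidate_sentences(index: Dict[PAT, List[str]], query_pats: Iterable[PAT]) -> List[str]:
    """Return identifiers of sentences sharing at least one PAT with the query."""
    hits = set()
    for pat in query_pats:
        hits.update(index.get(pat, ()))
    return sorted(hits)


# Toy data (assumed): PATs are hand-written, not parser output.
sentences = {
    "d1.s1": [PAT("cause", ("virus", "disease")), PAT("chronic", ("disease",))],
    "d1.s2": [PAT("cover", ("structure", "portion"))],
    "d2.s1": [PAT("chronic", ("disease",)), PAT("inflammatory", ("disease",))],
}
index = build_inverted_file(sentences)
query = [PAT("chronic", ("disease",)), PAT("inflammatory", ("disease",))]
print(candidate_sentences(index, query))
```

Keeping only sentence identifiers in the postings keeps the index small; any ranking information would then be computed at query time from the candidate sentences.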
3.1 Predicate Argument Tuple (PAT)

Predicate Argument Structure (PAS) is a graph structure that collectively denotes the syntactic and semantic relations between words in a sentence [8]. Figure 3 shows an example of the PAS generated from the results of the Enju Parser [8].

Fig. 3. Predicate Argument Structure and Predicate Argument Tuples in a Sentence

On the left side of the figure, the gray boxes represent predicates, the white boxes denote arguments, and the arrows express the syntactic relations between them. For example, although the predicate "covering" in the sentence has two arguments, "structure" and "portion", "sperm" carries only a single noun argument, "head". We can extract Predicate-Argument Tuples (PATs) from the PAS of a sentence as in Fig. 4. A PAT is an element of a PAS and can be classified into one of four types: connective, verbal, adjectival, and nominal.

3.2 Ranking by PAT

To compute the similarity between an input PAT query and a document and then rank the search results, we use a simple ranking scheme that measures how many PATs in a PAT query exist in a document.

$$\mathrm{PMR}(Q, S) = \frac{|\{p \mid p \in Q \wedge p \in S\}|}{|\{p \mid p \in S\}|} \qquad (1)$$

where Q is a PAT query, p is a single PAT, and S is the set of PATs in a sentence. Although we use the PMR (PAT Match Ratio) as our main ranking scheme in this fundamental research, many additional schemes could be devised that are more effective in retrieving documents containing TPs.
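The PMR ranking described above can be sketched as follows. The sketch assumes one reading of Eq. (1): the number of PATs shared by the query and a sentence, divided by the number of PATs in the sentence. PATs are represented here as plain tuples of strings, and the function names `pat_match_ratio` and `rank_sentences` are hypothetical. How sentence scores are aggregated into a document ranking is not specified in the text, so the example stops at sentence level.

```python
from typing import Dict, Iterable, List, Tuple

Pat = Tuple[str, ...]  # assumed layout: (predicate, argument_1, argument_2, ...)


def pat_match_ratio(query_pats: Iterable[Pat], sentence_pats: Iterable[Pat]) -> float:
    """Share of a sentence's PATs that also occur in the PAT query."""
    query, sentence = set(query_pats), set(sentence_pats)
    if not sentence:
        return 0.0  # a sentence without PATs cannot match anything
    return len(query & sentence) / len(sentence)


def rank_sentences(query_pats: Iterable[Pat],
                   sentences: Dict[str, Iterable[Pat]]) -> List[Tuple[float, str]]:
    """Score every sentence and return the matching ones, best first."""
    query = set(query_pats)
    scored = [(pat_match_ratio(query, pats), sid) for sid, pats in sentences.items()]
    matching = [(score, sid) for score, sid in scored if score > 0]
    # Sort by descending score, then by identifier for a stable order.
    return sorted(matching, key=lambda item: (-item[0], item[1]))


# Toy data (assumed): hand-written PATs for a definition-like query.
query = [("chronic", "disease"), ("inflammatory", "disease"), ("of", "disease", "skin")]
sentences = {
    "s1": [("chronic", "disease"), ("cause", "virus", "disease")],
    "s2": [("chronic", "disease"), ("inflammatory", "disease")],
    "s3": [("cover", "structure", "portion")],
}
print(rank_sentences(query, sentences))
```

With this normalization, a short sentence whose PATs all occur in the query scores higher than a long sentence that matches the same PATs among many unrelated ones.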
4 Experiments

In this section, we investigate the retrieval of de facto relevant documents in an environment with only controlled vocabularies (namely, single terms), in order to retrieve TPs from scientific literature.

4.1 Experimental Settings

We use a set of abstracts in the biomedical domain selected from the NDSL (National Discovery for Science Leaders) database. Table 1 shows its statistics.

Table 1. Target Database used in the Experiment

Items                         Size
# of documents                615,125
# of sentences                6,061,366
# of PAT indices extracted    20,608,631

As for the experimental queries, the experiment uses 43 terms randomly selected from the MeSH thesaurus that appear frequently in the target database, as shown in Table 2.

Table 2. Sample Queries from 43 Terms

ID   MeSH Term                  Term Definition
D    Bronchitis, Chronic        A subcategory of chronic obstructive pulmonary disease.
D    Monilethrix                Rare autosomal dominant disorder of the hair shaft.
D    Femur Head Necrosis        Aseptic or avascular necrosis of the femoral head.
D    Kidney Failure, Chronic    The end-stage of chronic renal insufficiency.
D    Dermatitis, Seborrheic     A chronic inflammatory disease of the skin with unknown etiology.
D    Nervous System Disease     Diseases of the central and peripheral nervous system.
D    Hyperargininemia           A rare autosomal recessive disorder of the urea cycle.

We compare three retrieval models in this experiment: (1) the Pseudo-Relevance Feedback model (PRF), (2) the relevance model with term definitions (DEF), and (3) PAT-based document retrieval (PAT). For (1) and (2), we used the Indri system, which produces a ranking model based on a combination of language models [9] and an inference network [10]. In addition, its relevance feedback uses Lavrenko's relevance model [11].

Two experts manually judged the relevance of the top 10 documents retrieved by each system for the 43 query terms. We measured the agreement ratio for all judged documents. The results are shown in Table 3.
Table 3. Agreement Ratio in Relevance Judgments

Systems    Kappa Score [12]    Evaluation (footnote 3)
PRF                            Substantial Agreement
PAT                            Almost Perfect Agreement
DEF                            Substantial Agreement
Average                        Substantial Agreement

Footnote 3: Fair (0.2 < κ ≤ 0.4), Moderate (0.4 < κ ≤ 0.6), Substantial (0.6 < κ ≤ 0.8), and Almost perfect (κ > 0.8)

The two raters agreed almost perfectly on the results of the PAT-based search. For the other systems, the scores were not significantly different. We selected and analyzed one of the two judgment results without resolving the conflicts.

4.2 Experimental Results and Discussion

Table 4 shows the comprehensive results of the experiment with the three document retrieval systems.

Table 4. Evaluation Results of the Three Retrieval Models (Top 10)

Items                                           PRF           PAT            DEF
Number of total query terms (S)                               43
# of terms searching more than 1 document       29 (67.4%)    43 (100%)      43 (100%)
# of terms searching more than 10 documents     16 (37.2%)    28 (65.12%)    43 (100%)
Total # of retrieved documents (A)
Total # of relevant documents (B)
# of retrieved documents per term (A/S)
# of relevant documents per term (B/S)
Average precision over terms
Total precision

First, we counted the number of input query terms that retrieved more than one document. Whereas PAT and DEF retrieved documents for all queries, only 29 queries retrieved more than one document with PRF. The numbers of queries retrieving more than 10 documents were 16 with PRF, 28 with PAT, and 43 with DEF. This shows the difficulty of retrieving documents that do not contain the query terms. PAT retrieved the largest number of relevant documents (226) and showed the highest average precision over terms (0.59). Total precision, which refers to the ratio of relevant documents to the total retrieved documents, was also highest for PAT. Although PRF showed low precision, its total precision was relatively competitive (0.57), given that this model used only statistical information to expand the initial query terms.
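The evaluation measures used in this section can be illustrated with a short Python sketch. Several points are assumptions rather than facts from the paper: the agreement score is computed here as plain (unweighted) Cohen's kappa over binary relevance labels, "average precision over terms" is read as the mean of per-term precision with terms that retrieved nothing left out, and all judgment data below is invented toy data. The agreement bands follow the thresholds quoted in footnote 3; the label for values of 0.2 or below is hypothetical because the footnote does not name one.

```python
from typing import Dict, Hashable, Sequence, Tuple


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two raters judging the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must judge the same non-empty item list")
    n = len(rater_a)
    a, b = list(rater_a), list(rater_b)
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of each rater's label proportions, summed over labels.
    expected = sum((a.count(label) / n) * (b.count(label) / n) for label in set(a) | set(b))
    if expected == 1.0:  # both raters used one identical label throughout
        return 1.0
    return (observed - expected) / (1.0 - expected)


def interpret_kappa(kappa: float) -> str:
    """Map a kappa value to the agreement bands quoted in footnote 3."""
    if kappa > 0.8:
        return "Almost perfect"
    if kappa > 0.6:
        return "Substantial"
    if kappa > 0.4:
        return "Moderate"
    if kappa > 0.2:
        return "Fair"
    return "Below fair"  # hypothetical label; the footnote names no band here


def total_precision(counts: Dict[str, Tuple[int, int]]) -> float:
    """Relevant documents divided by retrieved documents, pooled over all terms."""
    retrieved = sum(r for r, _ in counts.values())
    relevant = sum(rel for _, rel in counts.values())
    return relevant / retrieved if retrieved else 0.0


def average_precision_over_terms(counts: Dict[str, Tuple[int, int]]) -> float:
    """Mean of per-term precision; terms that retrieved nothing are skipped (assumption)."""
    per_term = [rel / r for r, rel in counts.values() if r > 0]
    return sum(per_term) / len(per_term) if per_term else 0.0


# Toy judgments (assumed): 1 = relevant, 0 = not relevant.
rater_a = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
rater_b = [1, 1, 0, 0, 0, 1, 0, 1, 1, 1]
kappa = cohen_kappa(rater_a, rater_b)
print(round(kappa, 3), interpret_kappa(kappa))

# Toy counts (assumed): term -> (retrieved, relevant) within the top 10.
counts = {"term_1": (10, 6), "term_2": (4, 3), "term_3": (0, 0)}
print(round(total_precision(counts), 3), round(average_precision_over_terms(counts), 3))
```

The two precision figures differ because total precision pools all retrieved documents, so terms with many results dominate, whereas the per-term average gives every term with results equal weight.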
5 Conclusion and Future Work

In this paper, we confirmed that PAT-based document retrieval is an effective and promising method to search for relevant documents that contain no explicit query terms, as well as to improve terminology-based scientific information access models. Moreover, we found that PAT-based retrieval could find hidden relevant documents that could not be retrieved by the PRF model. Therefore, our proposed model can be used as a supplementary model, combined with other conventional retrieval models, to improve search performance.

The most pressing issue for future studies will be to extend the PAT retrieval model to find more TPs in the literature. It is possible to generate synonymous PATs such as cause(virus, disease), cause(virus, disorder) and develop(host, disease) without much lexical ambiguity owing to the richness of their contextual information.

6 References

1. InfoTerm. Terminology Standardization. 2010; Available from:
2. Lu, Z., W. Kim, and W.J. Wilbur, Evaluation of query expansion using MeSH in PubMed. Inf. Retr., (1): p.
3. Abdou, S., P. Ruck, and J. Savoy, Evaluation of stemming, query expansion and manual indexing approaches for the genomic task. cell. 501: p.
4. Aronson, A.R., The effect of textual variation on concept based information retrieval, in Proceedings a conference of the American Medical Informatics Association. p.
5. Srinivasan, P., Query expansion and MEDLINE. Inf. Process. Manage., (4): p.
6. Choi, S.-P., S.-K. Song, and S.-H. Myaeng, Analysis of Sentential Paraphrase Patterns and Errors through Predicate-Argument Tuple-based Approximate Alignment. KIPS Journal, B(2).
7. Choi, S.-P. and S.-H. Myaeng, Simplicity is better: revisiting single kernel PPI extraction, in COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics.
8. Miyao, Y. and J.i. Tsujii, Feature Forest Models for Probabilistic HPSG Parsing. Computational Linguistics, (1): p.
9. Ponte, J.M. and W.B. Croft, A language modeling approach to information retrieval, in Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval. 1998, ACM: Melbourne, Australia. p.
10. Turtle, H. and W.B. Croft, Evaluation of an inference network-based retrieval model. ACM Trans. Inf. Syst., (3): p.
11. Lavrenko, V. and W.B. Croft, Relevance based language models, in Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval. 2001, ACM: New Orleans, Louisiana, United States. p.
12. Cohen, J., Weighted kappa: Nominal scale agreement with provision for scaled disagreement or partial credit. Psychological Bulletin, (4): p.
Query Reformulation for Clinical Decision Support Search
Query Reformulation for Clinical Decision Support Search Luca Soldaini, Arman Cohan, Andrew Yates, Nazli Goharian, Ophir Frieder Information Retrieval Lab Computer Science Department Georgetown University
More informationSNUMedinfo at TREC CDS track 2014: Medical case-based retrieval task
SNUMedinfo at TREC CDS track 2014: Medical case-based retrieval task Sungbin Choi, Jinwook Choi Medical Informatics Laboratory, Seoul National University, Seoul, Republic of Korea wakeup06@empas.com, jinchoi@snu.ac.kr
More informationDOCUMENT RETRIEVAL USING A PROBABILISTIC KNOWLEDGE MODEL
DOCUMENT RETRIEVAL USING A PROBABILISTIC KNOWLEDGE MODEL Shuguang Wang Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, USA swang@cs.pitt.edu Shyam Visweswaran Department of Biomedical
More informationClassification and retrieval of biomedical literatures: SNUMedinfo at CLEF QA track BioASQ 2014
Classification and retrieval of biomedical literatures: SNUMedinfo at CLEF QA track BioASQ 2014 Sungbin Choi, Jinwook Choi Medical Informatics Laboratory, Seoul National University, Seoul, Republic of
More informationA Model for Information Retrieval Agent System Based on Keywords Distribution
A Model for Information Retrieval Agent System Based on Keywords Distribution Jae-Woo LEE Dept of Computer Science, Kyungbok College, 3, Sinpyeong-ri, Pocheon-si, 487-77, Gyeonggi-do, Korea It2c@koreaackr
More informationAnalyzing Patterns with Timelines on Researcher Data
Analyzing Email Patterns with Timelines on Researcher Data Jangwon Gim 1, Yunji Jang 1, Do-Heon Jeong 1,*, Hanmin Jung 1 1 Korea Institute of Science and Technology Information (KISTI) 245 Daehak-ro, Yuseong-gu,
More informationShrey Patel B.E. Computer Engineering, Gujarat Technological University, Ahmedabad, Gujarat, India
International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2018 IJSRCSEIT Volume 3 Issue 3 ISSN : 2456-3307 Some Issues in Application of NLP to Intelligent
More informationSheffield University and the TREC 2004 Genomics Track: Query Expansion Using Synonymous Terms
Sheffield University and the TREC 2004 Genomics Track: Query Expansion Using Synonymous Terms Yikun Guo, Henk Harkema, Rob Gaizauskas University of Sheffield, UK {guo, harkema, gaizauskas}@dcs.shef.ac.uk
More informationDocument Retrieval using Predication Similarity
Document Retrieval using Predication Similarity Kalpa Gunaratna 1 Kno.e.sis Center, Wright State University, Dayton, OH 45435 USA kalpa@knoesis.org Abstract. Document retrieval has been an important research
More informationA Survey Of Different Text Mining Techniques Varsha C. Pande 1 and Dr. A.S. Khandelwal 2
A Survey Of Different Text Mining Techniques Varsha C. Pande 1 and Dr. A.S. Khandelwal 2 1 Department of Electronics & Comp. Sc, RTMNU, Nagpur, India 2 Department of Computer Science, Hislop College, Nagpur,
More informationFinding Topic-centric Identified Experts based on Full Text Analysis
Finding Topic-centric Identified Experts based on Full Text Analysis Hanmin Jung, Mikyoung Lee, In-Su Kang, Seung-Woo Lee, Won-Kyung Sung Information Service Research Lab., KISTI, Korea jhm@kisti.re.kr
More informationWSU-IR at TREC 2015 Clinical Decision Support Track: Joint Weighting of Explicit and Latent Medical Query Concepts from Diverse Sources
WSU-IR at TREC 2015 Clinical Decision Support Track: Joint Weighting of Explicit and Latent Medical Query Concepts from Diverse Sources Saeid Balaneshin-kordan, Alexander Kotov, and Railan Xisto Department
More informationDocument Structure Analysis in Associative Patent Retrieval
Document Structure Analysis in Associative Patent Retrieval Atsushi Fujii and Tetsuya Ishikawa Graduate School of Library, Information and Media Studies University of Tsukuba 1-2 Kasuga, Tsukuba, 305-8550,
More informationWEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS
1 WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS BRUCE CROFT NSF Center for Intelligent Information Retrieval, Computer Science Department, University of Massachusetts,
More informationEffect of log-based Query Term Expansion on Retrieval Effectiveness in Patent Searching
Effect of log-based Query Term Expansion on Retrieval Effectiveness in Patent Searching Wolfgang Tannebaum, Parvaz Madabi and Andreas Rauber Institute of Software Technology and Interactive Systems, Vienna
More informationQuery Likelihood with Negative Query Generation
Query Likelihood with Negative Query Generation Yuanhua Lv Department of Computer Science University of Illinois at Urbana-Champaign Urbana, IL 61801 ylv2@uiuc.edu ChengXiang Zhai Department of Computer
More informationA Multiple-stage Approach to Re-ranking Clinical Documents
A Multiple-stage Approach to Re-ranking Clinical Documents Heung-Seon Oh and Yuchul Jung Information Service Center Korea Institute of Science and Technology Information {ohs, jyc77}@kisti.re.kr Abstract.
More informationIndexing and Query Processing
Indexing and Query Processing Jaime Arguello INLS 509: Information Retrieval jarguell@email.unc.edu January 28, 2013 Basic Information Retrieval Process doc doc doc doc doc information need document representation
More informationAutomatic Generation of Query Sessions using Text Segmentation
Automatic Generation of Query Sessions using Text Segmentation Debasis Ganguly, Johannes Leveling, and Gareth J.F. Jones CNGL, School of Computing, Dublin City University, Dublin-9, Ireland {dganguly,
More informationCHAPTER 3 INFORMATION RETRIEVAL BASED ON QUERY EXPANSION AND LATENT SEMANTIC INDEXING
43 CHAPTER 3 INFORMATION RETRIEVAL BASED ON QUERY EXPANSION AND LATENT SEMANTIC INDEXING 3.1 INTRODUCTION This chapter emphasizes the Information Retrieval based on Query Expansion (QE) and Latent Semantic
More informationMaking Sense Out of the Web
Making Sense Out of the Web Rada Mihalcea University of North Texas Department of Computer Science rada@cs.unt.edu Abstract. In the past few years, we have witnessed a tremendous growth of the World Wide
More informationThe University of Amsterdam at the CLEF 2008 Domain Specific Track
The University of Amsterdam at the CLEF 2008 Domain Specific Track Parsimonious Relevance and Concept Models Edgar Meij emeij@science.uva.nl ISLA, University of Amsterdam Maarten de Rijke mdr@science.uva.nl
More informationUniversity of Amsterdam at INEX 2010: Ad hoc and Book Tracks
University of Amsterdam at INEX 2010: Ad hoc and Book Tracks Jaap Kamps 1,2 and Marijn Koolen 1 1 Archives and Information Studies, Faculty of Humanities, University of Amsterdam 2 ISLA, Faculty of Science,
More informationA Semantic Multi-Field Clinical Search for Patient Medical Records
BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 18, No 1 Sofia 2018 Print ISSN: 1311-9702; Online ISSN: 1314-4081 DOI: 10.2478/cait-2018-0014 A Semantic Multi-Field Clinical
More informationToward Interlinking Asian Resources Effectively: Chinese to Korean Frequency-Based Machine Translation System
Toward Interlinking Asian Resources Effectively: Chinese to Korean Frequency-Based Machine Translation System Eun Ji Kim and Mun Yong Yi (&) Department of Knowledge Service Engineering, KAIST, Daejeon,
More informationMeSH: A Thesaurus for PubMed
Resources and tools for bibliographic research MeSH: A Thesaurus for PubMed October 24, 2012 What is MeSH? Who uses MeSH? Why use MeSH? Searching by using the MeSH Database What is MeSH? Acronym for Medical
More informationMeSH : A Thesaurus for PubMed
Scuola di dottorato di ricerca in Scienze Molecolari Resources and tools for bibliographic research MeSH : A Thesaurus for PubMed What is MeSH? Who uses MeSH? Why use MeSH? Searching by using the MeSH
More informationAn Investigation of Basic Retrieval Models for the Dynamic Domain Task
An Investigation of Basic Retrieval Models for the Dynamic Domain Task Razieh Rahimi and Grace Hui Yang Department of Computer Science, Georgetown University rr1042@georgetown.edu, huiyang@cs.georgetown.edu
More informationUnsupervised Semantic Parsing
Unsupervised Semantic Parsing Hoifung Poon Dept. Computer Science & Eng. University of Washington (Joint work with Pedro Domingos) 1 Outline Motivation Unsupervised semantic parsing Learning and inference
More informationNatural Language Query Processing for SPARQL generation - a Prototype System for SNOMEDCT
Natural Language Query Processing for SPARQL generation - a Prototype System for SNOMEDCT Jin-Dong Kim Database Center for Life Science Research Organization of Information and Systems Tokyo, Japan jdkim@dbcls.rois.ac.jp
More informationQuality-Based Automatic Classification for Presentation Slides
Quality-Based Automatic Classification for Presentation Slides Seongchan Kim, Wonchul Jung, Keejun Han, Jae-Gil Lee, and Mun Y. Yi Dept. of Knowledge Service Engineering, KAIST, Korea {sckim,wonchul.jung,keejun.han,jaegil,munyi@kaist.ac.kr}
More informationLearning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li
Learning to Match Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li 1. Introduction The main tasks in many applications can be formalized as matching between heterogeneous objects, including search, recommendation,
More informationA BFS-BASED SIMILAR CONFERENCE RETRIEVAL FRAMEWORK
A BFS-BASED SIMILAR CONFERENCE RETRIEVAL FRAMEWORK Qing Guo 1, 2 1 Nanyang Technological University, Singapore 2 SAP Innovation Center Network,Singapore ABSTRACT Literature review is part of scientific
More informationA Practical Passage-based Approach for Chinese Document Retrieval
A Practical Passage-based Approach for Chinese Document Retrieval Szu-Yuan Chi 1, Chung-Li Hsiao 1, Lee-Feng Chien 1,2 1. Department of Information Management, National Taiwan University 2. Institute of
More informationDomain-specific Concept-based Information Retrieval System
Domain-specific Concept-based Information Retrieval System L. Shen 1, Y. K. Lim 1, H. T. Loh 2 1 Design Technology Institute Ltd, National University of Singapore, Singapore 2 Department of Mechanical
More informationHybrid Approach for Query Expansion using Query Log
Volume 7 No.6, July 214 www.ijais.org Hybrid Approach for Query Expansion using Query Log Lynette Lopes M.E Student, TSEC, Mumbai, India Jayant Gadge Associate Professor, TSEC, Mumbai, India ABSTRACT Web
More informationThe NLM Medical Text Indexer System for Indexing Biomedical Literature
The NLM Medical Text Indexer System for Indexing Biomedical Literature James G. Mork 1, Antonio J. Jimeno Yepes 2,1, Alan R. Aronson 1 1 National Library of Medicine, Bethesda, MD, USA {mork,alan}@nlm.nih.gov
More informationInter and Intra-Document Contexts Applied in Polyrepresentation
Inter and Intra-Document Contexts Applied in Polyrepresentation Mette Skov, Birger Larsen and Peter Ingwersen Department of Information Studies, Royal School of Library and Information Science Birketinget
More informationRobust Relevance-Based Language Models
Robust Relevance-Based Language Models Xiaoyan Li Department of Computer Science, Mount Holyoke College 50 College Street, South Hadley, MA 01075, USA Email: xli@mtholyoke.edu ABSTRACT We propose a new
More informationThis is the author s version of a work that was submitted/accepted for publication in the following source:
This is the author s version of a work that was submitted/accepted for publication in the following source: Koopman, Bevan, Bruza, Peter, Sitbon, Laurianne, & Lawley, Michael (2011) AEHRC & QUT at TREC
More informationBalancing Manual and Automatic Indexing for Retrieval of Paper Abstracts
Balancing Manual and Automatic Indexing for Retrieval of Paper Abstracts Kwangcheol Shin 1, Sang-Yong Han 1, and Alexander Gelbukh 1,2 1 Computer Science and Engineering Department, Chung-Ang University,
More informationA Novel Approach of Mining Write-Prints for Authorship Attribution in Forensics
DIGITAL FORENSIC RESEARCH CONFERENCE A Novel Approach of Mining Write-Prints for Authorship Attribution in E-mail Forensics By Farkhund Iqbal, Rachid Hadjidj, Benjamin Fung, Mourad Debbabi Presented At
More informationText mining tools for semantically enriching the scientific literature
Text mining tools for semantically enriching the scientific literature Sophia Ananiadou Director National Centre for Text Mining School of Computer Science University of Manchester Need for enriching the
More informationA Semantic Model for Concept Based Clustering
A Semantic Model for Concept Based Clustering S.Saranya 1, S.Logeswari 2 PG Scholar, Dept. of CSE, Bannari Amman Institute of Technology, Sathyamangalam, Tamilnadu, India 1 Associate Professor, Dept. of
More informationOptimization of the PubMed Automatic Term Mapping
238 Medical Informatics in a United and Healthy Europe K.-P. Adlassnig et al. (Eds.) IOS Press, 2009 2009 European Federation for Medical Informatics. All rights reserved. doi:10.3233/978-1-60750-044-5-238
More informationOptimal Query. Assume that the relevant set of documents C r. 1 N C r d j. d j. Where N is the total number of documents.
Optimal Query Assume that the relevant set of documents C r are known. Then the best query is: q opt 1 C r d j C r d j 1 N C r d j C r d j Where N is the total number of documents. Note that even this
More informationApplying the KISS Principle for the CLEF- IP 2010 Prior Art Candidate Patent Search Task
Applying the KISS Principle for the CLEF- IP 2010 Prior Art Candidate Patent Search Task Walid Magdy, Gareth J.F. Jones Centre for Next Generation Localisation School of Computing Dublin City University,
More informationTowards Semantic Search and Inference in Electronic Medical Records: an approach using Concept- based Information Retrieval
Towards Semantic Search and Inference in Electronic Medical Records: an approach using Concept- based Information Retrieval Bevan Koopman 1,2 Peter Bruza 2 Laurianne Sitbon 2 Michael Lawley 1 1: Australian
More informationUMass at TREC 2017 Common Core Track
UMass at TREC 2017 Common Core Track Qingyao Ai, Hamed Zamani, Stephen Harding, Shahrzad Naseri, James Allan and W. Bruce Croft Center for Intelligent Information Retrieval College of Information and Computer
More informationExploiting Symmetry in Relational Similarity for Ranking Relational Search Results
Exploiting Symmetry in Relational Similarity for Ranking Relational Search Results Tomokazu Goto, Nguyen Tuan Duc, Danushka Bollegala, and Mitsuru Ishizuka The University of Tokyo, Japan {goto,duc}@mi.ci.i.u-tokyo.ac.jp,
More informationRMIT University at TREC 2006: Terabyte Track
RMIT University at TREC 2006: Terabyte Track Steven Garcia Falk Scholer Nicholas Lester Milad Shokouhi School of Computer Science and IT RMIT University, GPO Box 2476V Melbourne 3001, Australia 1 Introduction
More informationQuoogle: A Query Expander for Google
Quoogle: A Query Expander for Google Michael Smit Faculty of Computer Science Dalhousie University 6050 University Avenue Halifax, NS B3H 1W5 smit@cs.dal.ca ABSTRACT The query is the fundamental way through
More informationWeb Information Retrieval using WordNet
Web Information Retrieval using WordNet Jyotsna Gharat Asst. Professor, Xavier Institute of Engineering, Mumbai, India Jayant Gadge Asst. Professor, Thadomal Shahani Engineering College Mumbai, India ABSTRACT
More informationResearch and Design of Key Technology of Vertical Search Engine for Educational Resources
2017 International Conference on Arts and Design, Education and Social Sciences (ADESS 2017) ISBN: 978-1-60595-511-7 Research and Design of Key Technology of Vertical Search Engine for Educational Resources
More informationJoining Collaborative and Content-based Filtering
Joining Collaborative and Content-based Filtering 1 Patrick Baudisch Integrated Publication and Information Systems Institute IPSI German National Research Center for Information Technology GMD 64293 Darmstadt,
More informationMining High Order Decision Rules
Mining High Order Decision Rules Y.Y. Yao Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 e-mail: yyao@cs.uregina.ca Abstract. We introduce the notion of high
More informationCHAPTER 5 SEARCH ENGINE USING SEMANTIC CONCEPTS