Learning to Reweight Terms with Distributed Representations

Size: px
Start display at page:

Download "Learning to Reweight Terms with Distributed Representations"

Transcription

1 Learning to Reweight Terms with Distributed Representations School of Computer Science Carnegie Mellon University August 12, 215

2 Outline Goal: Assign weights to query terms for better retrieval results Motivation Method Evaluation Conclusions

3 Motivation Assigning weights to terms is not a new problem score(q, d) = t q s(t, q, C)f(t, d) (1) Brief summary of existing representative methods: Type Method Performance Features Model Constant bad - - Unsupervised IDF average - - Supervised LtR-methods manual, good non-convex (Rank measure) (WSD) external Supervised Zhao et al average manual requires SVD (Term recall) Can we have something that is directly term weight supervised, can lead to good retrieval performance without manual / expensive feature construction?

4 Probably yes? What might be a good choice of term weights? Term recall The probability of term t of query q appears in relevant documents R q, can be estimated as Rq,t R q if relevance judgments are available True term recall proves to be highly effective as shown by Zhao et al[3, 2] But which features are good to predict that?

5 Distributed word vectors Distributed word vectors learnt from a large corpora can be used to effectively measure semantic similarity vec(king) vec(man) vec(queen) vec(woman) Another important characteristic of word vector is the addition property, i.e., it s often semantically meaningful to construct word vectors for phrases from those of individual terms, such as representing vec(chinese river) vec(chinese) + vec(river)

6 Idea in this work Intuitively, term recall measures the relative importance of a term w.r.t the whole query in determining document-query relevance With the addition property of word vectors, we can construct feature vectors for terms from their word vector representations, which resembles the above intuition of term recall For a term t in query q, we construct the feature vector x(t) given the word vector representation w(t) as x(t) vec(t) vec(q) where vec(q) is the average word vector of all the terms in q. The above defined feature vector is essentially the offset vector of a term to the center of the entire query

7 Feature vector illustration Figure: Demonstration of transforming global 2-dimensional word vectors into feature vectors local to the query.

8 Term recall learning with feature vectors Model term recall as a function of the feature vectors defined Preprocessing: transform term recall r from [,1] to y R by a logit function y = logit(r) Models the transformed term recall y as a linear function of the feature vectors x Employ l 1 norm regularization in learning (l 2 norm also tried) LASSO optimization: 1 ˆβ = arg min β R p 2 M n i (y ij β x ij ) 2 + λ β 1 (2) i=1 j=1 Predict term recall for testing query term with feature vector x as ( ) P(t R) = sigmoid ˆβ x = exp{ ˆβ x } 1 + exp{ ˆβ x } (3)

9 Incorporating predicted weights to query models (I) Base query models: Bag-of-Words (BOW): apple pie recipe Sequential Dependency model (SD): #weight(.8 #combine( apple pie recipe ).1 #combine( #1(apple pie) #1(pie recipe) ).1 #combine( #uw8(apple pie) #uw8(pie recipe) ) )

10 Incorporating predicted weights to query models (II) Term recall enhanced query models: DeepTR-BOW: #weight( P(apple R) apple P(pie R) pie P(recipe R) recipe ) DeepTR-SD: #weight(.8 #weight( P(apple R) apple P(pie R) pie P(recipe R) recipe ).1 #combine(#1(apple pie) #1(pie recipe) ).1 #combine( #uw8(apple pie) #uw8(pie recipe) ) )

11 Evaluation Data sets Collection # Docs # Words TREC Topics ROBUST4 528K 253M , 61-7 WT1g 1,692K 1,76M GOV2 25,25K 24,7M ClueWeb9B 29,38K 23,89M 1-2 Baseline query models: Bag of Words (BOW), Sequential Dependency model (SD), Weighted Sequential Dependency model (WSD) [1] Retrieval models: Language model and BM25 Evaluation metrics: P@1, MAP for all collections, NDCG@1 and ERR@2 for collections with graded relevance judgments

12 Evaluation Word vector source: trained from all above collections with dimensions ranging from 1, 3, 5 pre-trained by Google trained from Google News (1 billion words) with dimension 3

13 Experimental results Retrieval results of DeepTR-BOW with language model (dimension=3) Query Model ROBUST4 WT1g GOV2 ClueWeb9B MAP MAP MAP MAP BOW SD WSD DeepTR-BOW.2591 b b.718 (Corpus) (+3.2/-1.9) (+8.2/+3.5) (+6.3/-1.6) (+2.2/-3.6) DeepTR-BOW.265 b.2111 b.2646 b.741 (GOV2) (+5.5/+.3) (+8.7/+3.9) (+6.3/-1.6) (+5.6/-.5) DeepTR-BOW.2657 b b.718 (ClueWeb9B) (+5.8/+.5) (+9.6/+4.8) (+7.9/-.1) (+2.2/-3.6) DeepTR-BOW.2673 b.2122 b.263 b.732 (Google) (+6.4/+1.2) (+9.3/+4.5) (+5.7/-2.2) (+4.2/-1.8) b : Statistically significant difference with BOW s : Statistically significant difference with SD Weighted BoW queries significantly outperforms unweigted BoW queries and are even comparable to SD queries.

14 Experimental results (cont.) Retrieval results of DeepTR-SD with language model (dimension=3) Query Model ROBUST4 WT1g GOV2 ClueWeb9B MAP MAP MAP MAP BOW SD WSD DeepTR-SD.2754 b s.2182 b s.2831 b s.748 (Corpus) (+9.6/+4.2) (+12.3/+7.4) (+13.8/+5.3) (+6.5/+.5) DeepTR-SD.2781 b s.2223 b s.2831 b s.86 b s (GOV2) (+1.7/+5.2) (+14.4/+9.4) (+13.8/+5.3) (+14.8/+8.2) DeepTR-SD.281 b s.2279 b s.289 b s.748 (ClueWeb9B) (+11.9/+6.3) (+17.3/+12.1) (+16.2/+7.5) (+6.5/+.5) DeepTR-SD.2842 b s.2256 b s.2814 b s.78 b s (Google) (+13.1/+7.5) (+16.1/+11.) (+13.1/+4.7) (+11./+4.7) b : Statistically significant difference with BOW s : Statistically significant difference with SD DeepTR-SD significantly outperforms both BoW and SD queries.

15 Experimental results (cont.) Retrieval results of DeepTR-BOW with BM25 (dimension=3) Query Model ROBUST4 WT1g GOV2 ClueWeb9B MAP MAP MAP MAP BOW SD DeepTR-BOW.2566 b.193 b.243 b.593 b (Corpus) (+4.3/+.8) (+6.7/+4.7) (+4.1/+.9) (+4.7/-1.2) DeepTR-BOW.2561 b.191 b.243 b.596 b (GOV2) (+4.1/+.6) (+7.1/+5.1) (+4.1/+.9) (+5.3/-.7) DeepTR-BOW.259 b.1932 b.2462 b.593 b (ClueWeb9B) (+5.3/+1.8) (+8.3/+6.3) (+5.5/+2.2) (+4.7/-1.2) DeepTR-BOW.26 b.1922 b.2449 b.584 (Google) (+5.7/+2.1) (+7.8/+5.7) (+4.9/+1.7) (+3.1/-2.7) b : Statistically significant difference with BOW s : Statistically significant difference with SD Similar results observed for BM25 retrieval.

16 Experimental results (cont.) Retrieval results of DeepTR-SD with BM25 (dimension=3) Query Model ROBUST4 WT1g GOV2 ClueWeb9B MAP MAP MAP MAP BOW SD DeepTR-SD.2595 b s.1944 b s.2478 b s.613 b s (Corpus) (+5.5/+1.9) (+9./+7.) (+6.1/+2.9) (+8.2/+2.1) DeepTR-SD.269 b s.1941 s.2478 b s.616 b s (GOV2) (+6.1/+2.5) (+8.8/+6.8) (+6.1/+2.9) (+8.8/+2.6) DeepTR-SD.263 b s.1951 b s.252 b s.613 b s (ClueWeb9B) (+6.9/+3.3) (+9.4/+7.3) (+7.2/+3.9) (+8.2/+2.1) DeepTR-SD.2627 b s.1938 s.2461 b.64 (Google) (+6.8/+3.2) (+8.7/+6.7) (+5.4/+2.2) (+6.8/+.7) b : Statistically significant difference with BOW s : Statistically significant difference with SD Similar results observed for BM25 retrieval.

17 Word vector dimensionality vectors trained from ClueWeb9B, Language Model retrieval Query Model ROBUST4 WT1g GOV2 ClueWeb9B MAP MAP MAP MAP BOW SD WSD DeepTR-BOW.2681 b.2239 b.2667 b.727 (1) (+6.7/+1.4) (+15.3/+1.2) (+7.2/-.8) (+3.6/-2.4) DeepTR-BOW.2657 b b.718 (3) (+5.8/+.5) (+9.6/+4.8) (+7.9/-.1) (+2.2/-3.6) DeepTR-BOW.2679 b.2179 b.2655 b.726 (5) (+6.6/+1.4) (+12.1/+7.2) (+6.7/-1.2) (+3.4/-2.5) DeepTR-SD.2851 b s.2373 b s.2848 b s.778 b s (1) (+13.5/+7.9) (+22.1/+16.8) (+14.4/+5.9) (+1.8/+4.5) DeepTR-SD.281 b s.2279 b s.289 b s.748 (3) (+11.9/+6.3) (+17.3/+12.1) (+16.2/+7.5) (+6.5/+.5) DeepTR-SD.2817 b s.236 b s.286 b s.781 b s (5) (+12.2/+6.6) (+18.7/+13.5) (+14.9/+6.4) (+11.3/+4.9) b : Statistically significant difference with BOW s : Statistically significant difference with SD d = 1 works slightly better

18 Experimental results (cont.) ROBUST BOW SD DeepTR SD (ROBUST4 3) DeepTR SD (WT1g 3) DeepTR SD (GOV2 3) DeepTR SD (ClueWeb9B 1) DeepTR SD (ClueWeb9B 3) DeepTR SD (ClueWeb9B 5) DeepTR SD (Google 3).25 MAP LM BM25

19 WT1g BOW SD DeepTR SD (ROBUST4 3) DeepTR SD (WT1g 3) DeepTR SD (GOV2 3) DeepTR SD (ClueWeb9B 1) DeepTR SD (ClueWeb9B 3) DeepTR SD (ClueWeb9B 5) DeepTR SD (Google 3) MAP LM BM25

20 GOV2 BOW SD DeepTR SD (ROBUST4 3) DeepTR SD (WT1g 3) DeepTR SD (GOV2 3) DeepTR SD (ClueWeb9B 1) DeepTR SD (ClueWeb9B 3) DeepTR SD (ClueWeb9B 5) DeepTR SD (Google 3) MAP LM BM25

21 ClueWeb9B.1 BOW SD DeepTR SD (ROBUST4 3) DeepTR SD (WT1g 3) DeepTR SD (GOV2 3) DeepTR SD (ClueWeb9B 1) DeepTR SD (ClueWeb9B 3) DeepTR SD (ClueWeb9B 5) DeepTR SD (Google 3) MAP.5 LM BM25

22 Robustness Number of queries ROBUST4 SD DeepTR SD (trec7) DeepTR SD (wt1g) DeepTR SD (gov2) DeepTR SD (clueweb1) DeepTR SD (clueweb3) DeepTR SD (clueweb5) DeepTR SD (google3) 2 1 < 1% [ 1%, 8%) [ 8%, 6%) [ 6%, 4%) [ 4%, 2%) [ 2%, ) (, 2%) [2%, 4%) [4%, 6%) [6%, 8%) [8%, 1%] >1% Relative Learning gain to Reweight Terms with Distributed Representations

23 Number of queries WT1g SD DeepTR SD (trec7) DeepTR SD (wt1g) DeepTR SD (gov2) DeepTR SD (clueweb1) DeepTR SD (clueweb3) DeepTR SD (clueweb5) DeepTR SD (google3) 1 5 < 1% [ 1%, 8%) [ 8%, 6%) [ 6%, 4%) [ 4%, 2%) [ 2%, ) Relative gain (, 2%) [2%, 4%) [4%, 6%) [6%, 8%) [8%, 1%] >1%

24 Number of queries GOV2 SD DeepTR SD (trec7) DeepTR SD (wt1g) DeepTR SD (gov2) DeepTR SD (clueweb1) DeepTR SD (clueweb3) DeepTR SD (clueweb5) DeepTR SD (google3) 1 < 1% [ 1%, 8%) [ 8%, 6%) [ 6%, 4%) [ 4%, 2%) [ 2%, ) Relative gain (, 2%) [2%, 4%) [4%, 6%) [6%, 8%) [8%, 1%] >1%

25 Number of queries ClueWeb9B SD DeepTR SD (trec7) DeepTR SD (wt1g) DeepTR SD (gov2) DeepTR SD (clueweb1) DeepTR SD (clueweb3) DeepTR SD (clueweb5) DeepTR SD (google3) 1 5 < 1% [ 1%, 8%) [ 8%, 6%) [ 6%, 4%) [ 4%, 2%) [ 2%, ) Relative gain (, 2%) [2%, 4%) [4%, 6%) [6%, 8%) [8%, 1%] >1%

26 Query length Relative improvement over LM Relative improvement over LM ROBUST4 SD DeepTR SD Len 3 (7%) Len 4 5 (27%) Len 6+ (66%) GOV2 SD DeepTR SD Relative improvement over LM Relative improvement over LM WT1g SD DeepTR SD Len 3 (26%) Len 4 5 (39%) Len 6+ (35%) ClueWeb9B SD DeepTR SD Len 3 (11%) Len 4 5 (41%) Len 6+ (48%) Len 3 (26%) Len 4 5 (4%) Len 6+ (34%)

27 Graded relevance Query Model Language Model Retrieval GOV2 ClueWeb9B BOW SD DeepTR-SD.4121 b.1626 b.1743 b.1312 s (GOV2) DeepTR-SD.4146 b.1614 b (ClueWeb9B) DeepTR-SD.4112 b (Google) Query Model BM25 Retrieval GOV2 ClueWeb9B NDCG@2 ERR@2 NDCG@2 ERR@2 BOW SD DeepTR-SD.48 b s.159 b s (GOV2) DeepTR-SD.433 b s.1587 b s (ClueWeb9B) DeepTR-SD.41 b s.1586 s (Google) b : Statistically significant difference with BOW s : Statistically significant difference with SD

28 Conclusions and future work Novel (simple) framework for learning term weightes from features directly constructed from distributed word vectors Extensive experiments with two retrieval models on four test collections demonstrates the effectiveness Completing the table: Type Method Performance Features Model Unsupervised Constant bad - - IDF average - - Supervised LtR-methods manual, good (Rank measure) (WSD) external non-convex Supervised (Term recall) Zhao et al average manual requires SVD Supervised Proposed work good automatic convex, cheap (Term recall) Future directions include predicting term recalls for bigrams and proximity terms, other possible ways to construct feature vectors from distributed vectors, using word vectors for query expansion, theoretical justifications, etc.

29 Acknowledgements and QA Reproducibility: codes, preprocessed queries, experimental setups are available Contact: Thanks to SIGIR for the student travel grant Questions?

A Deep Relevance Matching Model for Ad-hoc Retrieval

A Deep Relevance Matching Model for Ad-hoc Retrieval A Deep Relevance Matching Model for Ad-hoc Retrieval Jiafeng Guo 1, Yixing Fan 1, Qingyao Ai 2, W. Bruce Croft 2 1 CAS Key Lab of Web Data Science and Technology, Institute of Computing Technology, Chinese

More information

Estimating Embedding Vectors for Queries

Estimating Embedding Vectors for Queries Estimating Embedding Vectors for Queries Hamed Zamani Center for Intelligent Information Retrieval College of Information and Computer Sciences University of Massachusetts Amherst Amherst, MA 01003 zamani@cs.umass.edu

More information

Entity and Knowledge Base-oriented Information Retrieval

Entity and Knowledge Base-oriented Information Retrieval Entity and Knowledge Base-oriented Information Retrieval Presenter: Liuqing Li liuqing@vt.edu Digital Library Research Laboratory Virginia Polytechnic Institute and State University Blacksburg, VA 24061

More information

Modern Retrieval Evaluations. Hongning Wang

Modern Retrieval Evaluations. Hongning Wang Modern Retrieval Evaluations Hongning Wang CS@UVa What we have known about IR evaluations Three key elements for IR evaluation A document collection A test suite of information needs A set of relevance

More information

Informativeness for Adhoc IR Evaluation:

Informativeness for Adhoc IR Evaluation: Informativeness for Adhoc IR Evaluation: A measure that prevents assessing individual documents Romain Deveaud 1, Véronique Moriceau 2, Josiane Mothe 3, and Eric SanJuan 1 1 LIA, Univ. Avignon, France,

More information

Chapter 6: Information Retrieval and Web Search. An introduction

Chapter 6: Information Retrieval and Web Search. An introduction Chapter 6: Information Retrieval and Web Search An introduction Introduction n Text mining refers to data mining using text documents as data. n Most text mining tasks use Information Retrieval (IR) methods

More information

Information Retrieval. Information Retrieval and Web Search

Information Retrieval. Information Retrieval and Web Search Information Retrieval and Web Search Introduction to IR models and methods Information Retrieval The indexing and retrieval of textual documents. Searching for pages on the World Wide Web is the most recent

More information

Northeastern University in TREC 2009 Million Query Track

Northeastern University in TREC 2009 Million Query Track Northeastern University in TREC 2009 Million Query Track Evangelos Kanoulas, Keshi Dai, Virgil Pavlu, Stefan Savev, Javed Aslam Information Studies Department, University of Sheffield, Sheffield, UK College

More information

Chapter 8. Evaluating Search Engine

Chapter 8. Evaluating Search Engine Chapter 8 Evaluating Search Engine Evaluation Evaluation is key to building effective and efficient search engines Measurement usually carried out in controlled laboratory experiments Online testing can

More information

From Neural Re-Ranking to Neural Ranking:

From Neural Re-Ranking to Neural Ranking: From Neural Re-Ranking to Neural Ranking: Learning a Sparse Representation for Inverted Indexing Hamed Zamani (1), Mostafa Dehghani (2), W. Bruce Croft (1), Erik Learned-Miller (1), and Jaap Kamps (2)

More information

Information Retrieval and Web Search

Information Retrieval and Web Search Information Retrieval and Web Search Relevance Feedback. Query Expansion Instructor: Rada Mihalcea Intelligent Information Retrieval 1. Relevance feedback - Direct feedback - Pseudo feedback 2. Query expansion

More information

Information Retrieval and Web Search

Information Retrieval and Web Search Information Retrieval and Web Search Introduction to IR models and methods Rada Mihalcea (Some of the slides in this slide set come from IR courses taught at UT Austin and Stanford) Information Retrieval

More information

PTE : Predictive Text Embedding through Large-scale Heterogeneous Text Networks

PTE : Predictive Text Embedding through Large-scale Heterogeneous Text Networks PTE : Predictive Text Embedding through Large-scale Heterogeneous Text Networks Pramod Srinivasan CS591txt - Text Mining Seminar University of Illinois, Urbana-Champaign April 8, 2016 Pramod Srinivasan

More information

Learning Dense Models of Query Similarity from User Click Logs

Learning Dense Models of Query Similarity from User Click Logs Learning Dense Models of Query Similarity from User Click Logs Fabio De Bona, Stefan Riezler*, Keith Hall, Massi Ciaramita, Amac Herdagdelen, Maria Holmqvist Google Research, Zürich *Dept. of Computational

More information

A Fixed-Point Method for Weighting Terms in Verbose Informational Queries

A Fixed-Point Method for Weighting Terms in Verbose Informational Queries A Fixed-Point Method for Weighting Terms in Verbose Informational Queries ABSTRACT Jiaul H. Paik UMIACS University of Maryland College Park, MD 20742 USA jiaul@umd.edu The term weighting and document ranking

More information

Learning Concept Importance Using a Weighted Dependence Model

Learning Concept Importance Using a Weighted Dependence Model Learning Concept Importance Using a Weighted Dependence Model Michael Bendersky Dept. of Computer Science University of Massachusetts Amherst, MA bemike@cs.umass.edu Donald Metzler Yahoo! Labs 4401 Great

More information

TREC 2016 Dynamic Domain Track: Exploiting Passage Representation for Retrieval and Relevance Feedback

TREC 2016 Dynamic Domain Track: Exploiting Passage Representation for Retrieval and Relevance Feedback RMIT @ TREC 2016 Dynamic Domain Track: Exploiting Passage Representation for Retrieval and Relevance Feedback Ameer Albahem ameer.albahem@rmit.edu.au Lawrence Cavedon lawrence.cavedon@rmit.edu.au Damiano

More information

Information Retrieval. (M&S Ch 15)

Information Retrieval. (M&S Ch 15) Information Retrieval (M&S Ch 15) 1 Retrieval Models A retrieval model specifies the details of: Document representation Query representation Retrieval function Determines a notion of relevance. Notion

More information

Fall Lecture 16: Learning-to-rank

Fall Lecture 16: Learning-to-rank Fall 2016 CS646: Information Retrieval Lecture 16: Learning-to-rank Jiepu Jiang University of Massachusetts Amherst 2016/11/2 Credit: some materials are from Christopher D. Manning, James Allan, and Honglin

More information

Modeling Rich Interac1ons in Session Search Georgetown University at TREC 2014 Session Track

Modeling Rich Interac1ons in Session Search Georgetown University at TREC 2014 Session Track Modeling Rich Interac1ons in Session Search Georgetown University at TREC 2014 Session Track Jiyun Luo, Xuchu Dong and Grace Hui Yang Department of Computer Science Georgetown University Introduc:on Session

More information

Microsoft Research Asia at the Web Track of TREC 2009

Microsoft Research Asia at the Web Track of TREC 2009 Microsoft Research Asia at the Web Track of TREC 2009 Zhicheng Dou, Kun Chen, Ruihua Song, Yunxiao Ma, Shuming Shi, and Ji-Rong Wen Microsoft Research Asia, Xi an Jiongtong University {zhichdou, rsong,

More information

Midterm Exam Search Engines ( / ) October 20, 2015

Midterm Exam Search Engines ( / ) October 20, 2015 Student Name: Andrew ID: Seat Number: Midterm Exam Search Engines (11-442 / 11-642) October 20, 2015 Answer all of the following questions. Each answer should be thorough, complete, and relevant. Points

More information

CS54701: Information Retrieval

CS54701: Information Retrieval CS54701: Information Retrieval Basic Concepts 19 January 2016 Prof. Chris Clifton 1 Text Representation: Process of Indexing Remove Stopword, Stemming, Phrase Extraction etc Document Parser Extract useful

More information

CS 6320 Natural Language Processing

CS 6320 Natural Language Processing CS 6320 Natural Language Processing Information Retrieval Yang Liu Slides modified from Ray Mooney s (http://www.cs.utexas.edu/users/mooney/ir-course/slides/) 1 Introduction of IR System components, basic

More information

End-to-End Neural Ad-hoc Ranking with Kernel Pooling

End-to-End Neural Ad-hoc Ranking with Kernel Pooling End-to-End Neural Ad-hoc Ranking with Kernel Pooling Chenyan Xiong 1,Zhuyun Dai 1, Jamie Callan 1, Zhiyuan Liu, and Russell Power 3 1 :Language Technologies Institute, Carnegie Mellon University :Tsinghua

More information

Term Frequency Normalisation Tuning for BM25 and DFR Models

Term Frequency Normalisation Tuning for BM25 and DFR Models Term Frequency Normalisation Tuning for BM25 and DFR Models Ben He and Iadh Ounis Department of Computing Science University of Glasgow United Kingdom Abstract. The term frequency normalisation parameter

More information

Towards Better Text Understanding and Retrieval through Kernel Entity Salience Modeling

Towards Better Text Understanding and Retrieval through Kernel Entity Salience Modeling Towards Better Text Understanding and Retrieval through Kernel Entity Salience Modeling Chenyan Xiong, Zhengzhong Liu, Jamie Callan, and Tie-Yan Liu* Carnegie Mellon University & Microsoft Research* 1

More information

Word Embeddings in Search Engines, Quality Evaluation. Eneko Pinzolas

Word Embeddings in Search Engines, Quality Evaluation. Eneko Pinzolas Word Embeddings in Search Engines, Quality Evaluation Eneko Pinzolas Neural Networks are widely used with high rate of success. But can we reproduce those results in IR? Motivation State of the art for

More information

Improving Patent Search by Search Result Diversification

Improving Patent Search by Search Result Diversification Improving Patent Search by Search Result Diversification Youngho Kim University of Massachusetts Amherst yhkim@cs.umass.edu W. Bruce Croft University of Massachusetts Amherst croft@cs.umass.edu ABSTRACT

More information

Attentive Neural Architecture for Ad-hoc Structured Document Retrieval

Attentive Neural Architecture for Ad-hoc Structured Document Retrieval Attentive Neural Architecture for Ad-hoc Structured Document Retrieval Saeid Balaneshin 1 Alexander Kotov 1 Fedor Nikolaev 1,2 1 Textual Data Analytics Lab, Department of Computer Science, Wayne State

More information

WebSci and Learning to Rank for IR

WebSci and Learning to Rank for IR WebSci and Learning to Rank for IR Ernesto Diaz-Aviles L3S Research Center. Hannover, Germany diaz@l3s.de Ernesto Diaz-Aviles www.l3s.de 1/16 Motivation: Information Explosion Ernesto Diaz-Aviles

More information

University of Virginia Department of Computer Science. CS 4501: Information Retrieval Fall 2015

University of Virginia Department of Computer Science. CS 4501: Information Retrieval Fall 2015 University of Virginia Department of Computer Science CS 4501: Information Retrieval Fall 2015 5:00pm-6:15pm, Monday, October 26th Name: ComputingID: This is a closed book and closed notes exam. No electronic

More information

CS47300: Web Information Search and Management

CS47300: Web Information Search and Management CS47300: Web Information Search and Management Federated Search Prof. Chris Clifton 13 November 2017 Federated Search Outline Introduction to federated search Main research problems Resource Representation

More information

arxiv: v1 [cs.ir] 16 Oct 2017

arxiv: v1 [cs.ir] 16 Oct 2017 DeepRank: A New Deep Architecture for Relevance Ranking in Information Retrieval Liang Pang, Yanyan Lan, Jiafeng Guo, Jun Xu, Jingfang Xu, Xueqi Cheng pl8787@gmail.com,{lanyanyan,guojiafeng,junxu,cxq}@ict.ac.cn,xujingfang@sogou-inc.com

More information

CSCI 599: Applications of Natural Language Processing Information Retrieval Evaluation"

CSCI 599: Applications of Natural Language Processing Information Retrieval Evaluation CSCI 599: Applications of Natural Language Processing Information Retrieval Evaluation" All slides Addison Wesley, Donald Metzler, and Anton Leuski, 2008, 2012! Evaluation" Evaluation is key to building

More information

Introduction to Information Retrieval

Introduction to Information Retrieval Introduction to Information Retrieval Mohsen Kamyar چهارمین کارگاه ساالنه آزمایشگاه فناوری و وب بهمن ماه 1391 Outline Outline in classic categorization Information vs. Data Retrieval IR Models Evaluation

More information

Fractional Similarity : Cross-lingual Feature Selection for Search

Fractional Similarity : Cross-lingual Feature Selection for Search : Cross-lingual Feature Selection for Search Jagadeesh Jagarlamudi University of Maryland, College Park, USA Joint work with Paul N. Bennett Microsoft Research, Redmond, USA Using All the Data Existing

More information

Optimization for Machine Learning

Optimization for Machine Learning with a focus on proximal gradient descent algorithm Department of Computer Science and Engineering Outline 1 History & Trends 2 Proximal Gradient Descent 3 Three Applications A Brief History A. Convex

More information

Window Extraction for Information Retrieval

Window Extraction for Information Retrieval Window Extraction for Information Retrieval Samuel Huston Center for Intelligent Information Retrieval University of Massachusetts Amherst Amherst, MA, 01002, USA sjh@cs.umass.edu W. Bruce Croft Center

More information

Chapter 27 Introduction to Information Retrieval and Web Search

Chapter 27 Introduction to Information Retrieval and Web Search Chapter 27 Introduction to Information Retrieval and Web Search Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 27 Outline Information Retrieval (IR) Concepts Retrieval

More information

Exploiting Internal and External Semantics for the Clustering of Short Texts Using World Knowledge

Exploiting Internal and External Semantics for the Clustering of Short Texts Using World Knowledge Exploiting Internal and External Semantics for the Using World Knowledge, 1,2 Nan Sun, 1 Chao Zhang, 1 Tat-Seng Chua 1 1 School of Computing National University of Singapore 2 School of Computer Science

More information

Representation Learning using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval

Representation Learning using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval Representation Learning using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval Xiaodong Liu 12, Jianfeng Gao 1, Xiaodong He 1 Li Deng 1, Kevin Duh 2, Ye-Yi Wang 1 1

More information

CS646 (Fall 2016) Homework 1

CS646 (Fall 2016) Homework 1 CS646 (Fall 2016) Homework 1 Deadline: 11:59pm, Sep 28th, 2016 (EST) Access the following resources before you start working on HW1: Download the corpus file on Moodle: acm corpus.gz (about 90 MB). Check

More information

CS473: Course Review CS-473. Luo Si Department of Computer Science Purdue University

CS473: Course Review CS-473. Luo Si Department of Computer Science Purdue University CS473: CS-473 Course Review Luo Si Department of Computer Science Purdue University Basic Concepts of IR: Outline Basic Concepts of Information Retrieval: Task definition of Ad-hoc IR Terminologies and

More information

Pseudo-Relevance Feedback and Title Re-Ranking for Chinese Information Retrieval

Pseudo-Relevance Feedback and Title Re-Ranking for Chinese Information Retrieval Pseudo-Relevance Feedback and Title Re-Ranking Chinese Inmation Retrieval Robert W.P. Luk Department of Computing The Hong Kong Polytechnic University Email: csrluk@comp.polyu.edu.hk K.F. Wong Dept. Systems

More information

CS47300: Web Information Search and Management

CS47300: Web Information Search and Management CS47300: Web Information Search and Management Prof. Chris Clifton 27 August 2018 Material adapted from course created by Dr. Luo Si, now leading Alibaba research group 1 AD-hoc IR: Basic Process Information

More information

Supervised Hashing for Image Retrieval via Image Representation Learning

Supervised Hashing for Image Retrieval via Image Representation Learning Supervised Hashing for Image Retrieval via Image Representation Learning Rongkai Xia, Yan Pan, Cong Liu (Sun Yat-Sen University) Hanjiang Lai, Shuicheng Yan (National University of Singapore) Finding Similar

More information

Improving Difficult Queries by Leveraging Clusters in Term Graph

Improving Difficult Queries by Leveraging Clusters in Term Graph Improving Difficult Queries by Leveraging Clusters in Term Graph Rajul Anand and Alexander Kotov Department of Computer Science, Wayne State University, Detroit MI 48226, USA {rajulanand,kotov}@wayne.edu

More information

Robust Relevance-Based Language Models

Robust Relevance-Based Language Models Robust Relevance-Based Language Models Xiaoyan Li Department of Computer Science, Mount Holyoke College 50 College Street, South Hadley, MA 01075, USA Email: xli@mtholyoke.edu ABSTRACT We propose a new

More information

Information Retrieval

Information Retrieval Information Retrieval Learning to Rank Ilya Markov i.markov@uva.nl University of Amsterdam Ilya Markov i.markov@uva.nl Information Retrieval 1 Course overview Offline Data Acquisition Data Processing Data

More information

Search Engines Chapter 8 Evaluating Search Engines Felix Naumann

Search Engines Chapter 8 Evaluating Search Engines Felix Naumann Search Engines Chapter 8 Evaluating Search Engines 9.7.2009 Felix Naumann Evaluation 2 Evaluation is key to building effective and efficient search engines. Drives advancement of search engines When intuition

More information

Retrieval and Feedback Models for Blog Distillation

Retrieval and Feedback Models for Blog Distillation Retrieval and Feedback Models for Blog Distillation CMU at the TREC 2007 Blog Track Jonathan Elsas, Jaime Arguello, Jamie Callan, Jaime Carbonell CMU s Blog Distillation Focus Two Research Questions: What

More information

over Multi Label Images

over Multi Label Images IBM Research Compact Hashing for Mixed Image Keyword Query over Multi Label Images Xianglong Liu 1, Yadong Mu 2, Bo Lang 1 and Shih Fu Chang 2 1 Beihang University, Beijing, China 2 Columbia University,

More information

Retrieval and Feedback Models for Blog Distillation

Retrieval and Feedback Models for Blog Distillation Retrieval and Feedback Models for Blog Distillation Jonathan Elsas, Jaime Arguello, Jamie Callan, Jaime Carbonell Language Technologies Institute, School of Computer Science, Carnegie Mellon University

More information

An Exploration of Query Term Deletion

An Exploration of Query Term Deletion An Exploration of Query Term Deletion Hao Wu and Hui Fang University of Delaware, Newark DE 19716, USA haowu@ece.udel.edu, hfang@ece.udel.edu Abstract. Many search users fail to formulate queries that

More information

CS294-1 Assignment 2 Report

CS294-1 Assignment 2 Report CS294-1 Assignment 2 Report Keling Chen and Huasha Zhao February 24, 2012 1 Introduction The goal of this homework is to predict a users numeric rating for a book from the text of the user s review. The

More information

On the automatic classification of app reviews

On the automatic classification of app reviews Requirements Eng (2016) 21:311 331 DOI 10.1007/s00766-016-0251-9 RE 2015 On the automatic classification of app reviews Walid Maalej 1 Zijad Kurtanović 1 Hadeer Nabil 2 Christoph Stanik 1 Walid: please

More information

Optimized Top-K Processing with Global Page Scores on Block-Max Indexes

Optimized Top-K Processing with Global Page Scores on Block-Max Indexes Optimized Top-K Processing with Global Page Scores on Block-Max Indexes Dongdong Shan 1 Shuai Ding 2 Jing He 1 Hongfei Yan 1 Xiaoming Li 1 Peking University, Beijing, China 1 Polytechnic Institute of NYU,

More information

Modeling Higher-Order Term Dependencies in Information Retrieval using Query Hypergraphs

Modeling Higher-Order Term Dependencies in Information Retrieval using Query Hypergraphs Modeling Higher-Order Term Dependencies in Information Retrieval using Query Hypergraphs ABSTRACT Michael Bendersky Dept. of Computer Science Univ. of Massachusetts Amherst Amherst, MA bemike@cs.umass.edu

More information

Query Phrase Expansion using Wikipedia for Patent Class Search

Query Phrase Expansion using Wikipedia for Patent Class Search Query Phrase Expansion using Wikipedia for Patent Class Search 1 Bashar Al-Shboul, Sung-Hyon Myaeng Korea Advanced Institute of Science and Technology (KAIST) December 19 th, 2011 AIRS 11, Dubai, UAE OUTLINE

More information

AComparisonofRetrievalModelsusingTerm Dependencies

AComparisonofRetrievalModelsusingTerm Dependencies AComparisonofRetrievalModelsusingTerm Dependencies Samuel Huston and W. Bruce Croft Center for Intelligent Information Retrieval University of Massachusetts Amherst 140 Governors Dr, Amherst, Massachusetts,

More information

Query Likelihood with Negative Query Generation

Query Likelihood with Negative Query Generation Query Likelihood with Negative Query Generation Yuanhua Lv Department of Computer Science University of Illinois at Urbana-Champaign Urbana, IL 61801 ylv2@uiuc.edu ChengXiang Zhai Department of Computer

More information

Semantic Matching by Non-Linear Word Transportation for Information Retrieval

Semantic Matching by Non-Linear Word Transportation for Information Retrieval Semantic Matching by Non-Linear Word Transportation for Information Retrieval Jiafeng Guo, Yixing Fan, Qingyao Ai, W. Bruce Croft CAS Key Lab of Network Data Science and Technology, Institute of Computing

More information

Learning-based Neuroimage Registration

Learning-based Neuroimage Registration Learning-based Neuroimage Registration Leonid Teverovskiy and Yanxi Liu 1 October 2004 CMU-CALD-04-108, CMU-RI-TR-04-59 School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213 Abstract

More information

Welcome to the class of Web Information Retrieval!

Welcome to the class of Web Information Retrieval! Welcome to the class of Web Information Retrieval! Tee Time Topic Augmented Reality and Google Glass By Ali Abbasi Challenges in Web Search Engines Min ZHANG z-m@tsinghua.edu.cn April 13, 2012 Challenges

More information

Federated Text Search

Federated Text Search CS54701 Federated Text Search Luo Si Department of Computer Science Purdue University Abstract Outline Introduction to federated search Main research problems Resource Representation Resource Selection

More information

In = number of words appearing exactly n times N = number of words in the collection of words A = a constant. For example, if N=100 and the most

In = number of words appearing exactly n times N = number of words in the collection of words A = a constant. For example, if N=100 and the most In = number of words appearing exactly n times N = number of words in the collection of words A = a constant. For example, if N=100 and the most common word appears 10 times then A = rn*n/n = 1*10/100

More information

Ranked Retrieval. Evaluation in IR. One option is to average the precision scores at discrete. points on the ROC curve But which points?

Ranked Retrieval. Evaluation in IR. One option is to average the precision scores at discrete. points on the ROC curve But which points? Ranked Retrieval One option is to average the precision scores at discrete Precision 100% 0% More junk 100% Everything points on the ROC curve But which points? Recall We want to evaluate the system, not

More information

Information Retrieval CS Lecture 06. Razvan C. Bunescu School of Electrical Engineering and Computer Science

Information Retrieval CS Lecture 06. Razvan C. Bunescu School of Electrical Engineering and Computer Science Information Retrieval CS 6900 Lecture 06 Razvan C. Bunescu School of Electrical Engineering and Computer Science bunescu@ohio.edu Boolean Retrieval vs. Ranked Retrieval Many users (professionals) prefer

More information

On Duplicate Results in a Search Session

On Duplicate Results in a Search Session On Duplicate Results in a Search Session Jiepu Jiang Daqing He Shuguang Han School of Information Sciences University of Pittsburgh jiepu.jiang@gmail.com dah44@pitt.edu shh69@pitt.edu ABSTRACT In this

More information

Pouya Kousha Fall 2018 CSE 5194 Prof. DK Panda

Pouya Kousha Fall 2018 CSE 5194 Prof. DK Panda Pouya Kousha Fall 2018 CSE 5194 Prof. DK Panda 1 Observe novel applicability of DL techniques in Big Data Analytics. Applications of DL techniques for common Big Data Analytics problems. Semantic indexing

More information

Fall CS646: Information Retrieval. Lecture 2 - Introduction to Search Result Ranking. Jiepu Jiang University of Massachusetts Amherst 2016/09/12

Fall CS646: Information Retrieval. Lecture 2 - Introduction to Search Result Ranking. Jiepu Jiang University of Massachusetts Amherst 2016/09/12 Fall 2016 CS646: Information Retrieval Lecture 2 - Introduction to Search Result Ranking Jiepu Jiang University of Massachusetts Amherst 2016/09/12 More course information Programming Prerequisites Proficiency

More information

Clustering K-means. Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, Carlos Guestrin

Clustering K-means. Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, Carlos Guestrin Clustering K-means Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, 2014 Carlos Guestrin 2005-2014 1 Clustering images Set of Images [Goldberger et al.] Carlos Guestrin 2005-2014

More information

ICTNET at Web Track 2010 Diversity Task

ICTNET at Web Track 2010 Diversity Task ICTNET at Web Track 2010 Diversity Task Yuanhai Xue 1,2, Zeying Peng 1,2, Xiaoming Yu 1, Yue Liu 1, Hongbo Xu 1, Xueqi Cheng 1 1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing,

More information

CS54701: Information Retrieval

CS54701: Information Retrieval CS54701: Information Retrieval Federated Search 10 March 2016 Prof. Chris Clifton Outline Federated Search Introduction to federated search Main research problems Resource Representation Resource Selection

More information

Inverted List Caching for Topical Index Shards

Inverted List Caching for Topical Index Shards Inverted List Caching for Topical Index Shards Zhuyun Dai and Jamie Callan Language Technologies Institute, Carnegie Mellon University {zhuyund, callan}@cs.cmu.edu Abstract. Selective search is a distributed

More information

Semantic Extensions to Syntactic Analysis of Queries Ben Handy, Rohini Rajaraman

Semantic Extensions to Syntactic Analysis of Queries Ben Handy, Rohini Rajaraman Semantic Extensions to Syntactic Analysis of Queries Ben Handy, Rohini Rajaraman Abstract We intend to show that leveraging semantic features can improve precision and recall of query results in information

More information

Shrey Patel B.E. Computer Engineering, Gujarat Technological University, Ahmedabad, Gujarat, India

Shrey Patel B.E. Computer Engineering, Gujarat Technological University, Ahmedabad, Gujarat, India International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2018 IJSRCSEIT Volume 3 Issue 3 ISSN : 2456-3307 Some Issues in Application of NLP to Intelligent

More information

Automatic Summarization

Automatic Summarization Automatic Summarization CS 769 Guest Lecture Andrew B. Goldberg goldberg@cs.wisc.edu Department of Computer Sciences University of Wisconsin, Madison February 22, 2008 Andrew B. Goldberg (CS Dept) Summarization

More information

ResPubliQA 2010

ResPubliQA 2010 SZTAKI @ ResPubliQA 2010 David Mark Nemeskey Computer and Automation Research Institute, Hungarian Academy of Sciences, Budapest, Hungary (SZTAKI) Abstract. This paper summarizes the results of our first

More information

Improving Document Ranking for Long Queries with Nested Query Segmentation

Improving Document Ranking for Long Queries with Nested Query Segmentation Improving Document Ranking for Long Queries with Nested Query Segmentation Rishiraj Saha Roy 1, Anusha Suresh 2, Niloy Ganguly 2, and Monojit Choudhury 3 1 Max Planck Institute for Informatics, Saarbrücken,

More information

Effective Latent Space Graph-based Re-ranking Model with Global Consistency

Effective Latent Space Graph-based Re-ranking Model with Global Consistency Effective Latent Space Graph-based Re-ranking Model with Global Consistency Feb. 12, 2009 1 Outline Introduction Related work Methodology Graph-based re-ranking model Learning a latent space graph A case

More information

Document Clustering Using Locality Preserving Indexing

Document Clustering Using Locality Preserving Indexing Document Clustering Using Locality Preserving Indexing Deng Cai Department of Computer Science University of Illinois at Urbana Champaign 1334 Siebel Center, 201 N. Goodwin Ave, Urbana, IL 61801, USA Phone:

More information

The University of Amsterdam at the CLEF 2008 Domain Specific Track

The University of Amsterdam at the CLEF 2008 Domain Specific Track The University of Amsterdam at the CLEF 2008 Domain Specific Track Parsimonious Relevance and Concept Models Edgar Meij emeij@science.uva.nl ISLA, University of Amsterdam Maarten de Rijke mdr@science.uva.nl

More information

A RECOMMENDER SYSTEM FOR SOCIAL BOOK SEARCH

A RECOMMENDER SYSTEM FOR SOCIAL BOOK SEARCH A RECOMMENDER SYSTEM FOR SOCIAL BOOK SEARCH A thesis Submitted to the faculty of the graduate school of the University of Minnesota by Vamshi Krishna Thotempudi In partial fulfillment of the requirements

More information

NTU Approaches to Subtopic Mining and Document Ranking at NTCIR-9 Intent Task

NTU Approaches to Subtopic Mining and Document Ranking at NTCIR-9 Intent Task NTU Approaches to Subtopic Mining and Document Ranking at NTCIR-9 Intent Task Chieh-Jen Wang, Yung-Wei Lin, *Ming-Feng Tsai and Hsin-Hsi Chen Department of Computer Science and Information Engineering,

More information

Text Analytics (Text Mining)

Text Analytics (Text Mining) CSE 6242 / CX 4242 Apr 1, 2014 Text Analytics (Text Mining) Concepts and Algorithms Duen Horng (Polo) Chau Georgia Tech Some lectures are partly based on materials by Professors Guy Lebanon, Jeffrey Heer,

More information

Practical Relevance Ranking for 10 Million Books.

Practical Relevance Ranking for 10 Million Books. Practical Relevance Ranking for 10 Million Books. Tom Burton-West Digital Library Production Service, University of Michigan Library, Ann Arbor, Michigan, US tburtonw@umich.edu Abstract. In this paper

More information

Ruslan Salakhutdinov and Geoffrey Hinton. University of Toronto, Machine Learning Group IRGM Workshop July 2007

Ruslan Salakhutdinov and Geoffrey Hinton. University of Toronto, Machine Learning Group IRGM Workshop July 2007 SEMANIC HASHING Ruslan Salakhutdinov and Geoffrey Hinton University of oronto, Machine Learning Group IRGM orkshop July 2007 Existing Methods One of the most popular and widely used in practice algorithms

More information

An Axiomatic Approach to IR UIUC TREC 2005 Robust Track Experiments

An Axiomatic Approach to IR UIUC TREC 2005 Robust Track Experiments An Axiomatic Approach to IR UIUC TREC 2005 Robust Track Experiments Hui Fang ChengXiang Zhai Department of Computer Science University of Illinois at Urbana-Champaign Abstract In this paper, we report

More information

Overview of the NTCIR-13 OpenLiveQ Task

Overview of the NTCIR-13 OpenLiveQ Task Overview of the NTCIR-13 OpenLiveQ Task ABSTRACT Makoto P. Kato Kyoto University mpkato@acm.org Akiomi Nishida Yahoo Japan Corporation anishida@yahoo-corp.jp This is an overview of the NTCIR-13 OpenLiveQ

More information

Automatic Term Mismatch Diagnosis for Selective Query Expansion

Automatic Term Mismatch Diagnosis for Selective Query Expansion Automatic Term Mismatch Diagnosis for Selective Query Expansion Le Zhao Language Technologies Institute Carnegie Mellon University Pittsburgh, PA, USA lezhao@cs.cmu.edu Jamie Callan Language Technologies

More information

Basic Tokenizing, Indexing, and Implementation of Vector-Space Retrieval

Basic Tokenizing, Indexing, and Implementation of Vector-Space Retrieval Basic Tokenizing, Indexing, and Implementation of Vector-Space Retrieval 1 Naïve Implementation Convert all documents in collection D to tf-idf weighted vectors, d j, for keyword vocabulary V. Convert

More information

Tuning. Philipp Koehn presented by Gaurav Kumar. 28 September 2017

Tuning. Philipp Koehn presented by Gaurav Kumar. 28 September 2017 Tuning Philipp Koehn presented by Gaurav Kumar 28 September 2017 The Story so Far: Generative Models 1 The definition of translation probability follows a mathematical derivation argmax e p(e f) = argmax

More information

Online Expansion of Rare Queries for Sponsored Search

Online Expansion of Rare Queries for Sponsored Search Online Expansion of Rare Queries for Sponsored Search Peter Ciccolo, Evgeniy Gabrilovich, Vanja Josifovski, Don Metzler, Lance Riedel, Jeff Yuan Yahoo! Research 1 Sponsored Search 2 Sponsored Search in

More information

Analytic Knowledge Discovery Models for Information Retrieval and Text Summarization

Analytic Knowledge Discovery Models for Information Retrieval and Text Summarization Analytic Knowledge Discovery Models for Information Retrieval and Text Summarization Pawan Goyal Team Sanskrit INRIA Paris Rocquencourt Pawan Goyal (http://www.inria.fr/) Analytic Knowledge Discovery Models

More information

RMIT University at TREC 2006: Terabyte Track

RMIT University at TREC 2006: Terabyte Track RMIT University at TREC 2006: Terabyte Track Steven Garcia Falk Scholer Nicholas Lester Milad Shokouhi School of Computer Science and IT RMIT University, GPO Box 2476V Melbourne 3001, Australia 1 Introduction

More information

Research Article Relevance Feedback Based Query Expansion Model Using Borda Count and Semantic Similarity Approach

Research Article Relevance Feedback Based Query Expansion Model Using Borda Count and Semantic Similarity Approach Computational Intelligence and Neuroscience Volume 215, Article ID 568197, 13 pages http://dx.doi.org/1.1155/215/568197 Research Article Relevance Feedback Based Query Expansion Model Using Borda Count

More information

WITHIN-DOCUMENT TERM-BASED INDEX PRUNING TECHNIQUES WITH STATISTICAL HYPOTHESIS TESTING. Sree Lekha Thota

WITHIN-DOCUMENT TERM-BASED INDEX PRUNING TECHNIQUES WITH STATISTICAL HYPOTHESIS TESTING. Sree Lekha Thota WITHIN-DOCUMENT TERM-BASED INDEX PRUNING TECHNIQUES WITH STATISTICAL HYPOTHESIS TESTING by Sree Lekha Thota A thesis submitted to the Faculty of the University of Delaware in partial fulfillment of the

More information

James Mayfield! The Johns Hopkins University Applied Physics Laboratory The Human Language Technology Center of Excellence!

James Mayfield! The Johns Hopkins University Applied Physics Laboratory The Human Language Technology Center of Excellence! James Mayfield! The Johns Hopkins University Applied Physics Laboratory The Human Language Technology Center of Excellence! (301) 219-4649 james.mayfield@jhuapl.edu What is Information Retrieval? Evaluation

More information