A Simple and Efficient Sampling Method for Estimating AP and nDCG
1 A Simple and Efficient Sampling Method for Estimating AP and nDCG. Emine Yilmaz, Microsoft Research, Cambridge, UK. Evangelos Kanoulas and Javed Aslam, Northeastern University, Boston, USA.
2 Introduction. Obtaining relevance judgments: relevance judgments are expensive. TREC uses depth-k pooling, but document collections can be very large and depth pooling is still expensive (85,600 judgments for TREC 8; at 3 min/doc, 40 hrs/wk, 50 wks/year, that is 2.14 man-years!). Evaluation with incomplete judgments: bpref (Buckley and Voorhees, SIGIR 04); evaluation using condensed lists (Sakai, SIGIR 07); methods for ranking systems with fewer judgments (Carterette et al., SIGIR 06; Moffat et al., SIGIR 07); methods directly estimating measures with fewer judgments (Aslam et al., SIGIR 06; Yilmaz and Aslam, CIKM 06).
3 Motivation. Inferred AP (Yilmaz and Aslam, CIKM 06): no confidence intervals associated with the estimates; the incomplete relevance judgments must be a random subset of the complete judgments. Importance sampling (Aslam et al., SIGIR 06): difficult to compute confidence intervals; overly complicated. Goal: combine the advantages of the two approaches, providing confidence intervals for inferred AP and extending inferred AP to incorporate nonrandom judgments.
4 Inferred AP [Yilmaz and Aslam, CIKM 06]. Average precision as a random experiment: (1) select a relevant document at random; let k be the rank of this document; (2) select a rank at random from the set {1, ..., k}; (3) output the binary relevance of the document at this rank. AP is the average (step 1) of precisions at relevant documents (steps 2 and 3).
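Since the judged documents are a uniform random subset, the expected outcome of steps 2 and 3 can be estimated from the judgments alone. A sketch of the estimator, assuming every ranked document belongs to the sampled pool (the published CIKM 06 formula carries an extra factor for documents outside the pool); |rel| and |nonrel| count judged documents above rank k, and ε is a small smoothing constant:

```latex
% Expected precision at a judged relevant document at rank k:
% with prob. 1/k the experiment picks rank k itself (relevant);
% with prob. (k-1)/k it picks a rank above k, whose expected
% relevance is estimated from the judged documents in the top k-1.
E[PC(k)] \;=\; \frac{1}{k}
  \;+\; \frac{k-1}{k}\cdot\frac{|rel|+\epsilon}{|rel|+|nonrel|+2\epsilon}

% infAP: average over the judged relevant documents
\mathrm{infAP} \;=\; \frac{1}{|\mathrm{judged\ rel}|}
  \sum_{k \,\in\, \mathrm{judged\ rel}} E[PC(k)]
```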
5-7 Inferred AP [Yilmaz and Aslam, CIKM 06]. Example ranked list (R = relevant, N = nonrelevant): 1.R 2.N 3.R 4.R 5.N 6.R 7.N 8.N 9.R 10.N. Given a random subset of judgments, the expected precision is computed at each judged relevant rank and the results are averaged: PC(1) = 1, PC(3) = 0.625, PC(9) = ..., giving infAP = 0.7268.
8 Variance in Inferred AP. Inferred AP is unbiased in expectation but varies in practice. To obtain variance estimates and confidence intervals, the random experiment can be realized as two-stage sampling.
9 Variance in Inferred AP. Two-stage sampling, stage 1: a sample of cut-off levels (relevant documents) and their average precisions; this contributes the 1st variance component.
10 Variance in Inferred AP. Two-stage sampling, stage 2: a sample of documents above each selected cut-off level, used to compute the precisions; this contributes the 2nd variance component.
11 Variance in Inferred AP. By the law of total variance: total variance in inferred AP = stage-1 variance + stage-2 variance. Variance of mean infAP = total variance in infAP / (# of queries)². Confidence intervals are assigned to mean infAP according to the Central Limit Theorem.
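In symbols (a sketch; the exact slide formulas are not in the transcription): stage 1 picks the cut-off level K, stage 2 estimates the precision PC at K, Q is the number of queries, and z_{α/2} is the standard normal quantile.

```latex
% Law of total variance over the two sampling stages
\operatorname{Var}(\widehat{AP})
  = \operatorname{Var}_K\!\big(E[\widehat{PC}\mid K]\big)   % stage 1
  + E_K\!\big[\operatorname{Var}(\widehat{PC}\mid K)\big]   % stage 2

% Per-query variances combine into the variance of the mean,
% which yields a CLT-based confidence interval
\operatorname{Var}(\overline{\mathrm{infAP}})
  = \frac{1}{Q^2}\sum_{q=1}^{Q}\operatorname{Var}(\widehat{AP}_q),
\qquad
\overline{\mathrm{infAP}} \pm z_{\alpha/2}
  \sqrt{\operatorname{Var}(\overline{\mathrm{infAP}})}
```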
12-13 Confidence Intervals for Mean InfAP [figures]
14 Confidence Intervals for Mean InfAP. Percentage of mean infAP values deviating from actual MAP values (TREC 8). Cumulative distribution function of infAP values: the CDF of the number of standard deviations by which mean infAP deviates from the actual MAP, compared against the CDF of the normal distribution. Kolmogorov-Smirnov test: for 90% of systems the normality hypothesis cannot be rejected (α = 0.05).
15 Confidence Intervals for Mean InfAP [figure]
16 Stratified Random Sampling. Goal: an unbiased estimator of AP with decreased variance. Evaluation measures give more weight to documents towards the top of the list, so a top-heavy sampling strategy can reduce the variance of mean infAP.
17 Stratified Random Sampling. Divide the complete pool of judgments into strata (disjoint contiguous subsets); randomly sample some documents from each stratum to be judged; the sampling percentage can differ across strata; evaluate search engines with the sampled documents. Example over the ranked list above: 1st stratum, p = 60%; 2nd stratum, p = 40%.
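A minimal Python sketch of this sampling step (the two-stratum split and rates come from the running example; the pool, boundaries, and helper names are otherwise assumptions):

```python
import random

def stratified_sample(pooled_docs, strata):
    """Pick a random subset of a ranked pool to judge, stratum by stratum.

    pooled_docs: list of doc ids, ordered as in the pool.
    strata: list of (start, end, rate); rate is the fraction of the
            stratum pooled_docs[start:end] that gets judged.
    """
    judged = set()
    for start, end, rate in strata:
        stratum = pooled_docs[start:end]
        k = round(rate * len(stratum))
        judged.update(random.sample(stratum, k))  # uniform within the stratum
    return judged

# Running example: 10 pooled documents, top stratum sampled at 60%,
# bottom stratum at 40% (a top-heavy design).
docs = [f"d{i}" for i in range(1, 11)]
print(stratified_sample(docs, [(0, 5, 0.6), (5, 10, 0.4)]))
```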
18 Extended infAP (xinfAP). Select a relevant document at random (1st step): the selected relevant document can fall in any of the strata, so by the definition of conditional expectation the estimate decomposes as E[AP] = Σ_s P(relevant document from stratum s) · E[precision | stratum s].
19-20 Extended infAP (xinfAP). Select a relevant document at random (1st step): the probability of picking a relevant document from stratum s is estimated from the judged relevant documents in s and the stratum's sampling rate, as sketched below.
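The equations on these slides did not survive transcription; a hedged reconstruction from the sampling design, with r_s the number of judged relevant documents in stratum s and p_s its sampling rate:

```latex
% Estimated number of relevant documents in stratum s
\hat{R}_s = \frac{r_s}{p_s}

% Probability that a relevant document drawn uniformly at random
% falls in stratum s
P(\text{stratum } s) = \frac{\hat{R}_s}{\sum_{s'} \hat{R}_{s'}}
```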
21 Extended infAP (xinfAP). Example: the ranked list above split into a 1st stratum (p = 60%) and a 2nd stratum (p = 40%).
22 Extended infAP (xinfAP). Select a relevant document at random (1st step): within each stratum, the judged documents are a uniform random subset of all documents, so the uniform distribution over the relevant documents is realized by averaging the precisions at the judged relevant documents of that stratum.
23-27 Extended infAP (xinfAP). Precision at a relevant document at rank k (2nd and 3rd steps): select a rank at random from the set {1, ..., k} and output the binary relevance of the document at this rank. With probability 1/k we pick the current document (known to be relevant); with probability (k-1)/k we pick a document above rank k, where the probability of picking a document (above k) from stratum s is proportional to the number of top-(k-1) documents falling in s, and the expected relevance within each stratum is estimated from that stratum's judged documents (see the sketch below).
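The slide equations are again lost; a hedged reconstruction of the expected precision at a judged relevant rank k, where r_{s,k} and n_{s,k} count the judged relevant and nonrelevant documents among the top k-1 documents falling in stratum s, and ε is a small smoothing constant:

```latex
E[PC(k)] \;=\; \frac{1}{k}
  \;+\; \frac{k-1}{k}\sum_{s}
        \frac{|\,s \cap \text{top } (k-1)\,|}{k-1}\cdot
        \frac{r_{s,k} + \epsilon}{r_{s,k} + n_{s,k} + 2\epsilon}
```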
28-29 Extended infAP (xinfAP). Worked example: expected precisions computed over the two-stratum sample of the ranked list above (1st stratum p = 60%, 2nd stratum p = 40%).
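A compact Python sketch of the whole xinfAP computation as described on slides 18-29 (the function and variable names, the smoothing value, and the handling of strata without judged documents are assumptions, not the authors' reference implementation):

```python
from collections import defaultdict

EPS = 1e-5  # small smoothing constant; the actual value is an assumption

def xinfap(ranking, strata_of, rates, judgments):
    """Estimate AP from a stratified random sample of judgments.

    ranking:   list of doc ids in ranked order (rank 1 first).
    strata_of: dict doc id -> stratum id (stratification of the pool).
    rates:     dict stratum id -> sampling rate used when judging.
    judgments: dict doc id -> 1 (relevant) / 0 (nonrelevant), sampled docs only.
    """
    # Estimated number of relevant docs per stratum: judged relevant / rate.
    rel_hat = defaultdict(float)
    for doc, rel in judgments.items():
        if rel:
            rel_hat[strata_of[doc]] += 1.0 / rates[strata_of[doc]]
    total_rel = sum(rel_hat.values())

    # Per-stratum expected precisions at judged relevant ranks.
    pc_by_stratum = defaultdict(list)
    for idx, doc in enumerate(ranking):
        k = idx + 1
        if judgments.get(doc) != 1:
            continue
        # E[PC(k)]: prob 1/k pick rank k itself; prob (k-1)/k pick above,
        # split over strata by how many of the top k-1 docs fall in each.
        e_pc = 1.0 / k
        if k > 1:
            above = ranking[: k - 1]
            for s in rates:
                in_s = [d for d in above if strata_of[d] == s]
                if not in_s:
                    continue
                r = sum(1 for d in in_s if judgments.get(d) == 1)
                n = sum(1 for d in in_s if judgments.get(d) == 0)
                e_pc += ((k - 1) / k) * (len(in_s) / (k - 1)) \
                        * (r + EPS) / (r + n + 2 * EPS)
        pc_by_stratum[strata_of[doc]].append(e_pc)

    # Weight each stratum by its estimated share of relevant documents.
    return sum((rel_hat[s] / total_rel) * (sum(pcs) / len(pcs))
               for s, pcs in pc_by_stratum.items())
```

On the running example this would be called with the 10-document ranking, a map splitting the list into a top and a bottom stratum, rates of 0.6 and 0.4, and the sampled judgments.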
30-32 TREC Terabyte 06. Judging setup: the available judgments consist of a depth-50 pool together with a remainder of which P% is judged. Standard measures are computed from the depth-50 pool alone; inferred AP additionally uses the P% judged sample of the remainder.
33 Simulate Terabyte Setup on TREC 8 data. Assume complete judgments: the depth-100 pool. Form different depth-k pools, k ∈ {1, 2, 3, 4, 5, 10, 20, 30, 40, 50}. For each k, compute the total number of documents in the depth-k pool; randomly sample an equal number of documents from the complete judgment set (excluding the depth-k pool); assume the remaining documents are unjudged; evaluate search engines with the sampled documents.
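A minimal sketch of one round of this simulation, assuming hypothetical helpers depth_pool(k) (documents in the depth-k pool) and complete_judgments (the depth-100 judgment set):

```python
import random

def simulate_terabyte(depth_pool, complete_judgments, k):
    """Build a simulated judgment set: the depth-k pool plus an equally
    sized random sample drawn from the rest of the complete judgments;
    everything else is treated as unjudged."""
    pool_k = set(depth_pool(k))
    rest = [d for d in complete_judgments if d not in pool_k]
    sampled = random.sample(rest, min(len(pool_k), len(rest)))
    return pool_k | set(sampled)
```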
34 Comparison of the measures: RMS error [figure]
35 Comparison of the measures: Kendall's tau [figure]
36 Inferred nDCG (infNDCG). Apply the same methodology to nDCG: estimate DCG and the ideal DCG (DCG_I) separately. E[DCG_I] can be computed using the estimated number of relevant documents for each relevance grade.
37 DCG as a Random Experiment. For each rank i, associate a variable equal to that rank's discounted gain (scaled so that its expectation equals DCG). DCG as a random experiment: (1) select a document at random; let i be its rank; (2) output the value of the associated variable.
38 Estimating DCG with Incomplete Judgments. DCG as a random experiment, as above: since the judged documents form a random sample within each stratum, the expected output, and hence DCG, can be estimated from the judged documents alone due to properties of conditional expectation.
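A hedged reconstruction of the lost formulas, using the standard graded-gain DCG definition (the logarithm base and gain function on the original slides may differ); N is the list length and rel_i the relevance grade at rank i:

```latex
DCG \;=\; \sum_{i=1}^{N} \frac{2^{rel_i}-1}{\log_2(i+1)}
\qquad
x_i \;=\; N\cdot\frac{2^{rel_i}-1}{\log_2(i+1)}

% Picking a rank I uniformly from {1, ..., N} and outputting x_I
% yields E[x_I] = DCG, so judged documents give an unbiased estimate.
```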
39-40 Overall Results: TREC 8 [figures]
41-42 Overall Results: TREC 10 [figures]
43 Conclusions