Ranked Retrieval. Evaluation in IR. One option is to average the precision scores at discrete points on the ROC curve. But which points?

Ranked Retrieval. One option is to average the precision scores at discrete points on the ROC curve. But which points? We want to evaluate the system, not the corpus, so it can't be based on the number of documents returned.
[Figure: "ROC Curve" of precision (0-100%) against recall (0-100%), with points marked for the top 10, top 100, top 1,000, and top 1,000,000 results; returning everything reaches 100% recall, and precision drops from "less junk" to "more junk" as more results are returned.]
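
To make these cutoff points concrete, here is a minimal sketch (Python; not from the slides) of computing one precision/recall point per rank cutoff. The names ranked_ids, relevant, and the toy documents are illustrative.

```python
def precision_recall_at(ranked_ids, relevant, cutoff):
    """Precision and recall over the top `cutoff` results."""
    retrieved = ranked_ids[:cutoff]
    hits = sum(1 for doc_id in retrieved if doc_id in relevant)
    precision = hits / cutoff
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# One point on the curve per cutoff (e.g. top 10, top 100, top 1,000 results).
ranking = ["d3", "d7", "d1", "d9", "d4", "d8"]
relevant = {"d1", "d3", "d4"}
for k in (1, 3, 6):
    print(k, precision_recall_at(ranking, relevant, k))
```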

Ranked Retrieval - 11-point precision. Evaluate based on precision at defined recall points: average the precision at 11 recall levels (0%, 10%, 20%, ..., 100%). This can be compared across corpora because it isn't based on corpus size or the number of results returned.
[Figure: precision-recall curve with precision read off at the 11 recall levels from 0% to 100%.]
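
A sketch of the 11-point measure under the usual interpolation rule (the precision at recall level r is the best precision at any recall of at least r). The helper and variable names below are mine, not the slides'.

```python
def eleven_point_average(ranked_ids, relevant):
    """Average interpolated precision at recall 0.0, 0.1, ..., 1.0."""
    if not relevant:
        return 0.0
    points = []  # (recall, precision) after each retrieved document
    hits = 0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
        points.append((hits / len(relevant), hits / rank))
    levels = [i / 10 for i in range(11)]
    # Interpolated precision: best precision at any recall >= the level.
    interp = [max((p for r, p in points if r >= level), default=0.0)
              for level in levels]
    return sum(interp) / len(levels)

print(eleven_point_average(["d3", "d7", "d1", "d9", "d4", "d8"],
                           {"d1", "d3", "d4"}))
```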

Ranked Retrieval - Mean Average Precision. Why just 11 points? Why not average over all points? This is roughly equivalent to measuring the area under the precision-recall curve.
[Figure: precision-recall curve with the area under the curve shaded.]
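
A sketch of that idea: non-interpolated average precision averages the precision at every rank where a relevant document appears, and mean average precision (MAP) averages that over a set of queries. The names and toy data are illustrative.

```python
def average_precision(ranked_ids, relevant):
    """Mean of the precision values at each rank holding a relevant doc."""
    hits, precisions = 0, []
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """`runs` is a list of (ranked_ids, relevant_set) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

runs = [(["d3", "d7", "d1"], {"d1", "d3"}),
        (["d2", "d5"], {"d5"})]
print(mean_average_precision(runs))  # (0.833 + 0.5) / 2
```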

Ranked Retrieval - Precision at k. Users don't care about results past a page or two, so area under the curve is too naive. Let's evaluate precision over the top k results instead. The top k results could fall anywhere on the curve, and the score is highly dependent on the number of relevant documents: if k is 20 and there are only 8 relevant documents, the best possible score is 8/(8+12) = 0.4.
[Figure: precision-recall curve showing that a top-k cutoff can land anywhere along it.]
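
A sketch of precision at k that reproduces the slide's bound: with k = 20 and only 8 relevant documents, even a perfect ranking scores 8/20 = 0.4. The doc ids are made up.

```python
def precision_at_k(ranked_ids, relevant, k):
    """Fraction of the top k results that are relevant."""
    return sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant) / k

# Perfect ranking: the 8 relevant docs first, then 12 non-relevant ones.
relevant = {f"rel{i}" for i in range(8)}
perfect = [f"rel{i}" for i in range(8)] + [f"junk{i}" for i in range(12)]
print(precision_at_k(perfect, relevant, 20))  # 0.4, the best possible P@20
```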

Ranked Retrieval - Precision at R. We know the number of relevant documents, R, so rather than looking at the top k results, let's look at the top R results. If R is 20, the best score is 20/20 = 1.0; the best score is always 1.0.
[Figure: precision-recall curve with the Precision at R point marked.]
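
R-precision is just precision at k with k set to the number of relevant documents R, which is why a perfect ranking always scores 1.0. A minimal sketch with made-up doc ids:

```python
def r_precision(ranked_ids, relevant):
    """Precision over the top R results, where R = number of relevant docs."""
    r = len(relevant)
    return sum(1 for doc_id in ranked_ids[:r] if doc_id in relevant) / r

relevant = {f"rel{i}" for i in range(20)}
perfect = [f"rel{i}" for i in range(20)] + ["junk1", "junk2"]
print(r_precision(perfect, relevant))  # 1.0 for a perfect ranking
```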

Ranked Retrieval - Precision at R. It turns out that Precision at R is the break-even point, where precision and recall are equal. Do we care about this point for any rational reason?
[Figure: precision-recall curve with the Precision at R (break-even) point marked.]

Critiques of relevance. Is the relevance of one document independent of another? Is a gold standard possible? Is a gold standard static? Uniform? Binary? Perhaps relevance as a ranking is better. Relevance versus marginal relevance: what does another document add?

Refining a deployed system. Once you have a system with metrics, how do you go about changing the system to improve those metrics? A common approach is A/B testing; Google does this for clients, Amazon does it for itself, and so do many others. The idea: treat a small fraction of your users as an experimental group, have them use the modified system, and evaluate the metrics on that group.
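
As a sketch of the "evaluate metrics on the experimental group" step, assuming the metric is a per-user success rate such as clickthrough: a two-proportion z-test is one common way to judge whether the difference between the groups is more than noise. The counts below are made up.

```python
from math import sqrt, erfc

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference between two proportions."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided p-value under the normal
    return z, p_value

# Hypothetical clickthrough counts: control group A vs. experimental group B.
z, p = two_proportion_z_test(success_a=4_120, n_a=50_000,
                             success_b=4_480, n_b=50_000)
print(f"z = {z:.2f}, p = {p:.4f}")  # a small p suggests a real difference
```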

Evaluation in IR - the gold standard approach. A corpus, a set of queries, and a gold standard of relevance judgments are fed to both the standard search engine and the proposed alternative. Metrics are collected for each engine and compared statistically, leading to experimental validation or rejection of the alternative.
[Figure: flowchart of the offline gold-standard evaluation pipeline.]
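
A sketch of the statistical-comparison step, assuming per-query average precision as the metric (the average_precision function sketched earlier) and SciPy's paired t-test. The run and gold-standard dictionaries are illustrative.

```python
from scipy.stats import ttest_rel  # paired t-test over per-query scores

def per_query_ap(run, gold):
    """run: dict query -> ranked doc ids; gold: dict query -> relevant set.
    Reuses average_precision() from the MAP sketch above."""
    return [average_precision(run[q], gold[q]) for q in gold]

def compare_runs(standard_run, alternative_run, gold, alpha=0.05):
    """Validate the alternative only if it is significantly better."""
    base = per_query_ap(standard_run, gold)
    alt = per_query_ap(alternative_run, gold)
    stat, p_value = ttest_rel(alt, base)
    return ("validate" if stat > 0 and p_value < alpha else "reject"), p_value
```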

Online A/B approach. The user base is split into Group A (the majority of traffic, served by the standard search engine) and Group B (experimental traffic, served by the proposed alternative). Metrics are collected for each group and compared statistically, leading to experimental validation or rejection of the alternative. This requires users, an infrastructure to support experimental traffic, and metrics that don't require a gold standard.
[Figure: flowchart of the online A/B testing pipeline.]

[Slides: screenshot examples from Amazon and Google.]

Snippets. Little bits of text that summarize the page. They function as an implicit tool for users to rank the results on their own (among those visible); the user does the final ranking. Users are still biased by the presented order, though.

Snippets. The goal of snippet generation is to present the most informative bit of a document in light of the query; to present something self-contained, i.e., a clause or a sentence; to present something short enough to fit in the output; and to be fast and accurate (where are the snippets stored?). Challenges: multiple occurrences of a keyword in the document, and poor English (or other language) grammar.

Snippets. Snippets can be static: a snippet for a web page is precompiled and always the same. Snippets can be dynamic: the snippet depends on the query (compare the results for the queries "informatics" and "informatics definition").
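
A minimal sketch of a dynamic (query-biased) snippet along the lines described above: split the document into sentences and return the one with the most query-term matches, truncated for display. Real engines handle term positions, casing, grammar, and multiple keyword occurrences far more carefully; the names and toy document are illustrative.

```python
import re

def dynamic_snippet(document, query, max_chars=160):
    """Pick the sentence with the most query-term hits as the snippet."""
    terms = set(query.lower().split())
    sentences = re.split(r"(?<=[.!?])\s+", document)

    def score(sentence):
        words = set(re.findall(r"\w+", sentence.lower()))
        return len(terms & words)

    best = max(sentences, key=score)
    return best[:max_chars] + ("..." if len(best) > max_chars else "")

doc = ("Informatics studies the practice of information processing. "
       "The department offers a definition of informatics on its site. "
       "Courses start in the fall.")
print(dynamic_snippet(doc, "informatics definition"))
```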

Snippets. Snippets may contain: a few sentences from the web page; metadata about the page (author, date, title); the output of a text-summarization algorithm (advanced technology that attempts to write snippets); or images from the document.