Information Retrieval: Test Collections

Gintarė Grigonytė
gintare@ling.su.se
Department of Linguistics and Philology, Uppsala University

Slides based on the previous IR course given by K.F. Heppin (2013-15) and the Introduction to Information Retrieval slides, https://nlp.stanford.edu/ir-book/ppt/

Gintarė Grigonytė 1/65
Overview

- User queries
- The Cranfield paradigm
- Test collections
- Indri query language
Information need

An information need is the underlying cause of the query that a person submits to a search engine. It can be categorized along a variety of dimensions:

- type of information needed: domain > subject, question that needs to be answered
- level of expertise: professional / layperson
- type of task that led to the requirement for information: are you writing a paper, preparing for a meeting, or studying for an exam?
Topic - a formal information need

<top>
<num> C208 </num>
<EN-title> "Sophie's World" </EN-title>
<EN-desc> Find documents about the editorial success of the book "Sophie's World" by Jostein Gaarder. </EN-desc>
<EN-narr> Relevant documents should describe the topic of "Sophie's World", and should mention its sales success. </EN-narr>
</top>
Queries

A query is the formulation of a user's information need, put to the system. Keyword-based queries are popular since they are intuitive, easy to express, and allow for fast ranking. However, a query can also be a more complex combination of operations using different kinds of operators.
Information need vs. queries
User queries

The term query is used both for the input the user gives to the system and for the modified version of this input which the system uses for matching against the index terms. In Web search, the user writes a keyword-based query in the search field, which the system translates into a more complex form for the matching process.
Intermediary queries

Query languages in the past were designed for professional searchers, called intermediaries. The user would put a natural language query to the intermediary, who translated it into a query the system could interpret.
Keyword queries

For most retrieval models, the result of a keyword query is the set of documents containing at least one of the words of the query. The resulting documents are ranked according to their degree of similarity to the query. How the ranking is done depends on the retrieval model.
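The retrieve-then-rank idea above can be sketched in a few lines of Python. This is a minimal illustration over a toy in-memory collection; the term-overlap score is a deliberately simple stand-in for whatever similarity measure the actual retrieval model defines.

```python
# Minimal keyword retrieval sketch: return documents containing at least
# one query term, ranked by a simple term-overlap score (a stand-in for
# the similarity measure of a real retrieval model).
docs = {
    "d1": "the book sophie's world was a sales success",
    "d2": "jostein gaarder wrote sophie's world",
    "d3": "metallurgy research at cranfield",
}

def search(query, docs):
    q_terms = set(query.lower().split())
    scored = []
    for doc_id, text in docs.items():
        d_terms = text.lower().split()
        # score = how many query-term occurrences the document contains
        score = sum(1 for t in d_terms if t in q_terms)
        if score > 0:  # keep only docs matching at least one query term
            scored.append((doc_id, score))
    # rank by decreasing similarity to the query
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

print(search("sophie's world success", docs))  # → [('d1', 3), ('d2', 2)]
```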
Query in Indri query language
Query modification
Query expansion
Facets
Visualizing facets with Boolean syntax
Luhn's significant terms

H. P. Luhn, "The automatic creation of literature abstracts", IBM Journal of Research and Development, vol. 2, no. 2, pp. 159-165, April 1958. doi:10.1147/rd.22.0159
How do we know which searches are good?
Use a test collection
Feedback

The user indicates, consciously or unconsciously, which documents are relevant to their query, or which terms extracted from those documents are relevant. The user or the system then constructs a new query from this information by:

- boosting weights of terms from relevant documents
- adding terms from relevant documents to the query

Idea: you may not know what you're looking for, but you'll know it when you see it.
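Both feedback operations above are commonly implemented Rocchio-style: reweight the current query vector and fold in terms from the judged-relevant documents. The sketch below is illustrative; the term weights and the alpha/beta parameter values are assumptions, not values from the course.

```python
# Rocchio-style query modification: boost the weights of existing query
# terms (alpha) and add/boost terms from relevant documents (beta).
# Weights and parameter values here are illustrative assumptions.
from collections import Counter

def rocchio(query_vec, relevant_docs, alpha=1.0, beta=0.75):
    """query_vec: Counter of term weights; relevant_docs: list of Counters."""
    new_query = Counter({t: alpha * w for t, w in query_vec.items()})
    for doc in relevant_docs:
        for term, w in doc.items():
            # terms already in the query get boosted; new terms get added
            new_query[term] += beta * w / len(relevant_docs)
    return new_query

q = Counter({"sophie": 1.0, "world": 1.0})
rel = [Counter({"sophie": 2, "world": 1, "sales": 1})]
print(rocchio(q, rel))  # sophie boosted most; "sales" newly added
```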
Test collection environment

To test and compare search strategies you need a laboratory environment that doesn't change: a test collection. With it you can:

- determine how well IR systems perform
- compare the performance of an IR system with that of other systems
- compare search algorithms
- compare search strategies
The Cranfield paradigm

Evaluation of IR systems is the result of early experimentation initiated by Cyril Cleverdon. In 1957 Cleverdon started a series of projects, called the Cranfield projects, that lasted for about 10 years, in which he and his colleagues set the stage for information retrieval research. In the Cranfield projects, retrieval experiments were conducted on test databases in a controlled, laboratory-like setting. The Cranfield projects provided a foundation for the evaluation of IR systems.
The laboratory environment

Experiments at the Cranfield College of Aeronautics. The objective was to study which kinds of indexing languages were most effective. At this time, documents were indexed manually with a few keywords from controlled vocabularies/thesauri.

- 1,100 documents of research in metallurgy, small enough to have every document assessed for relevance to every topic
- a database with, for the time, a large set of documents
- a set of information needs expressed in plain text
- a relevance judgement for every document in relation to every information need
Recall, precision
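The two measures can be stated directly in code. Given the set of documents a system retrieved and the set of documents known to be relevant (which a test collection provides), precision is the fraction of retrieved documents that are relevant, and recall is the fraction of relevant documents that were retrieved. A minimal sketch, with made-up document ids:

```python
# Set-based precision and recall for a single query, assuming we know
# the full set of relevant documents (as a test collection provides).
def precision_recall(retrieved, relevant):
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)                 # relevant docs retrieved
    precision = tp / len(retrieved) if retrieved else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    return precision, recall

p, r = precision_recall(["d1", "d2", "d4"], ["d1", "d2", "d3"])
print(p, r)  # 2 of 3 retrieved are relevant; 2 of 3 relevant were found
```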
Implications for evaluation

- Model a real user application, with realistic information needs.
- Collect enough documents and create enough topics to allow significance testing of results.
- Make relevance judgments before the experiments, which prevents human bias and enables re-usability.
- Run strategies A and B.
- Evaluate A and B using appropriate metrics.
- Compare A with B statistically.
- State whether A works better than B, A and B are equivalent, or B works better than A.
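One simple way to do the statistical comparison step is a paired sign test over per-topic scores: count on how many topics A beats B, and ask how likely that split would be if the two strategies were equivalent. The per-topic scores below are invented for illustration; the test itself is standard.

```python
# Comparing strategies A and B statistically: a paired sign test over
# per-topic effectiveness scores (the scores below are made up).
from math import comb

def sign_test(scores_a, scores_b):
    wins = sum(a > b for a, b in zip(scores_a, scores_b))
    losses = sum(a < b for a, b in zip(scores_a, scores_b))
    n = wins + losses                    # topics with ties are discarded
    k = max(wins, losses)
    # two-sided p-value under H0: A and B are equivalent (win prob = 0.5)
    p = 2 * sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(p, 1.0)

a = [0.61, 0.55, 0.70, 0.48, 0.66, 0.59, 0.73, 0.52]
b = [0.58, 0.50, 0.65, 0.49, 0.60, 0.54, 0.69, 0.47]
print(sign_test(a, b))  # A wins on 7 of 8 topics → p ≈ 0.07
```

With only 8 topics even a 7-1 split is not significant at the 0.05 level, which is why the slide stresses creating enough topics.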
Test collection

To test and compare strategies a test collection is needed. A test collection is a laboratory testbed representing the real world. A test collection consists of:

- a static set of documents
- a set of information needs/topics
- a set of known relevant documents for each of the information needs
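In TREC-style test collections, the third component (the known relevant documents) is distributed as a "qrels" file with one judgement per line: topic id, an iteration field, document id, and a relevance value. The sample lines below are invented for illustration; only the format is assumed to be TREC-style.

```python
# Parsing TREC-style relevance judgements ("qrels") into a mapping from
# topic id to its set of known relevant documents. Sample data invented.
qrels_text = """\
C208 0 LA010189-0018 1
C208 0 LA010189-0022 0
C209 0 LA010290-0041 1
"""

def parse_qrels(text):
    relevant = {}  # topic id -> set of known relevant doc ids
    for line in text.splitlines():
        topic, _iteration, doc_id, judgement = line.split()
        if int(judgement) > 0:          # keep only positive judgements
            relevant.setdefault(topic, set()).add(doc_id)
    return relevant

print(parse_qrels(qrels_text))
```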
Test collection based IR evaluation

System function:
- separate relevant from non-relevant documents
- rank relevant above non-relevant documents
- rank highly relevant above less relevant documents

Purpose of evaluation:
- decide how well a system performs the functions above
- determine the best system/queries/algorithms
Why not any other way?

Should users be involved? Why?
Why not use a web search engine?
Precision/Recall
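Precision and recall can also be combined over a ranked result list. A widely used single-number summary is average precision: the mean of the precision values at each rank where a relevant document appears. A minimal sketch with invented document ids:

```python
# Average precision (AP) for one ranked result list: the mean of the
# precision values at the ranks where relevant documents appear.
def average_precision(ranking, relevant):
    relevant = set(relevant)
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank          # precision at this rank
    return total / len(relevant) if relevant else 0.0

ap = average_precision(["d2", "d5", "d1", "d7"], ["d1", "d2"])
print(ap)  # relevant docs found at ranks 1 and 3: (1/1 + 2/3) / 2
```

Averaging AP over all topics of a test collection gives mean average precision (MAP), a standard metric for comparing systems.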
Indri query language

- terms
- field restrictions
- numeric
- combining beliefs
- field/passage retrieval
- filters

Quick reference: https://www.lemurproject.org/lemur/indriquerylanguage.php
Reference: https://sourceforge.net/p/lemur/wiki/
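A few illustrative queries in the Indri query language (the terms are made up; see the references above for the full operator set):

```
#combine( sophie world )     -- belief combination of two terms
#band( sophie world )        -- boolean AND
#1( sophie world )           -- exact phrase (ordered window, width 1)
#uw5( sales success )        -- unordered window of width 5
sophie.title                 -- term restricted to the title field
```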
Next

Lab 2:
- building an index from a document collection
- querying with the Indri query language
- evaluation