Information Retrieval: Learning to Rank
Ilya Markov (i.markov@uva.nl), University of Amsterdam

Course overview. Offline: Data Acquisition, Data Processing, Data Storage. Online: Query Processing, Ranking, Evaluation. Advanced: Aggregated Search, Click Models, Present and Future of IR.

This lecture covers the Ranking part of that map: Offline: Data Acquisition, Data Processing, Data Storage. Online: Query Processing, Ranking, Evaluation. Advanced: Aggregated Search, Click Models, Present and Future of IR.

Outline: 1. Current trends in IR. 2. Learning to rank.

IR conferences: ACM Conference on Research and Development in Information Retrieval (SIGIR); ACM Conference on Information and Knowledge Management (CIKM); ACM Conference on Web Search and Data Mining (WSDM); European Conference on Information Retrieval (ECIR).

IR journals: ACM Transactions on Information Systems (TOIS); Information Retrieval Journal (IRJ); Information Processing and Management (IPM).

Surveys: Foundations and Trends in Information Retrieval (FnTIR); Synthesis Lectures on Information Concepts, Retrieval, and Services (Morgan & Claypool Publishers).

SIGIR 2016 topics: evaluation; efficiency; retrieval models, learning-to-rank, web search; users, user needs, search behavior; novelty and diversity; speech, conversation systems, question answering; recommendation systems; entities and knowledge graphs.

WSDM 2016 topics: communities, social interaction, social networks; search and semantics; observing users, leveraging users; big data algorithms; entities and structure.

Ranking methods: 1. Content-based (term-based, semantic). 2. Link-based (web search). 3. …

Outline: 1. Current trends in IR. 2. Learning to rank: machine learning, features, LTR approaches, experimental comparison, summary.

Machine learning. Traditional ML solves a prediction problem (classification or regression) on a single instance at a time. Input: $\{x_i\}_{i=1}^n$. Output: $\{y_i\}_{i=1}^n$. Learn a model $h(x)$ that optimizes a loss function $L(h(x), y)$. For a new instance $x_{\text{new}}$, predict the output $y = h(x_{\text{new}})$.

The aim of LTR, by contrast, is to come up with an optimal ordering of items, where the relative ordering among the items is more important than the exact score that each item gets. (Figure 1.1: the learning-to-rank framework; T.-Y. Liu, Learning to Rank for Information Retrieval.)

Outline: 2. Learning to rank. Up next: machine learning.

Machine learning. Input: $\{x_i\}_{i=1}^n$. Output: $\{y_i\}_{i=1}^n$. Learn a model $h(x)$ that optimizes a loss function $L(h(x), y)$. Examples: a linear model $h(x_i) = w^\top x_i = \sum_{k=1}^{l} w_k x_{ik}$ and a quadratic loss function $L(h(x_i), y_i) = (h(x_i) - y_i)^2$. How to learn the model $h(x)$, i.e., how to estimate its parameters?

Learning the model $h(x)$. If there is a closed-form solution for the parameters of $h(x)$: 1. Compute the derivative of the loss function $L$ with respect to some parameter $w_k$. 2. Equate this derivative to zero: $\partial L / \partial w_k = 0$. 3. Find the optimal value of the parameter $w_k$. If there is no closed-form solution, use gradient descent: 1. Compute or approximate the gradient of the loss function $\nabla L = [\partial L / \partial w_1, \ldots, \partial L / \partial w_l]$ using the current values of the parameters. 2. Update the model parameters by taking a small step in the opposite direction of the gradient: $w \leftarrow w - \eta \nabla L$.
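To make both routes concrete, here is a minimal sketch (not from the slides) for the linear model with quadratic loss above, fitted once via the normal equations and once via gradient descent; the synthetic data and step size are illustrative assumptions.

```python
# Minimal sketch: linear model h(x) = w^T x with quadratic loss,
# learned via (1) the closed form and (2) gradient descent.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))          # 100 instances, l = 5 features
w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ w_true + 0.1 * rng.normal(size=100)

# Route 1: closed form. Setting dL/dw = 0 gives the normal equations.
w_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Route 2: gradient descent. For L = sum_i (w^T x_i - y_i)^2,
# the gradient is dL/dw = 2 X^T (X w - y).
w = np.zeros(5)
eta = 0.01                              # step size (assumed)
for _ in range(1000):
    grad = 2 * X.T @ (X @ w - y) / len(y)
    w -= eta * grad                     # small step against the gradient

print(np.allclose(w, w_closed, atol=1e-2))  # both recover ~w_true
```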

Gradient descent. (Picture taken from https://en.wikipedia.org/wiki/Gradient_descent)

Outline: 2. Learning to rank. Up next: features.

(Figure 1.1 again: the learning-to-rank framework; T.-Y. Liu, Learning to Rank for Information Retrieval.)

Query-document representation. Each query-document pair $(q^{(n)}, x_i)$ is represented as a vector of features $x_i^{(n)} = [x_{i1}^{(n)}, x_{i2}^{(n)}, \ldots, x_{il}^{(n)}]$. Feature families: content-based, link-based, user-based.

Content-based features. In LETOR, features are extracted per query-document pair (Liu, Table 6.2, "Learning features of TREC"): for each of five fields (body, anchor, title, URL, whole document), one feature of each type: term frequency (TF), inverse document frequency (IDF), TF*IDF, document length (DL), BM25, and language-model scores (LMIR.ABS and further LMIR variants). For the OHSUMED corpus, 40 features were extracted in total (Liu, Table 6.3). (T.-Y. Liu, Learning to Rank for Information Retrieval)
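As a rough illustration of how a few of these content-based features could be computed per field, here is a small sketch; the toy collection, field contents, and the smoothed IDF variant are assumptions for illustration, not LETOR's exact recipe.

```python
# Illustrative per-field TF and TF*IDF features for one query term.
import math
from collections import Counter

def tf(term, field_text):
    """Raw term frequency of `term` in one document field."""
    return Counter(field_text.lower().split())[term]

def idf(term, collection):
    """Smoothed inverse document frequency over the collection."""
    df = sum(term in doc.lower().split() for doc in collection)
    return math.log((len(collection) - df + 0.5) / (df + 0.5) + 1)

collection = ["learning to rank tutorial", "web search basics",
              "rank aggregation methods"]          # toy collection
doc_fields = {"title": "learning to rank", "body": "we rank documents here"}

query_term = "rank"
features = {}
for field, text in doc_fields.items():
    features[f"tf_{field}"] = tf(query_term, text)
    features[f"tfidf_{field}"] = tf(query_term, text) * idf(query_term, collection)
print(features)
```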

Link-based features (Liu, Table 6.2 continued, features 40-64): LMIR.JM of whole document; sitemap-based term propagation; sitemap-based score propagation; hyperlink-based score propagation (weighted in-link, weighted out-link, uniform out-link); hyperlink-based feature propagation (weighted in-link, weighted out-link, uniform out-link); HITS authority; HITS hub; PageRank; HostRank; Topical PageRank; Topical HITS authority; Topical HITS hub; in-link number; out-link number; number of slashes in URL; length of URL; number of child pages; BM25 of extracted title; LMIR.ABS, LMIR.DIR, and LMIR.JM of extracted title. (T.-Y. Liu, Learning to Rank for Information Retrieval)

User-based features, by type of interaction:
Clicks: click-through rate for $(q^{(n)}, x_i)$; average click rank for $(q^{(n)}, x_i)$.
Time: average dwell time for $(q^{(n)}, x_i)$; average time to first click, when this click is on $x_i$; average time to last click, when this click is on $x_i$.
Queries: number of reformulations before/after $q^{(n)}$; number of times $q^{(n)}$ is abandoned.
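For flavor, a small sketch of deriving two of the click-based metrics above (click-through rate and average click rank) for one query-document pair; the log format and helper names are invented for illustration.

```python
# Toy click log: (query, shown docs in rank order, clicked ranks).
impressions = [
    ("ltr tutorial", ["d1", "d2", "d3"], [1]),
    ("ltr tutorial", ["d2", "d1", "d3"], [2]),
    ("ltr tutorial", ["d1", "d3", "d2"], []),
]

def click_metrics(query, doc):
    shown, clicks = 0, []
    for q, docs, clicked_ranks in impressions:
        if q != query or doc not in docs:
            continue
        shown += 1
        rank = docs.index(doc) + 1
        if rank in clicked_ranks:
            clicks.append(rank)
    ctr = len(clicks) / shown if shown else 0.0
    avg_click_rank = sum(clicks) / len(clicks) if clicks else None
    return ctr, avg_click_rank

print(click_metrics("ltr tutorial", "d1"))  # CTR = 2/3, avg click rank = 1.5
```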

Outline: 2. Learning to rank. Up next: LTR approaches.

(Figure 1.1 again: the learning-to-rank framework; T.-Y. Liu, Learning to Rank for Information Retrieval.)

LTR approaches. (Figure: an overview of learning-to-rank approaches, including LambdaRank and LambdaMART; after T.-Y. Liu, Learning to Rank for Information Retrieval.)

Pointwise LTR. (Figure: given a query, a scoring function h() assigns a relevance estimate, e.g. Rel(red), to each document independently.)

Pointwise LTR reduces to traditional ML. Input: query-document feature vectors $x_i^{(n)} = [x_{i1}^{(n)}, x_{i2}^{(n)}, \ldots, x_{il}^{(n)}]$. Output: relevance labels $y_i$. Objective: learn a model $h(x)$ that correctly predicts the labels $y$.
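A minimal pointwise sketch, assuming scikit-learn and synthetic data: treat ranking as regression on graded relevance labels, then order documents by predicted score.

```python
# Pointwise LTR as regression, then sort by predicted relevance.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
X_train = rng.normal(size=(200, 4))      # query-document feature vectors
y_train = rng.integers(0, 3, size=200)   # graded relevance labels 0..2

model = LinearRegression().fit(X_train, y_train)

X_query = rng.normal(size=(10, 4))       # candidate docs for one new query
scores = model.predict(X_query)
ranking = np.argsort(-scores)            # documents in descending score order
print(ranking)
```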

Regression. (Picture taken from https://en.wikipedia.org/wiki/Regression_analysis)

Classification. (Picture taken from https://en.wikipedia.org/wiki/Statistical_classification)

Ordinal regression adds explicit or implicit constraints on the grade thresholds to the optimization problem: the explicit constraint simply takes the form $b_{k-1} \leq b_k$, while the implicit constraint uses redundant training examples to guarantee the ordinal relationship among thresholds. (Figure: sum-of-margin strategy; T.-Y. Liu, Learning to Rank for Information Retrieval.)
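A toy sketch of the thresholding idea behind ordinal regression (the threshold values are invented): a document's score is mapped to a grade by counting how many ordered thresholds $b_k$ it exceeds.

```python
# Mapping a real-valued score to an ordinal grade via ordered thresholds.
import numpy as np

thresholds = [0.0, 1.0, 2.5]        # b_1 <= b_2 <= b_3 (illustrative values)

def grade(score):
    # Number of thresholds the score exceeds = ordinal relevance grade.
    return int(np.searchsorted(thresholds, score))

print([grade(s) for s in (-0.5, 0.3, 1.7, 3.0)])  # -> [0, 1, 2, 3]
```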

Pointwise LTR. Pros: (+) intuitive interpretation of relevance; (+) clear how to obtain relevance judgements. Cons: (-) optimizes a different objective than IR does (e.g., predicting a correct class rather than producing a good ranking).

Pairwise LTR. (Figure: given a query, a function h() predicts preferences between document pairs, e.g. Pref(red > gray), rather than individual relevance scores.)

RankNet. Pointwise scoring function $f(x_i)$ with parameters $\{w_k\}_{k=1}^l$; write $f_i = f(x_i)$. Pairwise ground truth $\bar{P}_{ij} = I(x_i \succ x_j)$. The probability of $x_i \succ x_j$ is modeled using logistic regression: $P_{ij} = P(x_i \succ x_j) = \frac{1}{1 + e^{-\sigma(f_i - f_j)}}$. Pairwise loss function (cross entropy): $C = -\bar{P}_{ij} \log P_{ij} - (1 - \bar{P}_{ij}) \log(1 - P_{ij}) = (1 - \bar{P}_{ij})\,\sigma(f_i - f_j) + \log\left(1 + e^{-\sigma(f_i - f_j)}\right)$. RankNet optimizes the total number of pairwise errors.
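The simplified cost above translates directly into code; a numpy sketch (the function name and test values are illustrative):

```python
# RankNet pairwise cross-entropy cost:
# C = (1 - P_bar) * sigma * (f_i - f_j) + log(1 + exp(-sigma * (f_i - f_j)))
import numpy as np

def ranknet_cost(f_i, f_j, p_bar, sigma=1.0):
    diff = sigma * (f_i - f_j)
    return (1.0 - p_bar) * diff + np.log1p(np.exp(-diff))

print(ranknet_cost(2.0, 1.0, p_bar=1.0))  # small cost: pair ordered correctly
print(ranknet_cost(1.0, 2.0, p_bar=1.0))  # larger cost: pair inverted
```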

RankNet (cont'd). Optimize the cost $C$: $\frac{\partial C}{\partial f_i} = \sigma\left[(1 - \bar{P}_{ij}) - \frac{1}{1 + e^{\sigma(f_i - f_j)}}\right] = -\frac{\partial C}{\partial f_j}$. Update parameter $w_k$ of the function $f(x_i)$: $w_k \leftarrow w_k - \eta\left(\frac{\partial C}{\partial f_i}\frac{\partial f_i}{\partial w_k} + \frac{\partial C}{\partial f_j}\frac{\partial f_j}{\partial w_k}\right)$.

Speeding up RankNet training. Define $\lambda_{ij}$ as $\lambda_{ij} = \frac{\partial C}{\partial f_i} = \sigma\left[(1 - \bar{P}_{ij}) - \frac{1}{1 + e^{\sigma(f_i - f_j)}}\right]$. Let $I$ denote the set of pairs of indices $\{i, j\}$ for which $x_i$ should be ranked differently from $x_j$ for a given query $q$: $I = \{\{i, j\} \mid x_i \succ x_j\}$. Sum all contributions to update parameter $w_k$: $\delta w_k = -\eta \sum_{\{i,j\} \in I} \left(\lambda_{ij}\frac{\partial f_i}{\partial w_k} - \lambda_{ij}\frac{\partial f_j}{\partial w_k}\right) = -\eta \sum_i \lambda_i \frac{\partial f_i}{\partial w_k}$, where $\lambda_i = \sum_{j:\{i,j\} \in I} \lambda_{ij} - \sum_{j:\{j,i\} \in I} \lambda_{ij}$.

Interpreting the $\lambda$'s. $\lambda_i = \sum_{j:\{i,j\} \in I} \lambda_{ij} - \sum_{j:\{j,i\} \in I} \lambda_{ij}$. $\lambda_i$ is the sum of the forces applied to document $x_i$ shown for query $q$: all documents $x_j$ that should be ranked below $x_i$ push it up with force $\lambda_{ij}$, and all documents $x_j$ that should be ranked above $x_i$ push it down with force $\lambda_{ij}$.
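A sketch of this bookkeeping, assuming $\bar{P}_{ij} = 1$ for all pairs in $I$ (so $\lambda_{ij}$ reduces to $-\sigma / (1 + e^{\sigma(f_i - f_j)})$); under the update rule above, a negative $\lambda_i$ raises a document's score.

```python
# Per-pair lambdas summed into one lambda per document.
import numpy as np

def lambda_ij(f, i, j, sigma=1.0):
    # dC/df_i for a pair where x_i should outrank x_j (P_bar = 1).
    return -sigma / (1.0 + np.exp(sigma * (f[i] - f[j])))

def aggregate_lambdas(f, pairs):
    lam = np.zeros_like(f)
    for i, j in pairs:              # pairs (i, j) with x_i preferred over x_j
        l_ij = lambda_ij(f, i, j)
        lam[i] += l_ij              # contribution pushing x_i up
        lam[j] -= l_ij              # equal and opposite push on x_j
    return lam

f = np.array([0.2, 1.5, 0.9])       # current scores for one query
pairs = [(0, 1), (0, 2)]            # doc 0 should outrank docs 1 and 2
print(aggregate_lambdas(f, pairs))  # lambda_0 < 0: the update raises f_0
```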

Pairwise LTR. Pros: (+) easy to get preference judgements; (+) comes closer to optimizing the ranking. Cons: (-) still does not optimize the whole ranking; (-) higher computational complexity compared to pointwise LTR.

Listwise LTR. (Figure: for each query q1, ..., q4, the function h() produces a full ranking of documents, e.g. R(blue), R(yellow), ..., and each ranking is evaluated with a list-level measure such as NDCG(q1), ..., NDCG(q4).)

From RankNet to LambdaRank. (Figure, from C. Burges, From RankNet to LambdaRank to LambdaMART: An Overview, Fig. 1: a set of URLs ordered for a given query using a binary relevance measure; light gray bars are URLs that are not relevant to the query, dark blue bars are relevant URLs. Left: the total number of pairwise errors is thirteen. Right: by moving the top URL down three rank levels and the bottom relevant URL up five, the total number of pairwise errors has been reduced to eleven.) The RankNet cost thus decreases from 13 to 11, yet for IR measures like NDCG and ERR, which emphasize the top few results, the actual ranking gets worse. The black arrows denote the RankNet gradients (which increase with the number of pairwise errors); the red arrows are what we would actually like to see, and they are what LambdaRank provides.

LambdaRank. $\lambda_{ij}$ in RankNet: $\lambda_{ij} = \sigma\left[(1 - \bar{P}_{ij}) - \frac{1}{1 + e^{\sigma(f_i - f_j)}}\right]$. $\lambda_{ij}$ in LambdaRank: $\lambda_{ij} = \frac{-\sigma}{1 + e^{\sigma(f_i - f_j)}}\,|\Delta\mathrm{NDCG}|$, where $\Delta\mathrm{NDCG} = \mathrm{NDCG(orig.\ ranking)} - \mathrm{NDCG}(x_i \text{ and } x_j \text{ swapped})$. LambdaRank directly uses the ranking to compute the gradients (i.e., the $\lambda_{ij}$'s) instead of computing and optimizing a cost function.
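The $|\Delta\mathrm{NDCG}|$ factor can be computed by swapping two documents in the current ranking and re-evaluating NDCG; a sketch using the common $2^{rel} - 1$ gain (the exact gain and discount are assumptions, as these vary across implementations):

```python
# |Delta NDCG| from swapping the documents at two rank positions.
import numpy as np

def dcg(rels):
    ranks = np.arange(1, len(rels) + 1)
    return np.sum((2.0 ** np.asarray(rels) - 1) / np.log2(ranks + 1))

def delta_ndcg(rels, i, j):
    ideal = dcg(sorted(rels, reverse=True))    # normalization constant
    swapped = list(rels)
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return abs(dcg(rels) - dcg(swapped)) / ideal

rels = [0, 2, 1, 0]            # graded labels of the currently ranked docs
print(delta_ndcg(rels, 0, 1))  # NDCG change from swapping positions 1 and 2
```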

LambdaRank (cont'd). Proceed similarly to RankNet: sum all $\lambda_{ij}$'s for document $x_i$ and query $q$: $\lambda_i = \sum_{j:\{i,j\} \in I} \lambda_{ij} - \sum_{j:\{j,i\} \in I} \lambda_{ij}$. Update parameter $w_k$ of the function $f(x_i)$: $w_k \leftarrow w_k - \eta \sum_i \lambda_i \frac{\partial f_i}{\partial w_k}$.

LambdaMART = Lambda + Multiple Additive Regression Trees (MART). MART does not need an explicit cost function, only its gradients, so it adopts the gradients (the $\lambda_{ij}$'s) from LambdaRank. Hence the name: Lambda + MART.
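LambdaMART implementations are readily available off the shelf; for example, LightGBM exposes it through LGBMRanker with the lambdarank objective. A usage sketch on synthetic data (not from the lecture; `group` gives the number of documents per query):

```python
# LambdaMART via LightGBM's LGBMRanker.
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 10))          # query-document feature vectors
y = rng.integers(0, 3, size=300)        # graded relevance labels 0..2
group = [30] * 10                       # 10 queries, 30 documents each

ranker = lgb.LGBMRanker(objective="lambdarank", n_estimators=50)
ranker.fit(X, y, group=group)
scores = ranker.predict(X[:30])         # score the first query's documents
print(np.argsort(-scores)[:5])          # its top-5 documents
```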

Listwise LTR. Pros: (+) directly optimizes the whole ranking. Cons: (-) needs many judgements; (-) high computational complexity.

Outline: 2. Learning to rank. Up next: experimental comparison.

LEarning TO Rank datasets (LETOR): query-document pairs with precomputed feature vectors and relevance judgements. http://research.microsoft.com/en-us/um/beijing/projects/letor/
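LETOR distributions use an SVM-light-style line format, roughly `<label> qid:<qid> 1:<v1> 2:<v2> ... # comment`; a small parser sketch (details vary across LETOR versions):

```python
# Parse one LETOR-style line into (label, query id, feature dict).
def parse_letor_line(line):
    line = line.split("#")[0].strip()   # drop the trailing comment
    tokens = line.split()
    label = int(tokens[0])
    qid = tokens[1].split(":")[1]
    feats = {int(k): float(v)
             for k, v in (tok.split(":") for tok in tokens[2:])}
    return label, qid, feats

print(parse_letor_line("2 qid:10 1:0.03 2:0.5 3:1.0 # docid=GX001"))
```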

Historical LTR datasets: the TREC 2003 and 2004 Web tracks, built on the Gov corpus (1,053,110 pages). Tasks: topic distillation (TD), homepage finding (HP), and named page finding (NP). Since it is too expensive to judge every document for a given query, the pooling strategy was used, and many research papers have used the three Gov tasks as their experimental platform. The OHSUMED corpus is a subset of MEDLINE, a database on medical publications; it consists of 348,566 records (out of over 7 million) from 270 medical journals during the years 1987-1991.

Number of queries in the TREC Web track (Liu, Table 6.1):
Task                  TREC 2003   TREC 2004
Topic distillation        50          75
Homepage finding         150          75
Named page finding       150          75

(T.-Y. Liu, Learning to Rank for Information Retrieval; see also http://trec.nist.gov/tracks.html)

Results on NP2003 (Liu, Table 6.7; N@k = NDCG@k, P@k = Precision@k):
Algorithm    N@1    N@3    N@10   P@1    P@3    P@10   MAP
Regression   0.447  0.614  0.665  0.447  0.220  0.081  0.564
RankSVM      0.580  0.765  0.800  0.580  0.271  0.092  0.696
RankBoost    0.600  0.764  0.807  0.600  0.269  0.094  0.707
FRank        0.540  0.726  0.776  0.540  0.253  0.090  0.664
ListNet      0.567  0.758  0.801  0.567  0.267  0.092  0.690
AdaRank      0.580  0.729  0.764  0.580  0.251  0.086  0.678
SVMmap       0.560  0.767  0.798  0.560  0.269  0.089  0.687
(T.-Y. Liu, Learning to Rank for Information Retrieval; SVMmap: http://svmrank.yisongyue.com/svmmap.php)

Results on HP2004 (Liu, Table 6.10):
Algorithm    N@1    N@3    N@10   P@1    P@3    P@10   MAP
Regression   0.387  0.575  0.646  0.387  0.213  0.080  0.526
RankSVM      0.573  0.715  0.768  0.573  0.267  0.096  0.668
RankBoost    0.507  0.699  0.743  0.507  0.253  0.092  0.625
FRank        0.600  0.729  0.761  0.600  0.262  0.089  0.682
ListNet      0.600  0.721  0.784  0.600  0.271  0.098  0.690
AdaRank      0.613  0.816  0.832  0.613  0.298  0.094  0.722
SVMmap       0.627  0.754  0.806  0.627  0.280  0.096  0.718

Results on OHSUMED (Liu, Table 6.11):
Algorithm    N@1    N@3    N@10   P@1    P@3    P@10   MAP
Regression   0.446  0.443  0.411  0.597  0.577  0.466  0.422
RankSVM      0.496  0.421  0.414  0.597  0.543  0.486  0.433
RankBoost    0.463  0.456  0.430  0.558  0.561  0.497  0.441
FRank        0.530  0.481  0.443  0.643  0.593  0.501  0.444
…
(T.-Y. Liu, Learning to Rank for Information Retrieval)

Experimental comparison: listwise ranking algorithms perform very well on most datasets, and ListNet seems to be better than the others; pairwise ranking algorithms obtain good ranking accuracy on some (although not all) datasets; linear regression performs worse than the pairwise and listwise ranking algorithms. (T.-Y. Liu, Learning to Rank for Information Retrieval)

Outline: 2. Learning to rank. Up next: summary.

Summary. Features: content-based, link-based, user-based. Approaches: pointwise (regression, classification, ordinal regression), pairwise (RankNet), listwise (LambdaRank, LambdaMART).

Materials: Tie-Yan Liu, Learning to Rank for Information Retrieval, Foundations and Trends in Information Retrieval, 2009. Christopher J.C. Burges, From RankNet to LambdaRank to LambdaMART: An Overview, Microsoft Research Technical Report, 2010.

Course overview. Offline: Data Acquisition, Data Processing, Data Storage. Online: Query Processing, Ranking, Evaluation. Advanced: Aggregated Search, Click Models, Present and Future of IR.

See you tomorrow! (Course map: Offline: Data Acquisition, Data Processing, Data Storage. Online: Query Processing, Ranking, Evaluation. Advanced: Aggregated Search, Click Models, Present and Future of IR.)