Information Retrieval
1 Information Retrieval Learning to Rank Ilya Markov University of Amsterdam Ilya Markov Information Retrieval 1
2 Course overview
- Offline: Data Acquisition, Data Processing, Data Storage
- Online: Query Processing, Ranking, Evaluation
- Advanced: Aggregated Search, Click Models, Present and Future of IR
3 This lecture
- Offline: Data Acquisition, Data Processing, Data Storage
- Online: Query Processing, Ranking, Evaluation
- Advanced: Aggregated Search, Click Models, Present and Future of IR
4 Outline
1. Current trends in IR
2. Learning to rank
5 IR conferences
- ACM Conference on Research and Development in Information Retrieval (SIGIR)
- ACM Conference on Information and Knowledge Management (CIKM)
- ACM Conference on Web Search and Data Mining (WSDM)
- European Conference on Information Retrieval (ECIR)
6 IR journals
- ACM Transactions on Information Systems (TOIS)
- Information Retrieval Journal (IRJ)
- Information Processing and Management (IPM)
7 Surveys
- Foundations and Trends in Information Retrieval (FnTIR)
- Synthesis Lectures on Information Concepts, Retrieval, and Services by Morgan & Claypool Publishers
8 SIGIR 2016
- Evaluation
- Efficiency
- Retrieval models, learning-to-rank, web search
- Users, user needs, search behavior
- Novelty and diversity
- Speech, conversation systems, question answering
- Recommendation systems
- Entities and knowledge graphs
9 WSDM 2016
- Communities, social interaction, social networks
- Search and semantics
- Observing users, leveraging users
- Big data algorithms
- Entities and structure
10 Ranking methods
1. Content-based
   - Term-based
   - Semantic
2. Link-based (web search)
3. Learning to rank
11 Outline
1. Current trends in IR
2. Learning to rank
   - Machine learning
   - Features
   - LTR approaches
   - Experimental comparison
   - Summary
12 Machine learning
Traditional ML solves a prediction problem (classification or regression) on a single instance at a time.
- Input: $\{x_i\}_{i=1}^{n}$
- Output: $\{y_i\}_{i=1}^{n}$
- Learn a model $h(x)$ that optimizes a loss function $L(h(x), y)$
- For a new instance $x_{new}$, predict the output $y = h(x_{new})$
13 Learning to rank
The aim of LTR is to come up with an optimal ordering of items, where the relative ordering among the items is more important than the exact score that each item gets.
[Figure: learning-to-rank framework (Fig. 1.1 in T.-Y. Liu, Learning to Rank for Information Retrieval)]
15 Machine learning
- Input: $\{x_i\}_{i=1}^{n}$
- Output: $\{y_i\}_{i=1}^{n}$
- Learn a model $h(x)$ that optimizes a loss function $L(h(x), y)$
Examples
- Linear model: $h(x_i) = w^T x_i = \sum_{k=1}^{l} w_k x_{ik}$
- Quadratic loss function: $L(h(x_i), y_i) = (h(x_i) - y_i)^2$
How to learn the model $h(x)$, i.e., how to estimate its parameters?
16 Learning the model h(x)
If there is a closed-form solution for the parameters of $h(x)$:
1. Compute the derivative of the loss function $L$ with respect to some parameter $w_k$
2. Equate this derivative to zero: $\frac{\partial L}{\partial w_k} = 0$
3. Find the optimal value of the parameter $w_k$
If there is no closed-form solution, use gradient descent:
1. Compute or approximate the gradient of the loss function, $\nabla L = \left[ \frac{\partial L}{\partial w_1}, \ldots, \frac{\partial L}{\partial w_l} \right]$, using the current values of the parameters
2. Update the model parameters by taking a small step in the opposite direction of the gradient: $w \leftarrow w - \eta \nabla L$
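Both routes can be sketched for the linear model with quadratic loss. This is a minimal illustration, assuming NumPy is available. The function names, the synthetic data, the learning rate `eta` and the number of steps are choices made here for illustration and do not come from the lecture.

```python
import numpy as np

def fit_closed_form(X, y):
    # Least-squares solution, i.e. the w for which dL/dw_k = 0 for all k.
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def fit_gradient_descent(X, y, eta=0.1, steps=2000):
    # Gradient descent on the mean quadratic loss of h(x) = w^T x.
    n, l = X.shape
    w = np.zeros(l)
    for _ in range(steps):
        residual = X @ w - y               # h(x_i) - y_i for every instance
        grad = 2.0 * (X.T @ residual) / n  # gradient of the mean squared loss
        w = w - eta * grad                 # small step against the gradient
    return w

# Synthetic, noise-free data (an assumption made for this sketch).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w

w_cf = fit_closed_form(X, y)
w_gd = fit_gradient_descent(X, y)
```

On this toy problem both estimates should agree closely. In practice the step size and the number of iterations need tuning, which is why the closed form is preferred when it exists.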
17 Gradient descent
[Figure: illustration of gradient descent]
19 [Figure: learning-to-rank framework (Fig. 1.1 in T.-Y. Liu, Learning to Rank for Information Retrieval)]
20 Query-document representation
Each query-document pair $(q^{(n)}, x_i)$ is represented as a vector of features
$x_i^{(n)} = [x_{i1}^{(n)}, x_{i2}^{(n)}, \ldots, x_{il}^{(n)}]$
Features
- Content-based
- Link-based
- User-based
21 Content-based features
For the OHSUMED corpus, 40 features were extracted in total, as shown in Table 6.3.
Table 6.2 Learning features of TREC.
ID | Feature description
1 | Term frequency (TF) of body
2 | TF of anchor
3 | TF of title
4 | TF of URL
5 | TF of whole document
6 | Inverse document frequency (IDF) of body
7 | IDF of anchor
8 | IDF of title
9 | IDF of URL
10 | IDF of whole document
11 | TF*IDF of body
12 | TF*IDF of anchor
13 | TF*IDF of title
14 | TF*IDF of URL
15 | TF*IDF of whole document
16 | Document length (DL) of body
17 | DL of anchor
18 | DL of title
19 | DL of URL
20 | DL of whole document
21 | BM25 of body
22 | BM25 of anchor
23 | BM25 of title
24 | BM25 of URL
25 | BM25 of whole document
26 | LMIR.ABS of body
27 | LMIR.ABS of anchor
28 | LMIR.ABS of title
29 | LMIR.ABS of URL
T.-Y. Liu, Learning to Rank for Information Retrieval
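A few of the content-based features in the table (TF, IDF, TF*IDF, DL) can be sketched for one text field as follows. This is only an illustration: the whitespace tokenizer, the definition IDF = log(N / df), and summing per-term values over the query terms are assumptions made here, and LETOR's exact definitions may differ. The function name `content_features` and the toy corpus are hypothetical.

```python
import math
from collections import Counter

def content_features(query, doc, corpus):
    """Return [TF, IDF, TF*IDF, DL] for one query-document pair.

    Per-term values are summed over the query terms (an assumption).
    """
    q_terms = query.lower().split()
    d_terms = doc.lower().split()
    counts = Counter(d_terms)
    n_docs = len(corpus)
    tf = idf = tfidf = 0.0
    for term in q_terms:
        # Document frequency: number of corpus documents containing the term.
        df = sum(1 for d in corpus if term in d.lower().split())
        term_tf = counts[term]
        term_idf = math.log(n_docs / df) if df else 0.0
        tf += term_tf
        idf += term_idf
        tfidf += term_tf * term_idf
    return [tf, idf, tfidf, float(len(d_terms))]

corpus = [
    "learning to rank for search",
    "rank aggregation",
    "cooking pasta at home",
]
feats = content_features("learning rank", corpus[0], corpus)
```

Running the same function on the body, anchor, title and URL text of a page would give the per-field variants listed in the table.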
22 Link-based features
Table 6.2 (Continued)
ID | Feature description
40 | LMIR.JM of whole document
41 | Sitemap based term propagation
42 | Sitemap based score propagation
43 | Hyperlink base score propagation: weighted in-link
44 | Hyperlink base score propagation: weighted out-link
45 | Hyperlink base score propagation: uniform out-link
46 | Hyperlink base feature propagation: weighted in-link
47 | Hyperlink base feature propagation: weighted out-link
48 | Hyperlink base feature propagation: uniform out-link
49 | HITS authority
50 | HITS hub
51 | PageRank
52 | HostRank
53 | Topical PageRank
54 | Topical HITS authority
55 | Topical HITS hub
56 | Inlink number
57 | Outlink number
58 | Number of slash in URL
59 | Length of URL
60 | Number of child page
61 | BM25 of extracted title
62 | LMIR.ABS of extracted title
63 | LMIR.DIR of extracted title
64 | LMIR.JM of extracted title
T.-Y. Liu, Learning to Rank for Information Retrieval
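Three of the link-based features in the table (inlink number, outlink number, PageRank) can be computed from an adjacency matrix. The sketch below uses plain power iteration. The damping factor 0.85, the uniform treatment of pages without outlinks, and the three-page toy graph are assumptions made for illustration, not details taken from the lecture or from LETOR.

```python
import numpy as np

def pagerank(adj, damping=0.85, iters=100):
    """Power iteration; adj[i, j] = 1 if page i links to page j."""
    n = adj.shape[0]
    out_deg = adj.sum(axis=1)
    # Row-stochastic transition matrix; pages without outlinks jump uniformly.
    P = np.where(out_deg[:, None] > 0,
                 adj / np.maximum(out_deg[:, None], 1),
                 1.0 / n)
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = (1.0 - damping) / n + damping * (P.T @ r)
    return r

# Toy web graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
adj = np.array([[0, 1, 1],
                [0, 0, 1],
                [1, 0, 0]], dtype=float)

inlinks = adj.sum(axis=0)    # "Inlink number" per page
outlinks = adj.sum(axis=1)   # "Outlink number" per page
ranks = pagerank(adj)
```

Unlike the content-based features, these values do not depend on the query, so they can be precomputed offline for every page.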
23 User-based features
Type of interaction | Online metric
Clicks | Click-through rate for $(q^{(n)}, x_i)$
Clicks | Avg. click rank for $(q^{(n)}, x_i)$
Time | Avg. dwell time for $(q^{(n)}, x_i)$
Time | Avg. time to first click, when this click is on $x_i$
Time | Avg. time to last click, when this click is on $x_i$
Queries | Number of reformulations before/after $q^{(n)}$
Queries | Number of times $q^{(n)}$ is abandoned
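Some of these metrics can be aggregated from an interaction log. The sketch below computes click-through rate, average click rank and average dwell time per query-document pair. The log format (a list of dicts with `query`, `doc`, `rank`, `clicked`, `dwell`) is hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def user_features(log):
    """Aggregate click-based features per (query, document) pair."""
    agg = defaultdict(lambda: {"shown": 0, "clicks": 0,
                               "rank_sum": 0, "dwell_sum": 0.0})
    for rec in log:
        a = agg[(rec["query"], rec["doc"])]
        a["shown"] += 1
        if rec["clicked"]:
            a["clicks"] += 1
            a["rank_sum"] += rec["rank"]
            a["dwell_sum"] += rec["dwell"]
    feats = {}
    for key, a in agg.items():
        c = a["clicks"]
        feats[key] = {
            "ctr": c / a["shown"],
            # Click rank and dwell time are undefined without clicks.
            "avg_click_rank": a["rank_sum"] / c if c else None,
            "avg_dwell": a["dwell_sum"] / c if c else None,
        }
    return feats

log = [
    {"query": "q", "doc": "d1", "rank": 1, "clicked": True, "dwell": 30.0},
    {"query": "q", "doc": "d1", "rank": 2, "clicked": False, "dwell": 0.0},
    {"query": "q", "doc": "d1", "rank": 3, "clicked": True, "dwell": 10.0},
    {"query": "q", "doc": "d2", "rank": 2, "clicked": False, "dwell": 0.0},
]
feats = user_features(log)
```

Pairs that were never clicked get `None` for the click-dependent features, so a downstream model has to handle missing values explicitly.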
25 [Figure: learning-to-rank framework (Fig. 1.1 in T.-Y. Liu, Learning to Rank for Information Retrieval)]
26 Learning to rank approaches
[Figure: overview of learning-to-rank approaches, with LambdaRank and LambdaMART marked. Based on T.-Y. Liu, Learning to Rank for Information Retrieval]
27 Pointwise LTR
[Figure: pointwise LTR. For a query, the model h() is trained on per-document relevance labels Rel(red), Rel(gray), Rel(orange) and then scores documents individually: h(blue), h(yellow), h(green), h(white)]
28 Pointwise LTR
Reduces to traditional ML
- Input: query-document feature vectors $x_i^{(n)} = [x_{i1}^{(n)}, x_{i2}^{(n)}, \ldots, x_{il}^{(n)}]$
- Output: relevance labels $y_i$
- Objective: learn a model $h(x)$ that correctly predicts labels $y$
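A pointwise ranker can be sketched as ordinary regression on relevance labels, followed by sorting the documents of a new query by predicted score. The single feature, the bias column and the toy labels below are assumptions made for illustration. Any regressor or classifier could take the place of the least-squares fit.

```python
import numpy as np

def add_bias(X):
    # Prepend a constant column so the linear model has an intercept.
    return np.hstack([np.ones((X.shape[0], 1)), X])

def fit_pointwise(X, y):
    # Regress relevance labels on feature vectors, one instance at a time.
    w, *_ = np.linalg.lstsq(add_bias(X), y, rcond=None)
    return w

def rank_documents(w, X):
    # Score each document independently, then sort by descending score.
    scores = add_bias(X) @ w
    order = [int(i) for i in np.argsort(-scores, kind="stable")]
    return order, scores

# Training pairs: one feature per query-document pair, graded labels.
X_train = np.array([[0.9], [0.5], [0.1]])
y_train = np.array([2.0, 1.0, 0.0])
w = fit_pointwise(X_train, y_train)

# Documents retrieved for a new query.
X_new = np.array([[0.3], [0.8], [0.6]])
ranking, scores = rank_documents(w, X_new)
```

The loss never looks at the order of the documents, only at how well each label is predicted, which is the mismatch with the IR objective that the pointwise approach is known for.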
29 Regression
[Figure: illustration of regression]
30 Classification
[Figure: illustration of classification]
31 Ordinal regression
[Figure: sum of margin strategy. The explicit constraint simply takes the form of $b_{k-1} \le b_k$, while the implicit constraint uses redundant training examples to guarantee the ordinal relationship among thresholds. From T.-Y. Liu, Learning to Rank for Information Retrieval]
32 Pointwise LTR
Pros
+ Intuitive interpretation of relevance
+ Clear how to obtain relevance judgements
Cons
- Has a different optimization objective compared to IR (e.g., finding a correct class)
33 Pairwise LTR
[Figure: pairwise LTR. For a query, the model h() is trained on pairwise preferences Pref(red>gray), Pref(gray>green), Pref(green>red) and then predicts preferences for new pairs, e.g. h(red>blue)]
34 RankNet
- Pointwise scoring function $f(x_i)$ with parameters $\{w_k\}_{k=1}^{l}$
- Pairwise ground truth: $\bar{P}_{ij} = I(x_i \succ x_j)$
- Probability of $x_i \succ x_j$ is modeled using logistic regression:
  $P_{ij} = P(x_i \succ x_j) = \frac{1}{1 + e^{-\sigma (f_i - f_j)}}$
- Pairwise loss function (cross entropy):
  $C = -\bar{P}_{ij} \log P_{ij} - (1 - \bar{P}_{ij}) \log(1 - P_{ij}) = (1 - \bar{P}_{ij}) \sigma (f_i - f_j) + \log(1 + e^{-\sigma (f_i - f_j)})$
- RankNet optimizes the total number of pairwise errors
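35 RankNet (cont'd)
- Optimize the cost $C$:
  $\frac{\partial C}{\partial f_i} = \sigma \left[ (1 - \bar{P}_{ij}) - \frac{1}{1 + e^{\sigma (f_i - f_j)}} \right] = -\frac{\partial C}{\partial f_j}$
- Update parameter $w_k$ of the function $f(x_i)$:
  $w_k \leftarrow w_k - \eta \left( \frac{\partial C}{\partial f_i} \frac{\partial f_i}{\partial w_k} + \frac{\partial C}{\partial f_j} \frac{\partial f_j}{\partial w_k} \right)$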
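36 Speeding up RankNet training
- Define $\lambda_{ij}$ as
  $\lambda_{ij} = \frac{\partial C}{\partial f_i} = \sigma \left[ (1 - \bar{P}_{ij}) - \frac{1}{1 + e^{\sigma (f_i - f_j)}} \right]$
- Let $I$ denote the set of pairs of indices $\{i, j\}$ for which $x_i$ should be ranked differently from $x_j$ for a given query $q$:
  $I = \{ \{i, j\} \mid x_i \succ x_j \}$
- Sum all contributions to update parameter $w_k$:
  $\delta w_k = -\eta \sum_{\{i,j\} \in I} \left( \lambda_{ij} \frac{\partial f_i}{\partial w_k} - \lambda_{ij} \frac{\partial f_j}{\partial w_k} \right) = -\eta \sum_i \left[ \sum_{j:\{i,j\} \in I} \lambda_{ij} - \sum_{j:\{j,i\} \in I} \lambda_{ij} \right] \frac{\partial f_i}{\partial w_k} = -\eta \sum_i \lambda_i \frac{\partial f_i}{\partial w_k}$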
37 Interpreting λ's
$\lambda_i = \sum_{j:\{i,j\} \in I} \lambda_{ij} - \sum_{j:\{j,i\} \in I} \lambda_{ij}$
- $\lambda_i$ is a sum of forces applied to document $x_i$ shown for query $q$
- All documents $x_j$ that should be ranked below $x_i$ push it up with the force $\lambda_{ij}$
- All documents $x_j$ that should be ranked above $x_i$ push it down with the force $\lambda_{ij}$
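The per-document forces $\lambda_i$ and the resulting update can be sketched for a linear scoring function $f(x) = w^T x$, for which $\partial f_i / \partial w = x_i$. This is a minimal illustration under several assumptions made here: a single query, graded labels that define the set $I$ (so $\bar{P}_{ij} = 1$ for every pair in $I$), and hypothetical toy features, learning rate and step count. RankNet as published uses a neural network rather than a linear model.

```python
import numpy as np

def ranknet_lambdas(scores, labels, sigma=1.0):
    """Return lambda_i for each document of one query."""
    n = len(scores)
    lambdas = np.zeros(n)
    for i in range(n):
        for j in range(n):
            if labels[i] > labels[j]:  # {i, j} in I: x_i should beat x_j
                # lambda_ij with ground truth P_ij = 1.
                lam_ij = -sigma / (1.0 + np.exp(sigma * (scores[i] - scores[j])))
                lambdas[i] += lam_ij   # x_j pushes x_i up
                lambdas[j] -= lam_ij   # x_i pushes x_j down
    return lambdas

def train_linear_ranknet(X, labels, eta=0.5, steps=200, sigma=1.0):
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        lambdas = ranknet_lambdas(X @ w, labels, sigma)
        # delta w_k = -eta * sum_i lambda_i * df_i/dw_k, with df_i/dw = x_i.
        w = w - eta * (X.T @ lambdas)
    return w

X = np.array([[1.0, 0.2], [0.6, 0.1], [0.1, 0.9], [0.2, 0.5]])
labels = np.array([2, 1, 0, 0])

initial_lambdas = ranknet_lambdas(np.zeros(4), labels)
w = train_linear_ranknet(X, labels)
scores = X @ w
```

With this sign convention a negative $\lambda_i$ raises the score of $x_i$ after the update, so documents that should move up accumulate negative values. The forces also sum to zero within a query, since every pair contributes equal and opposite terms.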
38 Pairwise LTR
Pros
+ Easy to get preference judgements
+ Comes closer to optimizing the ranking
Cons
- Still does not optimize the whole ranking
- Higher computational complexity compared to pointwise LTR
39 Listwise LTR
[Figure: listwise LTR. For each query (q1, q2, q3, q4), the model h() produces a whole ranked list of documents, and each list is evaluated with a ranking measure: NDCG(q1), NDCG(q2), NDCG(q3)]
40 From RankNet to LambdaRank
- The black arrows denote the RankNet gradients (which increase with the number of pairwise errors)
- RankNet cost decreases from 13 on the left to 11 on the right
- The actual ranking gets worse
- The red arrows are what we would actually like to see
[Figure (Fig. 1 in Burges): A set of urls ordered for a given query using a binary relevance measure. The light gray bars represent urls that are not relevant to the query, while the dark blue bars represent urls that are relevant to the query. Left: the total number of pairwise errors is thirteen. Right: by moving the top url down three rank levels, and the bottom relevant url up five, the total number of pairwise errors has been reduced to eleven. However, for IR measures like NDCG and ERR that emphasize the top few results, this is not what we want. The (black) arrows on the left denote the RankNet gradients (which increase with the number of pairwise errors), whereas what we'd really like are the (red) arrows on the right.]
C. Burges, From RankNet to LambdaRank to LambdaMART: An Overview
41 LambdaRank
- $\lambda_{ij}$ in RankNet:
  $\lambda_{ij} = \sigma \left[ (1 - \bar{P}_{ij}) - \frac{1}{1 + e^{\sigma (f_i - f_j)}} \right]$
- $\lambda_{ij}$ in LambdaRank:
  $\lambda_{ij} = \frac{-\sigma}{1 + e^{\sigma (f_i - f_j)}} \, |\Delta NDCG|$
  $\Delta NDCG = NDCG(\text{orig. ranking}) - NDCG(x_i \text{ and } x_j \text{ are swapped})$
- LambdaRank directly uses the ranking to compute gradients (i.e., the $\lambda_{ij}$'s) instead of computing and optimizing a cost function
42 LambdaRank (cont'd)
Proceed similarly to RankNet:
- Sum all $\lambda_{ij}$'s for document $x_i$ and query $q$:
  $\lambda_i = \sum_{j:\{i,j\} \in I} \lambda_{ij} - \sum_{j:\{j,i\} \in I} \lambda_{ij}$
- Update parameter $w_k$ of the function $f(x_i)$:
  $w_k \leftarrow w_k - \eta \sum_i \lambda_i \frac{\partial f_i}{\partial w_k}$
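The LambdaRank forces can be sketched by scaling each pairwise RankNet term by the NDCG change obtained from swapping the two documents. The NDCG variant used here (gain $2^{rel} - 1$, discount $\log_2(\text{rank} + 1)$, no cutoff) and the toy scores and labels are assumptions made for illustration. The lecture does not fix a particular NDCG definition.

```python
import numpy as np

def dcg(labels_in_rank_order):
    gains = 2.0 ** np.asarray(labels_in_rank_order, dtype=float) - 1.0
    discounts = np.log2(np.arange(2, len(gains) + 2))
    return float(np.sum(gains / discounts))

def ndcg(scores, labels):
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    ideal = dcg(sorted(labels, reverse=True))
    return dcg(np.asarray(labels)[order]) / ideal if ideal > 0 else 0.0

def lambdarank_lambdas(scores, labels, sigma=1.0):
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    base = ndcg(scores, labels)
    lambdas = np.zeros(len(scores))
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:  # {i, j} in I
                # Swapping the two scores swaps the two rank positions.
                swapped = scores.copy()
                swapped[i], swapped[j] = scores[j], scores[i]
                delta = abs(base - ndcg(swapped, labels))
                lam_ij = -sigma / (1.0 + np.exp(sigma * (scores[i] - scores[j]))) * delta
                lambdas[i] += lam_ij
                lambdas[j] -= lam_ij
    return lambdas

# One relevant document at the top, one at the bottom of the ranking.
scores = np.array([3.0, 2.0, 1.0, 0.0])
labels = np.array([1, 0, 0, 1])
lambdas = lambdarank_lambdas(scores, labels)
```

Pairs whose swap would change NDCG a lot, typically those involving top positions, receive larger forces than under RankNet, which weights all pairwise errors alike.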
43 LambdaMART
- Multiple Additive Regression Trees (MART)
- MART does not need a cost function, only gradients
- Adopts gradients ($\lambda_{ij}$'s) from LambdaRank
- Hence the name: Lambda + MART
44 Listwise LTR
Pros
+ Directly optimizes the whole ranking
Cons
- Needs many judgements
- High computational complexity
46 LEarning TO Rank datasets (LETOR)
- Query-document pairs: precomputed feature vectors
- Relevance judgements
47 Historical LTR datasets
- TREC 2003, 2004 Web IR track
  - Gov corpus with 1,053,110 pages
  - Tasks: TD (topic distillation), HP (homepage finding), NP (named page finding)
- The OHSUMED corpus
  - The OHSUMED corpus [64] is a subset of MEDLINE, a database on medical publications. It consists of 348,566 records (out of over 7 million) from 270 medical journals.
[Table 6.1: Number of queries in TREC web track. Columns: Task, TREC2003, TREC2004. Rows: Topic distillation, Homepage finding, Named page finding.]
T.-Y. Liu, Learning to Rank for Information Retrieval
48 Results on NP2003
[Table 6.7: Results on NP2003. Columns: Algorithm, MAP. Rows: Regression, RankSVM, RankBoost, FRank, ListNet, AdaRank, SVM map (tp://svmrank.yisongyue.com/svmmap.php).]
T.-Y. Liu, Learning to Rank for Information Retrieval
49 Results on HP2004
[Table 6.10: Results on HP2004. Columns: Algorithm, MAP. Rows: Regression, RankSVM, RankBoost, FRank, ListNet, AdaRank, SVM map.]
[Table 6.11: Results on OHSUMED. Columns: Algorithm, MAP. Rows: Regression, RankSVM, RankBoost, FRank.]
T.-Y. Liu, Learning to Rank for Information Retrieval
50 Experimental comparison
- Listwise ranking algorithms perform very well on most datasets
- ListNet seems to be better than the others
- Pairwise ranking algorithms obtain good ranking accuracy on some (although not all) datasets
- Linear regression performs worse than the pairwise and listwise ranking algorithms
T.-Y. Liu, Learning to Rank for Information Retrieval
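The results tables for this comparison report MAP (mean average precision). For reference, MAP over binary relevance judgements can be sketched as below. One simplification is assumed here: every relevant document for a query appears somewhere in the ranked list, so average precision is normalized by the number of relevant documents in the list. The function names and toy rankings are illustrative.

```python
def average_precision(ranked_relevance):
    """AP for one query; ranked_relevance holds 0/1 labels in rank order."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant hit
    return precision_sum / hits if hits else 0.0

def mean_average_precision(runs):
    """MAP: mean of per-query average precision."""
    return sum(average_precision(r) for r in runs) / len(runs)

runs = [
    [1, 0, 1, 0],  # query 1: relevant documents at ranks 1 and 3
    [0, 1],        # query 2: relevant document at rank 2
]
map_score = mean_average_precision(runs)
```

Because MAP depends only on the order of the documents, two rankers with very different score scales can obtain the same value.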
52 Learning to rank summary
- Features
  - Content-based
  - Link-based
  - User-based
- Approaches
  - Pointwise (regression, classification, ordinal regression)
  - Pairwise (RankNet)
  - Listwise (LambdaRank, LambdaMART)
53 Materials
- Tie-Yan Liu. Learning to Rank for Information Retrieval. Foundations and Trends in Information Retrieval, 2009.
- Christopher J.C. Burges. From RankNet to LambdaRank to LambdaMART: An Overview. Microsoft Research Technical Report, 2010.
55 See you tomorrow!
More informationA Formal Approach to Score Normalization for Meta-search
A Formal Approach to Score Normalization for Meta-search R. Manmatha and H. Sever Center for Intelligent Information Retrieval Computer Science Department University of Massachusetts Amherst, MA 01003
More informationLink Analysis and Web Search
Link Analysis and Web Search Moreno Marzolla Dip. di Informatica Scienza e Ingegneria (DISI) Università di Bologna http://www.moreno.marzolla.name/ based on material by prof. Bing Liu http://www.cs.uic.edu/~liub/webminingbook.html
More informationLearning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li
Learning to Match Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li 1. Introduction The main tasks in many applications can be formalized as matching between heterogeneous objects, including search, recommendation,
More informationChapter 6: Information Retrieval and Web Search. An introduction
Chapter 6: Information Retrieval and Web Search An introduction Introduction n Text mining refers to data mining using text documents as data. n Most text mining tasks use Information Retrieval (IR) methods
More informationA General Approximation Framework for Direct Optimization of Information Retrieval Measures
A General Approximation Framework for Direct Optimization of Information Retrieval Measures Tao Qin, Tie-Yan Liu, Hang Li October, 2008 Abstract Recently direct optimization of information retrieval (IR)
More informationIITH at CLEF 2017: Finding Relevant Tweets for Cultural Events
IITH at CLEF 2017: Finding Relevant Tweets for Cultural Events Sreekanth Madisetty and Maunendra Sankar Desarkar Department of CSE, IIT Hyderabad, Hyderabad, India {cs15resch11006, maunendra}@iith.ac.in
More informationInformation Retrieval
Multimedia Computing: Algorithms, Systems, and Applications: Information Retrieval and Search Engine By Dr. Yu Cao Department of Computer Science The University of Massachusetts Lowell Lowell, MA 01854,
More informationarxiv: v1 [cs.ir] 19 Sep 2016
Enhancing LambdaMART Using Oblivious Trees Marek Modrý 1 and Michal Ferov 2 arxiv:1609.05610v1 [cs.ir] 19 Sep 2016 1 Seznam.cz, Radlická 3294/10, 150 00 Praha 5, Czech Republic marek.modry@firma.seznam.cz
More informationPersonalized Web Search
Personalized Web Search Dhanraj Mavilodan (dhanrajm@stanford.edu), Kapil Jaisinghani (kjaising@stanford.edu), Radhika Bansal (radhika3@stanford.edu) Abstract: With the increase in the diversity of contents
More informationCPSC 340: Machine Learning and Data Mining. Ranking Fall 2016
CPSC 340: Machine Learning and Data Mining Ranking Fall 2016 Assignment 5: Admin 2 late days to hand in Wednesday, 3 for Friday. Assignment 6: Due Friday, 1 late day to hand in next Monday, etc. Final:
More informationOpinions in Federated Search: University of Lugano at TREC 2014 Federated Web Search Track
Opinions in Federated Search: University of Lugano at TREC 2014 Federated Web Search Track Anastasia Giachanou 1,IlyaMarkov 2 and Fabio Crestani 1 1 Faculty of Informatics, University of Lugano, Switzerland
More informationA Comparing Pointwise and Listwise Objective Functions for Random Forest based Learning-to-Rank
A Comparing Pointwise and Listwise Objective Functions for Random Forest based Learning-to-Rank MUHAMMAD IBRAHIM, Monash University, Australia MARK CARMAN, Monash University, Australia Current random forest
More informationAn Investigation of Basic Retrieval Models for the Dynamic Domain Task
An Investigation of Basic Retrieval Models for the Dynamic Domain Task Razieh Rahimi and Grace Hui Yang Department of Computer Science, Georgetown University rr1042@georgetown.edu, huiyang@cs.georgetown.edu
More informationInformation Retrieval. Lecture 7 - Evaluation in Information Retrieval. Introduction. Overview. Standard test collection. Wintersemester 2007
Information Retrieval Lecture 7 - Evaluation in Information Retrieval Seminar für Sprachwissenschaft International Studies in Computational Linguistics Wintersemester 2007 1 / 29 Introduction Framework
More informationInformation Retrieval
Information Retrieval Lecture 7 - Evaluation in Information Retrieval Seminar für Sprachwissenschaft International Studies in Computational Linguistics Wintersemester 2007 1/ 29 Introduction Framework
More informationUniversity of Delaware at Diversity Task of Web Track 2010
University of Delaware at Diversity Task of Web Track 2010 Wei Zheng 1, Xuanhui Wang 2, and Hui Fang 1 1 Department of ECE, University of Delaware 2 Yahoo! Abstract We report our systems and experiments
More informationS-MART: Novel Tree-based Structured Learning Algorithms Applied to Tweet Entity Linking
S-MART: Novel Tree-based Structured Learning Algorithms Applied to Tweet Entity Linking Yi Yang * and Ming-Wei Chang # * Georgia Institute of Technology, Atlanta # Microsoft Research, Redmond Traditional
More informationEvaluation. Evaluate what? For really large amounts of data... A: Use a validation set.
Evaluate what? Evaluation Charles Sutton Data Mining and Exploration Spring 2012 Do you want to evaluate a classifier or a learning algorithm? Do you want to predict accuracy or predict which one is better?
More informationOn the Effectiveness of Query Weighting for Adapting Rank Learners to New Unlabelled Collections
On the Effectiveness of Query Weighting for Adapting Rank Learners to New Unlabelled Collections Pengfei Li RMIT University, Australia li.pengfei@rmit.edu.au Mark Sanderson RMIT University, Australia mark.sanderson@rmit.edu.au
More informationDeveloping Focused Crawlers for Genre Specific Search Engines
Developing Focused Crawlers for Genre Specific Search Engines Nikhil Priyatam Thesis Advisor: Prof. Vasudeva Varma IIIT Hyderabad July 7, 2014 Examples of Genre Specific Search Engines MedlinePlus Naukri.com
More informationA Unified Approach to Learning Task-Specific Bit Vector Representations for Fast Nearest Neighbor Search
A Unified Approach to Learning Task-Specific Bit Vector Representations for Fast Nearest Neighbor Search Vinod Nair Yahoo! Labs Bangalore vnair@yahoo-inc.com Dhruv Mahajan Yahoo! Labs Bangalore dkm@yahoo-inc.com
More informationAdapting Ranking Functions to User Preference
Adapting Ranking Functions to User Preference Keke Chen, Ya Zhang, Zhaohui Zheng, Hongyuan Zha, Gordon Sun Yahoo! {kchen,yazhang,zhaohui,zha,gzsun}@yahoo-inc.edu Abstract Learning to rank has become a
More informationA Machine Learning Approach for Improved BM25 Retrieval
A Machine Learning Approach for Improved BM25 Retrieval Krysta M. Svore and Christopher J. C. Burges Microsoft Research One Microsoft Way Redmond, WA 98052 {ksvore,cburges}@microsoft.com Microsoft Research
More informationMultimedia Information Systems
Multimedia Information Systems Samson Cheung EE 639, Fall 2004 Lecture 6: Text Information Retrieval 1 Digital Video Library Meta-Data Meta-Data Similarity Similarity Search Search Analog Video Archive
More informationCPSC 340: Machine Learning and Data Mining. More Linear Classifiers Fall 2017
CPSC 340: Machine Learning and Data Mining More Linear Classifiers Fall 2017 Admin Assignment 3: Due Friday of next week. Midterm: Can view your exam during instructor office hours next week, or after
More informationCS6220: DATA MINING TECHNIQUES
CS6220: DATA MINING TECHNIQUES Image Data: Classification via Neural Networks Instructor: Yizhou Sun yzsun@ccs.neu.edu November 19, 2015 Methods to Learn Classification Clustering Frequent Pattern Mining
More informationDiversification of Query Interpretations and Search Results
Diversification of Query Interpretations and Search Results Advanced Methods of IR Elena Demidova Materials used in the slides: Charles L.A. Clarke, Maheedhar Kolla, Gordon V. Cormack, Olga Vechtomova,
More informationPart 11: Collaborative Filtering. Francesco Ricci
Part : Collaborative Filtering Francesco Ricci Content An example of a Collaborative Filtering system: MovieLens The collaborative filtering method n Similarity of users n Methods for building the rating
More informationProximity Prestige using Incremental Iteration in Page Rank Algorithm
Indian Journal of Science and Technology, Vol 9(48), DOI: 10.17485/ijst/2016/v9i48/107962, December 2016 ISSN (Print) : 0974-6846 ISSN (Online) : 0974-5645 Proximity Prestige using Incremental Iteration
More informationVisoLink: A User-Centric Social Relationship Mining
VisoLink: A User-Centric Social Relationship Mining Lisa Fan and Botang Li Department of Computer Science, University of Regina Regina, Saskatchewan S4S 0A2 Canada {fan, li269}@cs.uregina.ca Abstract.
More informationInformation Retrieval May 15. Web retrieval
Information Retrieval May 15 Web retrieval What s so special about the Web? The Web Large Changing fast Public - No control over editing or contents Spam and Advertisement How big is the Web? Practically
More informationUniversity of Virginia Department of Computer Science. CS 4501: Information Retrieval Fall 2015
University of Virginia Department of Computer Science CS 4501: Information Retrieval Fall 2015 5:00pm-6:15pm, Monday, October 26th Name: ComputingID: This is a closed book and closed notes exam. No electronic
More informationMining Web Data. Lijun Zhang
Mining Web Data Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Web Crawling and Resource Discovery Search Engine Indexing and Query Processing Ranking Algorithms Recommender Systems
More informationInformation Retrieval (IR) Introduction to Information Retrieval. Lecture Overview. Why do we need IR? Basics of an IR system.
Introduction to Information Retrieval Ethan Phelps-Goodman Some slides taken from http://www.cs.utexas.edu/users/mooney/ir-course/ Information Retrieval (IR) The indexing and retrieval of textual documents.
More informationAutomatically Building Research Reading Lists
Automatically Building Research Reading Lists Michael D. Ekstrand 1 Praveen Kanaan 1 James A. Stemper 2 John T. Butler 2 Joseph A. Konstan 1 John T. Riedl 1 ekstrand@cs.umn.edu 1 GroupLens Research Department
More informationMining Web Data. Lijun Zhang
Mining Web Data Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Web Crawling and Resource Discovery Search Engine Indexing and Query Processing Ranking Algorithms Recommender Systems
More informationA Deep Relevance Matching Model for Ad-hoc Retrieval
A Deep Relevance Matching Model for Ad-hoc Retrieval Jiafeng Guo 1, Yixing Fan 1, Qingyao Ai 2, W. Bruce Croft 2 1 CAS Key Lab of Web Data Science and Technology, Institute of Computing Technology, Chinese
More informationLEARNING to rank is a kind of learning based information
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. XX, NO. X, MARCH 2010 1 Ranking Model Adaptation for Domain-Specific Search Bo Geng, Member, IEEE, Linjun Yang, Member, IEEE, Chao Xu, Xian-Sheng
More informationRanking Algorithms For Digital Forensic String Search Hits
DIGITAL FORENSIC RESEARCH CONFERENCE Ranking Algorithms For Digital Forensic String Search Hits By Nicole Beebe and Lishu Liu Presented At The Digital Forensic Research Conference DFRWS 2014 USA Denver,
More informationMulti-label classification using rule-based classifier systems
Multi-label classification using rule-based classifier systems Shabnam Nazmi (PhD candidate) Department of electrical and computer engineering North Carolina A&T state university Advisor: Dr. A. Homaifar
More informationSearch Engines Chapter 8 Evaluating Search Engines Felix Naumann
Search Engines Chapter 8 Evaluating Search Engines 9.7.2009 Felix Naumann Evaluation 2 Evaluation is key to building effective and efficient search engines. Drives advancement of search engines When intuition
More informationFrom Passages into Elements in XML Retrieval
From Passages into Elements in XML Retrieval Kelly Y. Itakura David R. Cheriton School of Computer Science, University of Waterloo 200 Univ. Ave. W. Waterloo, ON, Canada yitakura@cs.uwaterloo.ca Charles
More information