Arama Motoru Gelistirme Dongusu: Siralamayi Ogrenme ve Bilgiye Erisimin Degerlendirilmesi. Retrieval Effectiveness and Learning to Rank

Size: px
Start display at page:

Download "Arama Motoru Gelistirme Dongusu: Siralamayi Ogrenme ve Bilgiye Erisimin Degerlendirilmesi. Retrieval Effectiveness and Learning to Rank"

Transcription

1 Arama Motoru Gelistirme Dongusu: Siralamayi Ogrenme ve Bilgiye Erisimin Degerlendirilmesi etrieval Effectiveness and Learning to ank EMIE YILMAZ Professor and Turing Fellow University College London esearch Consultant Microsoft esearch

2 LEAIG TO AK (LT)... the task to automatically construct a ranking model using training data, such that the model can sort new objects according to their degrees of relevance, preference, or importance. - Liu [2009] LT models represent a rankable item e.g., a document given some context e.g., a user-issued query as a numerical vector Ԧx n The ranking model f: Ԧx is trained to map the vector to a real-valued score such that relevant items are scored higher. Tie-Yan Liu. Learning to rank for information retrieval. Foundation and Trends in Information etrieval, 2009.

3 APPLICATIOS OF LEAIG TO AK Search (Document Search, Entity Search, etc.) ecommender Systems (Collaborative Filtering) Question Answering Document Summarization Opinion Mining Machine Translation

4 APPLICATIOS OF LEAIG TO AK Search (Document Search, Entity Search, etc.) ecommender Systems (Collaborative Filtering) Question Answering Document Summarization Opinion Mining Machine Translation

5 LEAIG TO AK Search Engine BM25,tf*idf, Pageank, esults Evaluate Judges

6 FEATUES Traditional learning to rank models employ hand-crafted features that encode insights about the problem They can often be categorized as: Query-independent or static features e.g., incoming link count, page-rank score Query-dependent or dynamic features e.g., BM25

7 FEATUES Traditional learning to rank models employ hand-crafted features that encode insights about the problem They can often be categorized as: Query-independent or static features e.g., incoming link count, page-rank score More semantic representations in the recent years Query-dependent or dynamic features e.g., BM25

8 TEM EMBEDDIGS FO SEACH estimate relevance estimate relevance Compare query and document directly in the embedding space Use embeddings to generate suitable query expansions

9 APPOACHES Pointwise approach elevance label y q,d is a number derived from binary or graded human judgments or implicit user feedback (e.g., CT). Typically, a regression or Liu [2009] categorizes different LT approaches based on training objectives: classification model is trained to predict y q,d given Ԧx q,d. Pairwise approach Pairwise preference between documents for a query (d i d j w.r.t. q) as label. educes to binary classification to predict more relevant document. Listwise approach (Modern Systems) Directly optimize for rank-based metrics evaluating user satisfaction (more to be discussed later) Tie-Yan Liu. Learning to rank for information retrieval. Foundation and Trends in Information etrieval, 2009.

10 EVALUATIO METICS: PECISIO VS. ECALL etrieved list Ṛ.

11 VISUALIZIG ETIEVAL PEFOMACE: PECISIO-ECALL CUVES List:

12 EVALUATIO METICS: AVEAGE PECISIO List:

13 EVALUATIO METICS: DCG Some documents more relevant than others User receives some gain from each document Discount gain based on rank DCG r 1 G( r) D( r) ormalized discounted cumulative gain DCG DCG OptDCG G( r) rel( r),2 D( r) rel( r), ,,... log ( r) r b

14 EVALUATIO METICS Two categories User-oriented metrics PC(k), DCG(k) System-oriented metrics AP, DCG

15 APPOACHES Pointwise approach elevance label y q,d is a number derived from binary or graded human judgments or implicit user feedback (e.g., CT). Typically, a regression or classification model is trained to predict y q,d given Ԧx q,d. Pairwise approach Pairwise preference between documents for a query (d i d j w.r.t. q) as label. educes to binary classification to predict more relevant document. Listwise approach (Modern Systems) Directly optimize for rank-based metric, such as DCG difficult because these metrics are often not differentiable w.r.t. model parameters. Tie-Yan Liu. Learning to rank for information retrieval. Foundation and Trends in Information etrieval, 2009.

16 PAIWISE OBJECTIVES 0 1 anket loss Pairwise loss function proposed by Burges et al. [2005] an industry favourite [Burges, 2015] pairwise preference Predicted probabilities: p ij = p s i > s j Desired probabilities: pҧ ij = 1 and p ji ҧ = e γ. s i s j score Computing cross-entropy between p and pҧ L anket = pҧ ij. log p ij p ji ҧ. log p ji Use neural network as the model, and gradient descent as the algorithm, to optimize the cross-entropy loss. Chris Burges, Tal Shaked, Erin enshaw, Ari Lazier, Matt Deeds, icole Hamilton, and Greg Hullender. Learning to rank using gradient descent. In ICML, Chris Burges. anket: A ranking retrospective

17 LISTWISE OBJECTIVES Blue: relevant Gray: non-relevant DCG higher for left but pairwise errors less for right Due to strong position-based discounting in I measures, errors at higher ranks are much more problematic than at lower ranks But listwise metrics are non-continuous and non-differentiable [Burges, 2010] Christopher JC Burges. From ranknet to lambdarank to lambdamart: An overview. Learning, 2010.

18 LISTWISE OBJECTIVES: LAMBDAAK Burges et al. [2010] make two observations: 1. To train a model we don t need the costs themselves, only the gradients (of the costs w.r.t model scores) Lambdaank loss Multiply actual gradients with the change in DCG by swapping the rank positions of the two documents 2. It is desired that the gradient be bigger for pairs of documents that produces a bigger impact in DCG by swapping positions Christopher JC Burges. From ranknet to lambdarank to lambdamart: An overview. Learning, 2010.

19 LISTWISE OBJECTIVES: LAMBDAAK Empirically shown to optimize the objective metric Most current learning to rank models based on different variations of the idea Winner of the Yahoo! Learning to ank Challenge Christopher JC Burges. From ranknet to lambdarank to lambdamart: An overview. Learning, 2010.

20 LEAIG TO AK Search Engine BM25,tf*idf, Pageank, esults Many different evaluation metrics Which metric? Evaluate Judges

21 OPTIMIZIG FO A METIC Empirical isk Minimization X evaluates user satisfaction Optimize for X A common misconception! Informative vs. uninformative metrics

22 TAIIG WITH MOE IFOMATIO Train for a more informative metric X : user satisfaction

23 TAIIG WITH MOE IFOMATIO Train for a more informative metric X : user satisfaction Y : more informative than X

24 TAIIG WITH MOE IFOMATIO Train for a more informative metric X : user satisfaction Y : more informative than X Train for Y! Better test set X than training for X!

25 WHAT MAKES A METIC MOE IFOMATIVE? List 1 List 2.. Metrics respond to flips in the same way Some metrics may ignore some flips Metrics sensitive to many flips more informative Metrics weight flips differently Some metrics give too much weight to some flips

26 IFOMATIVEESS AD LEAIG TO AK Evaluation metrics summarize the training data Query Learning Algorithm list d1 d2 d3. d4

27 IFOMATIVEESS AD LEAIG TO AK Evaluation metrics summarize the training data Query Learning Algorithm list d1 d2 d3. d4.

28 IFOMATIVEESS AD LEAIG TO AK Evaluation metrics summarize the training data Query Learning Algorithm list d1 d2 d3. d4. Evaluate

29 IFOMATIVEESS AD LEAIG TO AK Evaluation metrics summarize the training data Query Learning Algorithm list d1 d2 d3. d4???.? Evaluate

30 EVALUATIO METICS AD IFOMATIVEESS For the same part of the ranking Some metrics insensitive to some flips at this part Two lists with identical PC(10) values

31 EVALUATIO METICS AD IFOMATIVEESS Some metrics ignore some parts of ranking Two lists with identical PC(5) and DCG(5) values

32 IFOMATIVEESS OF METICS [YILMAZ AD OBETSO, IJ 10] Quality of a ranked list elevance of documents in the list How much does a metric reduce one s uncertainty in the underlying list? Informative metrics: large reduction in uncertainty on-informative metrics: little or no reduction in uncertainty

33 IFOMATIVEESS OF METICS Evaluation Metric Maximum Entropy Framework elevance Precision-recall Curve Other Metrics Pr(rel rank) 1 Pr() 2 Pr() 3 Pr()

34 SETUP FO AP METIC Goal: Given the average precision value (ap) of a list, infer probability of relevance of document at rank i Maximum entropy setup: Maximize Subject to

35 IFOMATIVEESS OF METICS

36 IFOMATIVEESS OF METICS

37 WHICH EVALUATIO METIC? Which evaluation metric? Informative Metrics: AP, DCG Less Informative Metrics: DCG(10), PC(10) DCG(10) more informative than PC(10) AP vs. DCG AP more informative than DCG regarding binary relevance of documents

38 IFOMATIVEESS AD LEAIG TO AK Hypothesis: Optimizing for a more informative metric Y gives better test set X than optimizing for X directly Learning algorithms Lambdaank Softank (Optimize for smooth versions of metrics) Evaluation Metrics Informative Metrics: AP, DCG Less Informative Metrics: DCG(10), PC(10)

39 TAIIG O DIFFEET METICS: LAMBDAAK Optim. Metric TestMetric AP PC(10) DCG(10) AP * PC(10) DCG * 62.90* DCG(10)

40 TAIIG O DIFFEET METICS: SOFTAK Optim. Metric TestMetric AP PC(10) DCG(10) AP * 63.03* PC(10) DCG * 62.98* DCG(10) * 62.41

41 SUMMAY Be careful about which metric you use! Optimize for informative metrics Similar conclusions in classification Informative metric design Graded Average Precision What is the ultimate metric for learning to rank?

42 CUET ESEACH: TASK BASED I Task: Invest in Bitcoins Find cryptocurrencies exchanges to buy/sell Find trading fee details Taxation rules for crypto Task: Buy a paintbrush ecommend: painting courses Buy painting guides Find combo offers Users use online systems to achieve some real world tasks

43 CUET ESEACH: TASK BASED I Task: Invest in Bitcoins Find cryptocurrencies 25 Page Views per task exchanges to buy/sell Find trading fee details Taxation rules for crypto 44 Minutes per task Task: Buy a paintbrush ecommend: painting courses Buy painting guides Find combo offers 7 Queries per task Users use online systems to achieve some real world tasks Significant effort required using existing systems

44 CUET ESEACH: TASK BASED I Task: Invest in Bitcoins Find cryptocurrencies 25 Page Views per task exchanges to buy/sell Find trading fee details Taxation rules for crypto 44 Minutes per task Task: Buy a paintbrush ecommend: painting courses Buy painting guides Find combo offers 7 Queries per task Devise next generation intelligent online services than can Go beyond the input from the user Automatically detect the task the user trying to achieve Provide the user with contextual task completion assistance

45 ESEACH CHALLEGES FO TASK BASED I SYSTEMS Task extraction/representation Design of task based retrieval interfaces Task based personalization Task based evaluation of retrieval systems

46 Usage logs (questions asked, queries issued, pages viewed, etc.) contain information about tasks users use the online systems for Usage Logs Mine information from usage logs using machine learning techniques to infer the representations of tasks Subtasks Extracting Hierarchies of Search Tasks & Subtasks via a Bayesian onparametric Approach.. Mehrotra and E. Yilmaz. In Proceedings of ACM SIGI Deep Sequential Models for Task Satisfaction Prediction. Mehrotra, E. Yilmaz et al. In Proceedings of ACM CIKM Task Embeddings: Learning Query Embeddings using Task Context.. Mehrotra and E. Yilmaz. In Proceedings of ACM CIKM 2017.

47 MOE O CUET/FUTUE WOK Conversational I system design and evaluation Stance detection (fake news detection) Understanding user behaviour across different devices AI for education Predicting cryptocurrency price change using resources from the web

48 Thank You! Dr. Manisha Verma Dr. Jiyin He Bhaskar Mitra Sahan Bulathwela Dr. ishabh Mehrotra Dr. Shangsong Liang Qiang Zhang Andrew Burnie

WebSci and Learning to Rank for IR

WebSci and Learning to Rank for IR WebSci and Learning to Rank for IR Ernesto Diaz-Aviles L3S Research Center. Hannover, Germany diaz@l3s.de Ernesto Diaz-Aviles www.l3s.de 1/16 Motivation: Information Explosion Ernesto Diaz-Aviles

More information

Information Retrieval

Information Retrieval Information Retrieval Learning to Rank Ilya Markov i.markov@uva.nl University of Amsterdam Ilya Markov i.markov@uva.nl Information Retrieval 1 Course overview Offline Data Acquisition Data Processing Data

More information

Learning to Rank. from heuristics to theoretic approaches. Hongning Wang

Learning to Rank. from heuristics to theoretic approaches. Hongning Wang Learning to Rank from heuristics to theoretic approaches Hongning Wang Congratulations Job Offer from Bing Core Ranking team Design the ranking module for Bing.com CS 6501: Information Retrieval 2 How

More information

Structured Ranking Learning using Cumulative Distribution Networks

Structured Ranking Learning using Cumulative Distribution Networks Structured Ranking Learning using Cumulative Distribution Networks Jim C. Huang Probabilistic and Statistical Inference Group University of Toronto Toronto, ON, Canada M5S 3G4 jim@psi.toronto.edu Brendan

More information

Lizhe Sun. November 17, Florida State University. Ranking in Statistics and Machine Learning. Lizhe Sun. Introduction

Lizhe Sun. November 17, Florida State University. Ranking in Statistics and Machine Learning. Lizhe Sun. Introduction in in Florida State University November 17, 2017 Framework in 1. our life 2. Early work: Model Examples 3. webpage Web page search modeling Data structure Data analysis with machine learning algorithms

More information

Personalized Web Search

Personalized Web Search Personalized Web Search Dhanraj Mavilodan (dhanrajm@stanford.edu), Kapil Jaisinghani (kjaising@stanford.edu), Radhika Bansal (radhika3@stanford.edu) Abstract: With the increase in the diversity of contents

More information

Active Evaluation of Ranking Functions based on Graded Relevance (Extended Abstract)

Active Evaluation of Ranking Functions based on Graded Relevance (Extended Abstract) Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence Active Evaluation of Ranking Functions based on Graded Relevance (Extended Abstract) Christoph Sawade sawade@cs.uni-potsdam.de

More information

Learning to Rank for Information Retrieval. Tie-Yan Liu Lead Researcher Microsoft Research Asia

Learning to Rank for Information Retrieval. Tie-Yan Liu Lead Researcher Microsoft Research Asia Learning to Rank for Information Retrieval Tie-Yan Liu Lead Researcher Microsoft Research Asia 4/20/2008 Tie-Yan Liu @ Tutorial at WWW 2008 1 The Speaker Tie-Yan Liu Lead Researcher, Microsoft Research

More information

Retrieval Evaluation. Hongning Wang

Retrieval Evaluation. Hongning Wang Retrieval Evaluation Hongning Wang CS@UVa What we have learned so far Indexed corpus Crawler Ranking procedure Research attention Doc Analyzer Doc Rep (Index) Query Rep Feedback (Query) Evaluation User

More information

Ranking with Query-Dependent Loss for Web Search

Ranking with Query-Dependent Loss for Web Search Ranking with Query-Dependent Loss for Web Search Jiang Bian 1, Tie-Yan Liu 2, Tao Qin 2, Hongyuan Zha 1 Georgia Institute of Technology 1 Microsoft Research Asia 2 Outline Motivation Incorporating Query

More information

Ranking and Learning. Table of Content. Weighted scoring for ranking Learning to rank: A simple example Learning to ranking as classification.

Ranking and Learning. Table of Content. Weighted scoring for ranking Learning to rank: A simple example Learning to ranking as classification. Table of Content anking and Learning Weighted scoring for ranking Learning to rank: A simple example Learning to ranking as classification 290 UCSB, Tao Yang, 2013 Partially based on Manning, aghavan,

More information

Predictive Analysis: Evaluation and Experimentation. Heejun Kim

Predictive Analysis: Evaluation and Experimentation. Heejun Kim Predictive Analysis: Evaluation and Experimentation Heejun Kim June 19, 2018 Evaluation and Experimentation Evaluation Metrics Cross-Validation Significance Tests Evaluation Predictive analysis: training

More information

UMass at TREC 2017 Common Core Track

UMass at TREC 2017 Common Core Track UMass at TREC 2017 Common Core Track Qingyao Ai, Hamed Zamani, Stephen Harding, Shahrzad Naseri, James Allan and W. Bruce Croft Center for Intelligent Information Retrieval College of Information and Computer

More information

One-Pass Ranking Models for Low-Latency Product Recommendations

One-Pass Ranking Models for Low-Latency Product Recommendations One-Pass Ranking Models for Low-Latency Product Recommendations Martin Saveski @msaveski MIT (Amazon Berlin) One-Pass Ranking Models for Low-Latency Product Recommendations Amazon Machine Learning Team,

More information

S-MART: Novel Tree-based Structured Learning Algorithms Applied to Tweet Entity Linking

S-MART: Novel Tree-based Structured Learning Algorithms Applied to Tweet Entity Linking S-MART: Novel Tree-based Structured Learning Algorithms Applied to Tweet Entity Linking Yi Yang * and Ming-Wei Chang # * Georgia Institute of Technology, Atlanta # Microsoft Research, Redmond Traditional

More information

Learning to Rank for Faceted Search Bridging the gap between theory and practice

Learning to Rank for Faceted Search Bridging the gap between theory and practice Learning to Rank for Faceted Search Bridging the gap between theory and practice Agnes van Belle @ Berlin Buzzwords 2017 Job-to-person search system Generated query Match indicator Faceted search Multiple

More information

Advanced Topics in Information Retrieval. Learning to Rank. ATIR July 14, 2016

Advanced Topics in Information Retrieval. Learning to Rank. ATIR July 14, 2016 Advanced Topics in Information Retrieval Learning to Rank Vinay Setty vsetty@mpi-inf.mpg.de Jannik Strötgen jannik.stroetgen@mpi-inf.mpg.de ATIR July 14, 2016 Before we start oral exams July 28, the full

More information

Improving Recommendations Through. Re-Ranking Of Results

Improving Recommendations Through. Re-Ranking Of Results Improving Recommendations Through Re-Ranking Of Results S.Ashwini M.Tech, Computer Science Engineering, MLRIT, Hyderabad, Andhra Pradesh, India Abstract World Wide Web has become a good source for any

More information

Chapter 8. Evaluating Search Engine

Chapter 8. Evaluating Search Engine Chapter 8 Evaluating Search Engine Evaluation Evaluation is key to building effective and efficient search engines Measurement usually carried out in controlled laboratory experiments Online testing can

More information

Search Engines and Learning to Rank

Search Engines and Learning to Rank Search Engines and Learning to Rank Joseph (Yossi) Keshet Query processor Ranker Cache Forward index Inverted index Link analyzer Indexer Parser Web graph Crawler Representations TF-IDF To get an effective

More information

Predicting Query Performance on the Web

Predicting Query Performance on the Web Predicting Query Performance on the Web No Author Given Abstract. Predicting performance of queries has many useful applications like automatic query reformulation and automatic spell correction. However,

More information

for Searching Social Media Posts

for Searching Social Media Posts Mining the Temporal Statistics of Query Terms for Searching Social Media Posts ICTIR 17 Amsterdam Oct. 1 st 2017 Jinfeng Rao Ferhan Ture Xing Niu Jimmy Lin Task: Ad-hoc Search on Social Media domain Stream

More information

Learning Dense Models of Query Similarity from User Click Logs

Learning Dense Models of Query Similarity from User Click Logs Learning Dense Models of Query Similarity from User Click Logs Fabio De Bona, Stefan Riezler*, Keith Hall, Massi Ciaramita, Amac Herdagdelen, Maria Holmqvist Google Research, Zürich *Dept. of Computational

More information

Supervised Hashing for Image Retrieval via Image Representation Learning

Supervised Hashing for Image Retrieval via Image Representation Learning Supervised Hashing for Image Retrieval via Image Representation Learning Rongkai Xia, Yan Pan, Cong Liu (Sun Yat-Sen University) Hanjiang Lai, Shuicheng Yan (National University of Singapore) Finding Similar

More information

A Machine Learning Approach for Improved BM25 Retrieval

A Machine Learning Approach for Improved BM25 Retrieval A Machine Learning Approach for Improved BM25 Retrieval Krysta M. Svore and Christopher J. C. Burges Microsoft Research One Microsoft Way Redmond, WA 98052 {ksvore,cburges}@microsoft.com Microsoft Research

More information

Apache Solr Learning to Rank FTW!

Apache Solr Learning to Rank FTW! Apache Solr Learning to Rank FTW! Berlin Buzzwords 2017 June 12, 2017 Diego Ceccarelli Software Engineer, News Search dceccarelli4@bloomberg.net Michael Nilsson Software Engineer, Unified Search mnilsson23@bloomberg.net

More information

Fall Lecture 16: Learning-to-rank

Fall Lecture 16: Learning-to-rank Fall 2016 CS646: Information Retrieval Lecture 16: Learning-to-rank Jiepu Jiang University of Massachusetts Amherst 2016/11/2 Credit: some materials are from Christopher D. Manning, James Allan, and Honglin

More information

Advanced Search Techniques for Large Scale Data Analytics Pavel Zezula and Jan Sedmidubsky Masaryk University

Advanced Search Techniques for Large Scale Data Analytics Pavel Zezula and Jan Sedmidubsky Masaryk University Advanced Search Techniques for Large Scale Data Analytics Pavel Zezula and Jan Sedmidubsky Masaryk University http://disa.fi.muni.cz The Cranfield Paradigm Retrieval Performance Evaluation Evaluation Using

More information

Learning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li

Learning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li Learning to Match Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li 1. Introduction The main tasks in many applications can be formalized as matching between heterogeneous objects, including search, recommendation,

More information

1 Machine Learning System Design

1 Machine Learning System Design Machine Learning System Design Prioritizing what to work on: Spam classification example Say you want to build a spam classifier Spam messages often have misspelled words We ll have a labeled training

More information

Modern Retrieval Evaluations. Hongning Wang

Modern Retrieval Evaluations. Hongning Wang Modern Retrieval Evaluations Hongning Wang CS@UVa What we have known about IR evaluations Three key elements for IR evaluation A document collection A test suite of information needs A set of relevance

More information

A Dynamic Bayesian Network Click Model for Web Search Ranking

A Dynamic Bayesian Network Click Model for Web Search Ranking A Dynamic Bayesian Network Click Model for Web Search Ranking Olivier Chapelle and Anne Ya Zhang Apr 22, 2009 18th International World Wide Web Conference Introduction Motivation Clicks provide valuable

More information

Clickthrough Log Analysis by Collaborative Ranking

Clickthrough Log Analysis by Collaborative Ranking Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI-10) Clickthrough Log Analysis by Collaborative Ranking Bin Cao 1, Dou Shen 2, Kuansan Wang 3, Qiang Yang 1 1 Hong Kong

More information

Adapting Ranking Functions to User Preference

Adapting Ranking Functions to User Preference Adapting Ranking Functions to User Preference Keke Chen, Ya Zhang, Zhaohui Zheng, Hongyuan Zha, Gordon Sun Yahoo! {kchen,yazhang,zhaohui,zha,gzsun}@yahoo-inc.edu Abstract Learning to rank has become a

More information

Evaluation. Evaluate what? For really large amounts of data... A: Use a validation set.

Evaluation. Evaluate what? For really large amounts of data... A: Use a validation set. Evaluate what? Evaluation Charles Sutton Data Mining and Exploration Spring 2012 Do you want to evaluate a classifier or a learning algorithm? Do you want to predict accuracy or predict which one is better?

More information

Learning from Click Model and Latent Factor Model for Relevance Prediction Challenge

Learning from Click Model and Latent Factor Model for Relevance Prediction Challenge Learning from Click Model and Latent Factor Model for Relevance Prediction Challenge Botao Hu Hong Kong University of Science and Technology Clear Water Bay, Hong Kong botao@cse.ust.hk Nathan N. Liu Hong

More information

Learning to Rank. Tie-Yan Liu. Microsoft Research Asia CCIR 2011, Jinan,

Learning to Rank. Tie-Yan Liu. Microsoft Research Asia CCIR 2011, Jinan, Learning to Rank Tie-Yan Liu Microsoft Research Asia CCIR 2011, Jinan, 2011.10 History of Web Search Search engines powered by link analysis Traditional text retrieval engines 2011/10/22 Tie-Yan Liu @

More information

High Accuracy Retrieval with Multiple Nested Ranker

High Accuracy Retrieval with Multiple Nested Ranker High Accuracy Retrieval with Multiple Nested Ranker Irina Matveeva University of Chicago 5801 S. Ellis Ave Chicago, IL 60637 matveeva@uchicago.edu Chris Burges Microsoft Research One Microsoft Way Redmond,

More information

Photo Collection Summarization

Photo Collection Summarization Master Thesis Radboud University Photo Collection Summarization Author: Luuk Scholten s4126424 First supervisor/assessor: prof. dr. ir. A.P. de Vries (Arjen) Second assessor: prof. dr. M.A. Larson (Martha)

More information

arxiv: v2 [cs.ir] 4 Dec 2018

arxiv: v2 [cs.ir] 4 Dec 2018 Qingyao Ai CICS, UMass Amherst Amherst, MA, USA aiqy@cs.umass.edu Xuanhui Wang xuanhui@google.com Nadav Golbandi nadavg@google.com arxiv:1811.04415v2 [cs.ir] 4 Dec 2018 ABSTRACT Michael Bendersky bemike@google.com

More information

Session Based Click Features for Recency Ranking

Session Based Click Features for Recency Ranking Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI-10) Session Based Click Features for Recency Ranking Yoshiyuki Inagaki and Narayanan Sadagopan and Georges Dupret and Ciya

More information

Mining the Search Trails of Surfing Crowds: Identifying Relevant Websites from User Activity Data

Mining the Search Trails of Surfing Crowds: Identifying Relevant Websites from User Activity Data Mining the Search Trails of Surfing Crowds: Identifying Relevant Websites from User Activity Data Misha Bilenko and Ryen White presented by Matt Richardson Microsoft Research Search = Modeling User Behavior

More information

arxiv: v1 [cs.ir] 19 Sep 2016

arxiv: v1 [cs.ir] 19 Sep 2016 Enhancing LambdaMART Using Oblivious Trees Marek Modrý 1 and Michal Ferov 2 arxiv:1609.05610v1 [cs.ir] 19 Sep 2016 1 Seznam.cz, Radlická 3294/10, 150 00 Praha 5, Czech Republic marek.modry@firma.seznam.cz

More information

Learning to Rank: A New Technology for Text Processing

Learning to Rank: A New Technology for Text Processing TFANT 07 Tokyo Univ. March 2, 2007 Learning to Rank: A New Technology for Text Processing Hang Li Microsoft Research Asia Talk Outline What is Learning to Rank? Ranking SVM Definition Search Ranking SVM

More information

Deep Learning for Recommender Systems

Deep Learning for Recommender Systems join at Slido.com with #bigdata2018 Deep Learning for Recommender Systems Oliver Gindele @tinyoli oliver.gindele@datatonic.com Big Data Conference Vilnius 28.11.2018 Who is Oliver? + Head of Machine Learning

More information

Model Adaptation via Model Interpolation and Boosting for Web Search Ranking

Model Adaptation via Model Interpolation and Boosting for Web Search Ranking Model Adaptation via Model Interpolation and Boosting for Web Search Ranking Jianfeng Gao *, Qiang Wu *, Chris Burges *, Krysta Svore *, Yi Su #, azan Khan $, Shalin Shah $, Hongyan Zhou $ *Microsoft Research,

More information

Learning to rank, a supervised approach for ranking of documents Master Thesis in Computer Science - Algorithms, Languages and Logic KRISTOFER TAPPER

Learning to rank, a supervised approach for ranking of documents Master Thesis in Computer Science - Algorithms, Languages and Logic KRISTOFER TAPPER Learning to rank, a supervised approach for ranking of documents Master Thesis in Computer Science - Algorithms, Languages and Logic KRISTOFER TAPPER Chalmers University of Technology University of Gothenburg

More information

Data Mining. 3.5 Lazy Learners (Instance-Based Learners) Fall Instructor: Dr. Masoud Yaghini. Lazy Learners

Data Mining. 3.5 Lazy Learners (Instance-Based Learners) Fall Instructor: Dr. Masoud Yaghini. Lazy Learners Data Mining 3.5 (Instance-Based Learners) Fall 2008 Instructor: Dr. Masoud Yaghini Outline Introduction k-nearest-neighbor Classifiers References Introduction Introduction Lazy vs. eager learning Eager

More information

Jan Pedersen 22 July 2010

Jan Pedersen 22 July 2010 Jan Pedersen 22 July 2010 Outline Problem Statement Best effort retrieval vs automated reformulation Query Evaluation Architecture Query Understanding Models Data Sources Standard IR Assumptions Queries

More information

Opinions in Federated Search: University of Lugano at TREC 2014 Federated Web Search Track

Opinions in Federated Search: University of Lugano at TREC 2014 Federated Web Search Track Opinions in Federated Search: University of Lugano at TREC 2014 Federated Web Search Track Anastasia Giachanou 1,IlyaMarkov 2 and Fabio Crestani 1 1 Faculty of Informatics, University of Lugano, Switzerland

More information

CPSC 340: Machine Learning and Data Mining. Multi-Class Classification Fall 2017

CPSC 340: Machine Learning and Data Mining. Multi-Class Classification Fall 2017 CPSC 340: Machine Learning and Data Mining Multi-Class Classification Fall 2017 Assignment 3: Admin Check update thread on Piazza for correct definition of trainndx. This could make your cross-validation

More information

Learning to Rank on Network Data

Learning to Rank on Network Data Learning to Rank on Network Data Majid Yazdani Idiap Research Institute/EPFL 1920 Martigny, Switzerland majid.yazdani@idiap.ch Ronan Collobert Idiap Research Institute 1920 Martigny, Switzerland ronan.collobert@idiap.ch

More information

A Comparative Study of Locality Preserving Projection and Principle Component Analysis on Classification Performance Using Logistic Regression

A Comparative Study of Locality Preserving Projection and Principle Component Analysis on Classification Performance Using Logistic Regression Journal of Data Analysis and Information Processing, 2016, 4, 55-63 Published Online May 2016 in SciRes. http://www.scirp.org/journal/jdaip http://dx.doi.org/10.4236/jdaip.2016.42005 A Comparative Study

More information

Information Retrieval

Information Retrieval Introduction to Information Retrieval Evaluation Rank-Based Measures Binary relevance Precision@K (P@K) Mean Average Precision (MAP) Mean Reciprocal Rank (MRR) Multiple levels of relevance Normalized Discounted

More information

A new click model for relevance prediction in web search

A new click model for relevance prediction in web search A new click model for relevance prediction in web search Alexander Fishkov 1 and Sergey Nikolenko 2,3 1 St. Petersburg State Polytechnical University jetsnguns@gmail.com 2 Steklov Mathematical Institute,

More information

arxiv: v1 [cs.ir] 2 Feb 2015

arxiv: v1 [cs.ir] 2 Feb 2015 Context Models For Web Search Personalization Maksims N. Volkovs University of Toronto 40 St. George Street Toronto, ON M5S 2E4 mvolkovs@cs.toronto.edu arxiv:502.00527v [cs.ir] 2 Feb 205 ABSTRACT We present

More information

arxiv: v2 [cs.ir] 27 Feb 2019

arxiv: v2 [cs.ir] 27 Feb 2019 Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm arxiv:1809.05818v2 [cs.ir] 27 Feb 2019 ABSTRACT Ziniu Hu University of California, Los Angeles, USA bull@cs.ucla.edu Recently a number

More information

Evaluating search engines CE-324: Modern Information Retrieval Sharif University of Technology

Evaluating search engines CE-324: Modern Information Retrieval Sharif University of Technology Evaluating search engines CE-324: Modern Information Retrieval Sharif University of Technology M. Soleymani Fall 2016 Most slides have been adapted from: Profs. Manning, Nayak & Raghavan (CS-276, Stanford)

More information

Analysis on the technology improvement of the library network information retrieval efficiency

Analysis on the technology improvement of the library network information retrieval efficiency Available online www.jocpr.com Journal of Chemical and Pharmaceutical Research, 2014, 6(6):2198-2202 Research Article ISSN : 0975-7384 CODEN(USA) : JCPRC5 Analysis on the technology improvement of the

More information

ASCERTAINING THE RELEVANCE MODEL OF A WEB SEARCH-ENGINE BIPIN SURESH

ASCERTAINING THE RELEVANCE MODEL OF A WEB SEARCH-ENGINE BIPIN SURESH ASCERTAINING THE RELEVANCE MODEL OF A WEB SEARCH-ENGINE BIPIN SURESH Abstract We analyze the factors contributing to the relevance of a web-page as computed by popular industry web search-engines. We also

More information

Final Report: Keyword extraction from text

Final Report: Keyword extraction from text Final Report: Keyword extraction from text Gowtham Rangarajan R and Sainyam Galhotra December 2015 Abstract We have devised a model for tagging stack overflow questions with keywords which help classify

More information

Information Retrieval

Information Retrieval Introduction to Information Retrieval CS276 Information Retrieval and Web Search Chris Manning, Pandu Nayak and Prabhakar Raghavan Evaluation 1 Situation Thanks to your stellar performance in CS276, you

More information

Relevance and Effort: An Analysis of Document Utility

Relevance and Effort: An Analysis of Document Utility Relevance and Effort: An Analysis of Document Utility Emine Yilmaz, Manisha Verma Dept. of Computer Science University College London {e.yilmaz, m.verma}@cs.ucl.ac.uk Nick Craswell, Filip Radlinski, Peter

More information

Part I: Data Mining Foundations

Part I: Data Mining Foundations Table of Contents 1. Introduction 1 1.1. What is the World Wide Web? 1 1.2. A Brief History of the Web and the Internet 2 1.3. Web Data Mining 4 1.3.1. What is Data Mining? 6 1.3.2. What is Web Mining?

More information

A Few Things to Know about Machine Learning for Web Search

A Few Things to Know about Machine Learning for Web Search AIRS 2012 Tianjin, China Dec. 19, 2012 A Few Things to Know about Machine Learning for Web Search Hang Li Noah s Ark Lab Huawei Technologies Talk Outline My projects at MSRA Some conclusions from our research

More information

learning stage (Stage 1), CNNH learns approximate hash codes for training images by optimizing the following loss function:

learning stage (Stage 1), CNNH learns approximate hash codes for training images by optimizing the following loss function: 1 Query-adaptive Image Retrieval by Deep Weighted Hashing Jian Zhang and Yuxin Peng arxiv:1612.2541v2 [cs.cv] 9 May 217 Abstract Hashing methods have attracted much attention for large scale image retrieval.

More information

Automatic Domain Partitioning for Multi-Domain Learning

Automatic Domain Partitioning for Multi-Domain Learning Automatic Domain Partitioning for Multi-Domain Learning Di Wang diwang@cs.cmu.edu Chenyan Xiong cx@cs.cmu.edu William Yang Wang ww@cmu.edu Abstract Multi-Domain learning (MDL) assumes that the domain labels

More information

CS47300: Web Information Search and Management

CS47300: Web Information Search and Management CS47300: Web Information Search and Management Prof. Chris Clifton 27 August 2018 Material adapted from course created by Dr. Luo Si, now leading Alibaba research group 1 AD-hoc IR: Basic Process Information

More information

Search Evaluation. Tao Yang CS293S Slides partially based on text book [CMS] [MRS]

Search Evaluation. Tao Yang CS293S Slides partially based on text book [CMS] [MRS] Search Evaluation Tao Yang CS293S Slides partially based on text book [CMS] [MRS] Table of Content Search Engine Evaluation Metrics for relevancy Precision/recall F-measure MAP NDCG Difficulties in Evaluating

More information

Linking Entities in Tweets to Wikipedia Knowledge Base

Linking Entities in Tweets to Wikipedia Knowledge Base Linking Entities in Tweets to Wikipedia Knowledge Base Xianqi Zou, Chengjie Sun, Yaming Sun, Bingquan Liu, and Lei Lin School of Computer Science and Technology Harbin Institute of Technology, China {xqzou,cjsun,ymsun,liubq,linl}@insun.hit.edu.cn

More information

ECS289: Scalable Machine Learning

ECS289: Scalable Machine Learning ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Sept 22, 2016 Course Information Website: http://www.stat.ucdavis.edu/~chohsieh/teaching/ ECS289G_Fall2016/main.html My office: Mathematical Sciences

More information

arxiv: v1 [cs.ir] 16 Sep 2018

arxiv: v1 [cs.ir] 16 Sep 2018 A Novel Algorithm for Unbiased Learning to Rank arxiv:1809.05818v1 [cs.ir] 16 Sep 2018 ABSTRACT Ziniu Hu University of California, Los Angeles acbull@g.ucla.edu Qu Peng ByteDance Technology pengqu@bytedance.com

More information

TriRank: Review-aware Explainable Recommendation by Modeling Aspects

TriRank: Review-aware Explainable Recommendation by Modeling Aspects TriRank: Review-aware Explainable Recommendation by Modeling Aspects Xiangnan He, Tao Chen, Min-Yen Kan, Xiao Chen National University of Singapore Presented by Xiangnan He CIKM 15, Melbourne, Australia

More information

Experiment Design and Evaluation for Information Retrieval Rishiraj Saha Roy Computer Scientist, Adobe Research Labs India

Experiment Design and Evaluation for Information Retrieval Rishiraj Saha Roy Computer Scientist, Adobe Research Labs India Experiment Design and Evaluation for Information Retrieval Rishiraj Saha Roy Computer Scientist, Adobe Research Labs India rroy@adobe.com 2014 Adobe Systems Incorporated. All Rights Reserved. 1 Introduction

More information

Modeling Query Term Dependencies in Information Retrieval with Markov Random Fields

Modeling Query Term Dependencies in Information Retrieval with Markov Random Fields Modeling Query Term Dependencies in Information Retrieval with Markov Random Fields Donald Metzler metzler@cs.umass.edu W. Bruce Croft croft@cs.umass.edu Department of Computer Science, University of Massachusetts,

More information

Neuron Selectivity as a Biologically Plausible Alternative to Backpropagation

Neuron Selectivity as a Biologically Plausible Alternative to Backpropagation Neuron Selectivity as a Biologically Plausible Alternative to Backpropagation C.J. Norsigian Department of Bioengineering cnorsigi@eng.ucsd.edu Vishwajith Ramesh Department of Bioengineering vramesh@eng.ucsd.edu

More information

CSCI 599: Applications of Natural Language Processing Information Retrieval Retrieval Models (Part 1)"

CSCI 599: Applications of Natural Language Processing Information Retrieval Retrieval Models (Part 1) CSCI 599: Applications of Natural Language Processing Information Retrieval Retrieval Models (Part 1)" All slides Addison Wesley, Donald Metzler, and Anton Leuski, 2008, 2012! Retrieval Models" Provide

More information

A Unified Approach to Learning Task-Specific Bit Vector Representations for Fast Nearest Neighbor Search

A Unified Approach to Learning Task-Specific Bit Vector Representations for Fast Nearest Neighbor Search A Unified Approach to Learning Task-Specific Bit Vector Representations for Fast Nearest Neighbor Search Vinod Nair Yahoo! Labs Bangalore vnair@yahoo-inc.com Dhruv Mahajan Yahoo! Labs Bangalore dkm@yahoo-inc.com

More information

arxiv: v2 [cs.lg] 2 Jun 2016

arxiv: v2 [cs.lg] 2 Jun 2016 arxiv:1511.06411v2 [cs.lg] 2 Jun 2016 Yang Song Dept. of Physics, Tsinghua University, Beijing 100084, China SONGYANG12@MAILS.TSINGHUA.EDU.CN Alexander G. Schwing ASCHWING@CS.TORONTO.EDU Richard S. Zemel

More information

CSCI 599: Applications of Natural Language Processing Information Retrieval Evaluation"

CSCI 599: Applications of Natural Language Processing Information Retrieval Evaluation CSCI 599: Applications of Natural Language Processing Information Retrieval Evaluation" All slides Addison Wesley, Donald Metzler, and Anton Leuski, 2008, 2012! Evaluation" Evaluation is key to building

More information

Search Engines Chapter 8 Evaluating Search Engines Felix Naumann

Search Engines Chapter 8 Evaluating Search Engines Felix Naumann Search Engines Chapter 8 Evaluating Search Engines 9.7.2009 Felix Naumann Evaluation 2 Evaluation is key to building effective and efficient search engines. Drives advancement of search engines When intuition

More information

DS Machine Learning and Data Mining I. Alina Oprea Associate Professor, CCIS Northeastern University

DS Machine Learning and Data Mining I. Alina Oprea Associate Professor, CCIS Northeastern University DS 4400 Machine Learning and Data Mining I Alina Oprea Associate Professor, CCIS Northeastern University January 22 2019 Outline Practical issues in Linear Regression Outliers Categorical variables Lab

More information

CS224W Final Report Emergence of Global Status Hierarchy in Social Networks

CS224W Final Report Emergence of Global Status Hierarchy in Social Networks CS224W Final Report Emergence of Global Status Hierarchy in Social Networks Group 0: Yue Chen, Jia Ji, Yizheng Liao December 0, 202 Introduction Social network analysis provides insights into a wide range

More information

A Survey on Postive and Unlabelled Learning

A Survey on Postive and Unlabelled Learning A Survey on Postive and Unlabelled Learning Gang Li Computer & Information Sciences University of Delaware ligang@udel.edu Abstract In this paper we survey the main algorithms used in positive and unlabeled

More information

Learning-to-rank with Prior Knowledge as Global Constraints

Learning-to-rank with Prior Knowledge as Global Constraints Learning-to-rank with Prior Knowledge as Global Constraints Tiziano Papini and Michelangelo Diligenti 1 Abstract. A good ranking function is the core of any Information Retrieval system. The ranking function

More information

Name of the lecturer Doç. Dr. Selma Ayşe ÖZEL

Name of the lecturer Doç. Dr. Selma Ayşe ÖZEL Y.L. CENG-541 Information Retrieval Systems MASTER Doç. Dr. Selma Ayşe ÖZEL Information retrieval strategies: vector space model, probabilistic retrieval, language models, inference networks, extended

More information

Deep Reinforcement Learning

Deep Reinforcement Learning Deep Reinforcement Learning 1 Outline 1. Overview of Reinforcement Learning 2. Policy Search 3. Policy Gradient and Gradient Estimators 4. Q-prop: Sample Efficient Policy Gradient and an Off-policy Critic

More information

Chapter 27 Introduction to Information Retrieval and Web Search

Chapter 27 Introduction to Information Retrieval and Web Search Chapter 27 Introduction to Information Retrieval and Web Search Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 27 Outline Information Retrieval (IR) Concepts Retrieval

More information

MTTTS17 Dimensionality Reduction and Visualization. Spring 2018 Jaakko Peltonen. Lecture 11: Neighbor Embedding Methods continued

MTTTS17 Dimensionality Reduction and Visualization. Spring 2018 Jaakko Peltonen. Lecture 11: Neighbor Embedding Methods continued MTTTS17 Dimensionality Reduction and Visualization Spring 2018 Jaakko Peltonen Lecture 11: Neighbor Embedding Methods continued This Lecture Neighbor embedding by generative modeling Some supervised neighbor

More information

Data Preprocessing. Slides by: Shree Jaswal

Data Preprocessing. Slides by: Shree Jaswal Data Preprocessing Slides by: Shree Jaswal Topics to be covered Why Preprocessing? Data Cleaning; Data Integration; Data Reduction: Attribute subset selection, Histograms, Clustering and Sampling; Data

More information

Data Mining: Models and Methods

Data Mining: Models and Methods Data Mining: Models and Methods Author, Kirill Goltsman A White Paper July 2017 --------------------------------------------------- www.datascience.foundation Copyright 2016-2017 What is Data Mining? Data

More information

Perceptron: This is convolution!

Perceptron: This is convolution! Perceptron: This is convolution! v v v Shared weights v Filter = local perceptron. Also called kernel. By pooling responses at different locations, we gain robustness to the exact spatial location of image

More information

Learning Ranking Functions with Implicit Feedback

Learning Ranking Functions with Implicit Feedback Learning Ranking Functions with Implicit Feedback CS4780 Machine Learning Fall 2011 Pannaga Shivaswamy Cornell University These slides are built on an earlier set of slides by Prof. Joachims. Current Search

More information

Deep Character-Level Click-Through Rate Prediction for Sponsored Search

Deep Character-Level Click-Through Rate Prediction for Sponsored Search Deep Character-Level Click-Through Rate Prediction for Sponsored Search Bora Edizel - Phd Student UPF Amin Mantrach - Criteo Research Xiao Bai - Oath This work was done at Yahoo and will be presented as

More information

Pre-Requisites: CS2510. NU Core Designations: AD

Pre-Requisites: CS2510. NU Core Designations: AD DS4100: Data Collection, Integration and Analysis Teaches how to collect data from multiple sources and integrate them into consistent data sets. Explains how to use semi-automated and automated classification

More information

Natural Language Processing CS 6320 Lecture 6 Neural Language Models. Instructor: Sanda Harabagiu

Natural Language Processing CS 6320 Lecture 6 Neural Language Models. Instructor: Sanda Harabagiu Natural Language Processing CS 6320 Lecture 6 Neural Language Models Instructor: Sanda Harabagiu In this lecture We shall cover: Deep Neural Models for Natural Language Processing Introduce Feed Forward

More information

Information Retrieval. Information Retrieval and Web Search

Information Retrieval. Information Retrieval and Web Search Information Retrieval and Web Search Introduction to IR models and methods Information Retrieval The indexing and retrieval of textual documents. Searching for pages on the World Wide Web is the most recent

More information

VALLIAMMAI ENGINEERING COLLEGE SRM Nagar, Kattankulathur DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING QUESTION BANK VII SEMESTER

VALLIAMMAI ENGINEERING COLLEGE SRM Nagar, Kattankulathur DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING QUESTION BANK VII SEMESTER VALLIAMMAI ENGINEERING COLLEGE SRM Nagar, Kattankulathur 603 203 DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING QUESTION BANK VII SEMESTER CS6007-INFORMATION RETRIEVAL Regulation 2013 Academic Year 2018

More information

Deep Learning for Embedded Security Evaluation

Deep Learning for Embedded Security Evaluation Deep Learning for Embedded Security Evaluation Emmanuel Prouff 1 1 Laboratoire de Sécurité des Composants, ANSSI, France April 2018, CISCO April 2018, CISCO E. Prouff 1/22 Contents 1. Context and Motivation

More information

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p.

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. 6 What is Web Mining? p. 6 Summary of Chapters p. 8 How

More information