Lecture 3: Improving Ranking with Behavior Data


1 Modeling User Behavior and Interactions Lecture 3: Improving Ranking with Behavior Data 1

2 Lecture 3 Plan Automatic relevance labels Enriching feature space Dealing with data sparseness Dealing with scale Active learning Ranking for diversity 2

3 Review: Learning to Rank 3

4 Ordinal Regression Approaches 4

5 Learning to Rank Summary 5

6 Approaches to Use Behavior Data 6

7 Recap: Available Behavior Data 7

8 Training Examples from Click Data [ Joachims 2002 ] 8
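The training examples Joachims derives from click logs follow the "clicked > skipped above" rule: a clicked result is preferred over every unclicked result ranked above it. A minimal sketch (function and variable names are mine, not from the paper):

```python
def preference_pairs(ranking, clicked):
    """Extract pairwise preference examples from one query's click log.

    Joachims (2002): a clicked result is preferred over each
    unclicked result that was ranked above it.
    Returns (preferred, less_preferred) pairs.
    """
    clicked_set = set(clicked)
    pairs = []
    for i, doc in enumerate(ranking):
        if doc not in clicked_set:
            continue
        for skipped in ranking[:i]:          # results ranked above the click
            if skipped not in clicked_set:   # ...that the user skipped
                pairs.append((doc, skipped))
    return pairs
```

For example, a click on the third result of a four-result page yields the preferences (d3 > d1) and (d3 > d2), while a click on the top result yields no pairs at all.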

9 Loss Function [ Joachims 2002 ] 9

10 Learned Retrieval Function [ Joachims 2002 ] 10

11 Features [ Joachims 2002 ] 11

12 Results [ Joachims 2002 ] Summary: Learned outperforms all base methods in experiment Learning from clickthrough data is possible Relative preferences are useful training data. 12

13 Extension: Query Chains [Radlinski & Joachims, KDD 2005] 13

14 Query Chains (Cont d) [Radlinski & Joachims, KDD 2005] 14

15 Query Chains (Results) [Radlinski & Joachims, KDD 2005] 15

16 Lecture 3 Plan Automatic relevance labels Enriching feature space Dealing with data sparseness Dealing with scale Active learning Ranking for diversity 16

17 Incorporating Behavior for Static Rank [Richardson et al., WWW2006] Pipeline: Web Crawl, Build Index, Answer Queries. Static rank informs each stage: which pages to crawl, efficient index order, and the dynamic ranking. 17

18 fRank: Machine Learning Model [Richardson et al., WWW2006] 18

19 Features: Summary [Richardson et al., WWW2006] 19

20 Features: Popularity [Richardson et al., WWW2006]

Function    Example (for cnn.com/2005/tech/wikipedia.html?v=mobile)
Exact URL   cnn.com/2005/tech/wikipedia.html?v=mobile
No Params   cnn.com/2005/tech/wikipedia.html
Page        wikipedia.html
URL-1       cnn.com/2005/tech
URL-2       cnn.com/2005
Domain      cnn.com
Domain+1    cnn.com/2005

Eugene Agichtein, Emory University, RuSSIR 2009 (Petrozavodsk, Russia) 20
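These generalization levels can be computed mechanically from a URL. A sketch of one way to do it (the function name and parsing details are mine, not from the paper):

```python
from urllib.parse import urlparse

def popularity_keys(url):
    """Compute the URL generalization levels used for popularity counts,
    following the table above (after Richardson et al., WWW2006)."""
    parsed = urlparse(url if "://" in url else "http://" + url)
    host, path, query = parsed.netloc, parsed.path, parsed.query
    segments = path.strip("/").split("/") if path.strip("/") else []
    keys = {
        "exact_url": host + path + (("?" + query) if query else ""),
        "no_params": host + path,
        "page": segments[-1] if segments else host,
        "domain": host,
        "domain+1": host + ("/" + segments[0] if segments else ""),
    }
    for k in (1, 2):  # URL-k: drop the last k path segments
        kept = segments[:-k] if len(segments) > k else []
        keys[f"url-{k}"] = host + "".join("/" + s for s in kept)
    return keys
```

Running it on the slide's example URL reproduces each row of the table, e.g. `popularity_keys("cnn.com/2005/tech/wikipedia.html?v=mobile")["url-1"]` gives `cnn.com/2005/tech`.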

21 Features: Anchor, Page, Domain [Richardson et al., WWW2006] 21

22 Data [Richardson et al., WWW2006] 22

23 Becoming Query Independent [Richardson et al., WWW2006] 23

24 Measure [Richardson et al., WWW2006] pairwise accuracy = |H ∩ S| / |H|, where H is the set of page pairs ordered by the human judges and S is the set of pairs ordered the same way by the static rank. 24
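This measure can be computed directly from judged pairs. A sketch, assuming each ranking is given as a score dictionary (names and the tie-handling choice are mine):

```python
from itertools import combinations

def pairwise_accuracy(human, static):
    """Fraction of human-ordered pairs that the static rank orders the same way.

    `human` and `static` map each page to a score. Only pairs with a
    strict human preference count (the set H); the numerator is the
    subset the static rank also agrees on (H ∩ S).
    """
    agreed = total = 0
    for a, b in combinations(human, 2):
        if human[a] == human[b]:
            continue  # no human preference: pair is not in H
        total += 1
        if (human[a] > human[b]) == (static[a] > static[b]):
            agreed += 1
    return agreed / total
```

For instance, with human scores {a: 3, b: 2, c: 1} and static scores {a: 0.9, b: 0.1, c: 0.5}, two of the three judged pairs agree, so pairwise accuracy is 2/3.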

25 RankNet, Burges et al., ICML 2005 [Richardson et al., WWW2006] Feature Vector Label NN output Error is function of label and output 25

26 RankNet [Burges et al. 2005] [Richardson et al., WWW2006] Feature Vector1 Label1 NN output 1 26

27 RankNet [Burges et al. 2005] [Richardson et al., WWW2006] Feature Vector2 Label2 NN output 1 NN output 2 27

28 RankNet [Burges et al. 2005] [Richardson et al., WWW2006] NN output 1 NN output 2 Error is function of both outputs (Desire output1 > output2) 28
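The error on the two outputs is a cross-entropy on their difference: with s = output1 − output2 and modeled P(doc1 > doc2) = sigmoid(s), the RankNet cost is −P̄·log P − (1 − P̄)·log(1 − P), which simplifies to −P̄·s + log(1 + eˢ). A sketch in that stable form (function name mine):

```python
import math

def ranknet_loss(o1, o2, p_target=1.0):
    """RankNet pairwise cross-entropy (Burges et al., 2005).

    o1, o2: network outputs for the two documents.
    p_target: desired probability that doc1 ranks above doc2
    (1.0 when we want output1 > output2).
    """
    s = o1 - o2
    # -P̄ log P - (1 - P̄) log(1 - P) with P = sigmoid(s),
    # rewritten in the numerically stable form -P̄·s + log(1 + e^s)
    return -p_target * s + math.log1p(math.exp(s))
```

When the two outputs are equal the loss is log 2, and it shrinks toward 0 as output1 pulls ahead of output2, which is exactly the "desire output1 > output2" condition on the slide.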

29 RankNet [Burges et al. 2005] [Richardson et al., WWW2006] Feature Vector1 NN output 29

30 Experimental Methodology [Richardson et al., WWW2006] 30

31 Accuracy of Each Feature Set [Richardson et al., WWW2006] Feature sets compared, by accuracy (%), with the number of features in parentheses: PageRank (1), Popularity (14), Anchor (5), Page (8), Domain (7), and All. 31

32 Qualitative Evaluation [Richardson et al., WWW2006] Technology Oriented vs. Consumer Oriented 32

33 Behavior for Dynamic Ranking [Agichtein et al., SIGIR2006] Sample behavior features (from Lecture 2):

Presentation:
  ResultPosition: position of the URL in the current ranking
  QueryTitleOverlap: fraction of query terms in the result title
Clickthrough:
  DeliberationTime: seconds between query and first click
  ClickFrequency: fraction of all clicks landing on the page
  ClickDeviation: deviation from expected click frequency
Browsing:
  DwellTime: result page dwell time
  DwellTimeDeviation: deviation from expected dwell time for the query

33

34 Feature Merging: Details [Agichtein et al., SIGIR2006] Query: SIGIR, fake results with fake feature values

Result URL             BM25   PageRank   Clicks   DwellTime
sigir2007.org           ??
sigir2006.org
acm.org/sigs/sigir/

34

35 Results for Incorporating Behavior into Ranking [Agichtein et al., SIGIR2006] NDCG at K for RN, Rerank-All, and RN+All. MAP gains: RN vs. RN+All: +19.13%; BM25 vs. BM25+All: +23.71%. 35

36 Which Queries Benefit Most [Agichtein et al., SIGIR2006] (Histogram: query frequency vs. average gain.) Most gains are for queries with poor original ranking. 36

37 Lecture 3 Plan Automatic relevance labels Enriching feature space Dealing with data sparseness Dealing with scale Active learning Ranking for diversity Fun and games 37

38 Extension to Unseen Queries/Documents: Search Trails [Bilenko and White, WWW 2008] Trails start with a search engine query and continue until a terminating event: another search; a visit to an unrelated site (social networks, webmail); a timeout, the browser homepage, or the browser closing. 38

39 Probabilistic Model [Bilenko and White, WWW 2008] IR via language modeling [Zhai & Lafferty; Lavrenko]. The query-term distribution gives more mass to rare terms; these are combined with term-website weights. 39
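One way such a query-term distribution can be built is with IDF-style weights, so that rare terms receive more of the probability mass. This is an illustrative stand-in, not the paper's exact formula, and the names are mine:

```python
import math

def query_term_weights(query_terms, doc_freq, n_docs):
    """Normalized IDF-style query-term distribution.

    Rarer terms (lower document frequency) get proportionally
    more mass. Illustrative only; not the Bilenko-White formula.
    """
    raw = {t: math.log(n_docs / (1 + doc_freq.get(t, 0))) for t in query_terms}
    z = sum(raw.values())
    return {t: w / z for t, w in raw.items()}
```

For a query like "the ranknet" against a collection where "the" is ubiquitous and "ranknet" rare, almost all of the mass lands on "ranknet", which is the behavior the slide describes.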

40 Results: Learning to Rank [Bilenko and White, WWW 2008] Add ( ) as a feature to RankNet. Conditions compared on NDCG@1, NDCG@3, and NDCG@10: Baseline, Baseline+Heuristic, Baseline+Probabilistic, Baseline+Probabilistic+RW. 40

41 Scalability: (Peta?)bytes of Click Data BBM: Bayesian Browsing Model from Petabyte-scale Data, Liu et al., KDD 2009. For a query with results URL 1..4: S_i = relevance, E_i = examine snippet, C_i = clickthrough. 41

42 Training BBM: One-Pass Counting BBM: Bayesian Browsing Model from Petabyte-scale Data, Liu et al, KDD 2009 Find R j 42

43 Training BBM on MapReduce BBM: Bayesian Browsing Model from Petabyte-scale Data, Liu et al, KDD 2009 Map: emit((q,u), idx) Reduce: construct the count vector 43
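The Map and Reduce steps on the slide can be mimicked in a few lines. This sketch only illustrates the emit/aggregate shape; the real BBM count-vector indexing over click configurations is more involved:

```python
from collections import defaultdict

def mapreduce_counts(sessions):
    """One-pass counting in the Map/Reduce shape of the slide.

    Each session record is (query, url, idx), where idx stands in for
    the click/position configuration the count vector is indexed by.
    """
    # Map: emit ((q, u), idx) for every observed record
    emitted = [((q, u), idx) for (q, u, idx) in sessions]
    # Shuffle + Reduce: build a count vector per (q, u) key
    counts = defaultdict(lambda: defaultdict(int))
    for key, idx in emitted:
        counts[key][idx] += 1
    return {key: dict(vec) for key, vec in counts.items()}
```

Each (query, url) pair ends up with its own count vector, which is all the reducer needs to compute the posterior for R_j in a single pass.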

44 Model Comparison on Efficiency BBM: Bayesian Browsing Model from Petabyte-scale Data, Liu et al., KDD 2009. ( ) times faster. 44

45 Large-Scale Experiment BBM: Bayesian Browsing Model from Petabyte-scale Data, Liu et al., KDD 2009. Setup: 8 weeks of data, 8 jobs; job k takes the first k weeks of data. Experiment platform SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets [Chaiken et al., VLDB 08]. 45

46 Scalability of BBM BBM: Bayesian Browsing Model from Petabyte-scale Data, Liu et al., KDD 2009. Increasing computation load (more queries, more URLs, more impressions) with near-constant elapsed time: about 3 hours to scan 265 terabytes of data and compute full posteriors for 1.15 billion (query, url) pairs. (Chart: computation load vs. elapsed time on SCOPE.) 46

47 Lecture 3 Plan Automatic relevance labels Enriching feature space Dealing with data sparseness Dealing with Scale Active learning Ranking for diversity 47

48 New Direction: Active Learning [Radlinski & Joachims, KDD 2007] Goal: learn the relevances with as little training data as possible. Search involves a three-step process: 1. Given a query, present a ranking of results. 2. Given a ranking, users provide feedback: user clicks provide pairwise relevance judgments. 3. Given feedback, update the relevance estimates. 48

49 Overview of Approach [Radlinski & Joachims, KDD 2007] Available information: 1. Have an estimate of the relevance of each result. 2. Can obtain pairwise comparisons of the top few results. 3. Do not have absolute relevance information. Goal: learn the document relevances quickly. Will address four questions: 1. How to represent knowledge about document relevance. 2. How to maintain this knowledge as we collect data. 3. Given our knowledge, what is the best ranking? 4. What rankings do we show users to get useful data? 49

50 1: Representing Document Relevance [Radlinski & Joachims, KDD 2007] Given a query q, let M = (µ_1, ..., µ_C) be the true relevance values of the documents. Model knowledge of M with a Bayesian posterior: P(M | D) = P(D | M) P(M) / P(D). Assume P(M | D) is a spherical multivariate normal: P(M | D) = N(ν_1, ..., ν_C; σ_1^2, ..., σ_C^2). 50

51 1: Representing Document Relevance [Radlinski & Joachims, KDD 2007] Given a fixed query, maintain knowledge about relevance as clicks are observed. This tells us which documents we are sure about, and which ones need more data. 51

52 2: Maintaining P(M | D) [Radlinski & Joachims, KDD 2007] Model noisy pairwise judgments with the Bradley-Terry model [Bradley-Terry 1952]. Adding a Gaussian prior, apply an off-the-shelf algorithm, commonly used for chess ratings, to maintain the posterior [Glickman 1999]. 52
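Glickman's chess-rating algorithm also maintains a variance per player; as a simpler illustration, here is an Elo-style update of the means alone under the Bradley-Terry likelihood (function name and learning rate are mine):

```python
import math

def bt_update(mu_winner, mu_loser, lr=0.1):
    """Process one noisy pairwise judgment: `winner` preferred over `loser`.

    Bradley-Terry: P(winner beats loser) = sigmoid(mu_w - mu_l).
    One gradient-ascent step on the log-likelihood nudges both means.
    """
    p_win = 1.0 / (1.0 + math.exp(-(mu_winner - mu_loser)))
    step = lr * (1.0 - p_win)   # surprising outcomes cause larger updates
    return mu_winner + step, mu_loser - step
```

Repeated pairwise observations drive the relevance means apart in proportion to how surprising each observed preference is, which is the behavior the posterior-maintenance step relies on.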

53 3: Ranking (Inference) [Radlinski & Joachims, KDD 2007] Want to assign relevances M̂ = (µ̂_1, ..., µ̂_C) such that the loss L(M̂, M) is small, but M is unknown. Minimize the expected (pairwise) loss. 53

54 4: Getting Useful Data [Radlinski & Joachims, KDD 2007] Naive approach: could present the ranking based on the best estimate of relevance. Then the data we get would always be about the documents already ranked highly. Instead, the ranking shown to users: 1. Pick the top two docs to minimize ( ). 2. Append the current best-estimate ranking. 54

55 4: Exploration Strategies [Radlinski & Joachims, KDD 2007] 55

56 4: Loss Functions [Radlinski & Joachims, KDD 2007] 56

57 Results: TREC Data [Radlinski & Joachims, KDD 2007] Optimizing for ( ) better than for ( ). 57

58 Need for Diversity (in IR) [Predicting Diverse Subsets Using Structural SVMs, Y. Yue and Joachims, ICML 2008] Ambiguous queries: users with different information needs issuing the same textual query ("Jaguar"). Informational (exploratory) queries: user interested in a specific detail or the entire breadth of knowledge available [Swaminathan et al., 2008]. Want results with high information diversity. 58

59 Optimizing for Diversity [Predicting Diverse Subsets Using Structural SVMs, Y. Yue and Joachims, ICML 2008] Long interest in the IR community, but impossible given current learning-to-rank methods; problem: no consensus on how to measure diversity. Formulate as predicting diverse subsets. Experiment: use training data with explicitly labeled subtopics (TREC 6-8 Interactive Track); use a loss function to encode subtopic loss; train using structural SVMs [Tsochantaridis et al., 2005]. 59

60 Representing Diversity [Predicting Diverse Subsets Using Structural SVMs, Y. Yue and Joachims, ICML 2008] Existing datasets with manual subtopic labels, e.g., "Use of robots in the world today": nanorobots, space mission robots, underwater robots. A manual partitioning of the total information regarding a query; relatively reliable. 60

61 Example [Predicting Diverse Subsets Using Structural SVMs, Y. Yue and Joachims, ICML 2008] Choose K documents with maximal information coverage. For K = 3, optimal set is {D1, D2, D10} 61

62 Maximizing Subtopic Coverage [Predicting Diverse Subsets Using Structural SVMs, Y. Yue and Joachims, ICML 2008] Select K documents which collectively cover as many subtopics as possible. Perfect selection takes (n choose K) time: this is a set cover problem. Greedy selection gives a (1-1/e)-approximation bound, a special case of Max Coverage (Khuller et al., 1997). 62

63 Weighted Word Coverage [Predicting Diverse Subsets Using Structural SVMs, Y. Yue and Joachims, ICML 2008] More distinct words = more information; weight word importance (does not depend on human labels). Select K documents which collectively cover as many distinct (weighted) words as possible. Greedy selection also yields the (1-1/e) bound. Need to find a good weighting function (the learning problem). 63
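The greedy selection described above can be sketched directly; the document sets and weights in the usage below are hypothetical, since the slide's exact word-to-document assignment is not recoverable here:

```python
def greedy_coverage(docs, word_benefit, k):
    """Greedy weighted word coverage; achieves the (1 - 1/e) bound.

    docs: {doc_id: set of words it contains}
    word_benefit: {word: importance weight}
    Each round picks the doc whose uncovered words add the most benefit.
    """
    covered, chosen = set(), []
    for _ in range(min(k, len(docs))):
        best = max((d for d in docs if d not in chosen),
                   key=lambda d: sum(word_benefit[w] for w in docs[d] - covered))
        chosen.append(best)
        covered |= docs[best]  # these words contribute nothing from now on
    return chosen
```

With word benefits V1=1, ..., V5=5 and documents D1={V4,V5}, D2={V1,V2}, D3={V2,V3}, the greedy pass selects D1 first (marginal benefit 9) and then D3 (marginal benefit 5), mirroring the iteration-by-iteration recomputation of marginal benefits in the example slides.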

64 Example [Predicting Diverse Subsets Using Structural SVMs, Y. Yue and Joachims, ICML 2008] Word benefits: V1 = 1, V2 = 2, V3 = 3, V4 = 4, V5 = 5. Documents D1 and D2 each contain three of the words; D3 contains four. Greedy iteration 1: D1 has the largest marginal benefit and is selected. 64

65 Example (cont'd) [Predicting Diverse Subsets Using Structural SVMs, Y. Yue and Joachims, ICML 2008] After D1 is selected, marginal benefits are recomputed (words already covered contribute nothing); greedy iteration 2 selects D3. 65

66 Experiments [Predicting Diverse Subsets Using Structural SVMs, Y. Yue and Joachims, ICML 2008] 12/4/1 train/validation/test split, approx. 500 documents in the training set; permuted until all 17 queries were tested once. Set K = 5 (some queries have very few documents). SVM-div uses term-frequency thresholds to define importance levels; SVM-div2 additionally uses TF-IDF thresholds. 66

67 Results [Predicting Diverse Subsets Using Structural SVMs, Y. Yue and Joachims, ICML 2008] Methods compared by subtopic loss: Random, Okapi, Unweighted Model, Essential Pages, SVM-div, SVM-div2.

Comparison                   W / T / L
SVM-div vs. Essential Pages  14 / 0 / 3 **
SVM-div2 vs. Essential Pages 13 / 0 / 4
SVM-div vs. SVM-div2          9 / 6 / 2

67

68 [Predicting Diverse Subsets Using Structural SVMs, Y. Yue and Joachims, ICML 2008] Can expect further benefit from having more training data. 68

69 Summary: Predicting Diverse Subsets Using Structural SVMs [Y. Yue and Joachims, ICML 2008] Formulated diversified retrieval as predicting diverse subsets, with efficient training and prediction algorithms. Used weighted word coverage as a proxy for information coverage. Encoded the diversity criteria using a loss function (weighted subtopic loss). 69

70 Lecture 3 Summary Automatic relevance labels Enriching feature space Dealing with data sparseness Dealing with Scale Active learning Ranking for diversity 70

71 Key References and Further Reading
Joachims, T., "Optimizing Search Engines using Clickthrough Data", KDD 2002
Agichtein, E., Brill, E., and Dumais, S., "Improving Web Search Ranking by Incorporating User Behavior Information", SIGIR 2006
Radlinski, F. and Joachims, T., "Query Chains: Learning to Rank from Implicit Feedback", KDD 2005
Radlinski, F. and Joachims, T., "Active Exploration for Learning Rankings from Clickthrough Data", KDD 2007
Richardson, M., Prakash, A., and Brill, E., "Beyond PageRank: Machine Learning for Static Ranking", WWW 2006
Bilenko, M. and White, R., "Mining the Search Trails of Surfing Crowds: Identifying Relevant Websites from User Activity Data", WWW 2008
Yue, Y. and Joachims, T., "Predicting Diverse Subsets Using Structural SVMs", ICML 2008
Liu, C., Guo, F., and Faloutsos, C., "BBM: Bayesian Browsing Model from Petabyte-scale Data", KDD 2009
71


More information

Apache Solr Learning to Rank FTW!

Apache Solr Learning to Rank FTW! Apache Solr Learning to Rank FTW! Berlin Buzzwords 2017 June 12, 2017 Diego Ceccarelli Software Engineer, News Search dceccarelli4@bloomberg.net Michael Nilsson Software Engineer, Unified Search mnilsson23@bloomberg.net

More information

Information Retrieval

Information Retrieval Introduction to Information Retrieval Lecture 5: Evaluation Ruixuan Li http://idc.hust.edu.cn/~rxli/ Sec. 6.2 This lecture How do we know if our results are any good? Evaluating a search engine Benchmarks

More information

IMAGE RETRIEVAL SYSTEM: BASED ON USER REQUIREMENT AND INFERRING ANALYSIS TROUGH FEEDBACK

IMAGE RETRIEVAL SYSTEM: BASED ON USER REQUIREMENT AND INFERRING ANALYSIS TROUGH FEEDBACK IMAGE RETRIEVAL SYSTEM: BASED ON USER REQUIREMENT AND INFERRING ANALYSIS TROUGH FEEDBACK 1 Mount Steffi Varish.C, 2 Guru Rama SenthilVel Abstract - Image Mining is a recent trended approach enveloped in

More information

THIS LECTURE. How do we know if our results are any good? Results summaries: Evaluating a search engine. Making our good results usable to a user

THIS LECTURE. How do we know if our results are any good? Results summaries: Evaluating a search engine. Making our good results usable to a user EVALUATION Sec. 6.2 THIS LECTURE How do we know if our results are any good? Evaluating a search engine Benchmarks Precision and recall Results summaries: Making our good results usable to a user 2 3 EVALUATING

More information

Introduction to Information Retrieval. Hongning Wang

Introduction to Information Retrieval. Hongning Wang Introduction to Information Retrieval Hongning Wang CS@UVa What is information retrieval? 2 Why information retrieval Information overload It refers to the difficulty a person can have understanding an

More information

Information Retrieval

Information Retrieval Introduction to Information Retrieval CS276 Information Retrieval and Web Search Chris Manning, Pandu Nayak and Prabhakar Raghavan Evaluation 1 Situation Thanks to your stellar performance in CS276, you

More information

Scalable Diversified Ranking on Large Graphs

Scalable Diversified Ranking on Large Graphs 211 11th IEEE International Conference on Data Mining Scalable Diversified Ranking on Large Graphs Rong-Hua Li Jeffrey Xu Yu The Chinese University of Hong ong, Hong ong {rhli,yu}@se.cuhk.edu.hk Abstract

More information

08 An Introduction to Dense Continuous Robotic Mapping

08 An Introduction to Dense Continuous Robotic Mapping NAVARCH/EECS 568, ROB 530 - Winter 2018 08 An Introduction to Dense Continuous Robotic Mapping Maani Ghaffari March 14, 2018 Previously: Occupancy Grid Maps Pose SLAM graph and its associated dense occupancy

More information

Automatic Identification of User Goals in Web Search [WWW 05]

Automatic Identification of User Goals in Web Search [WWW 05] Automatic Identification of User Goals in Web Search [WWW 05] UichinLee @ UCLA ZhenyuLiu @ UCLA JunghooCho @ UCLA Presenter: Emiran Curtmola@ UC San Diego CSE 291 4/29/2008 Need to improve the quality

More information

COMP6237 Data Mining Searching and Ranking

COMP6237 Data Mining Searching and Ranking COMP6237 Data Mining Searching and Ranking Jonathon Hare jsh2@ecs.soton.ac.uk Note: portions of these slides are from those by ChengXiang Cheng Zhai at UIUC https://class.coursera.org/textretrieval-001

More information

Structured Learning of Two-Level Dynamic Rankings

Structured Learning of Two-Level Dynamic Rankings Structured earning of Two-evel Dynamic Rankings Karthik Raman Dept. of Computer Science Cornell University Ithaca NY 4850 USA karthik@cs.cornell.edu Thorsten Joachims Dept. of Computer Science Cornell

More information

Representation/Indexing (fig 1.2) IR models - overview (fig 2.1) IR models - vector space. Weighting TF*IDF. U s e r. T a s k s

Representation/Indexing (fig 1.2) IR models - overview (fig 2.1) IR models - vector space. Weighting TF*IDF. U s e r. T a s k s Summary agenda Summary: EITN01 Web Intelligence and Information Retrieval Anders Ardö EIT Electrical and Information Technology, Lund University March 13, 2013 A Ardö, EIT Summary: EITN01 Web Intelligence

More information

A User Preference Based Search Engine

A User Preference Based Search Engine A User Preference Based Search Engine 1 Dondeti Swedhan, 2 L.N.B. Srinivas 1 M-Tech, 2 M-Tech 1 Department of Information Technology, 1 SRM University Kattankulathur, Chennai, India Abstract - In this

More information

Where we are. Exploratory Graph Analysis (40 min) Focused Graph Mining (40 min) Refinement of Query Results (40 min)

Where we are. Exploratory Graph Analysis (40 min) Focused Graph Mining (40 min) Refinement of Query Results (40 min) Where we are Background (15 min) Graph models, subgraph isomorphism, subgraph mining, graph clustering Eploratory Graph Analysis (40 min) Focused Graph Mining (40 min) Refinement of Query Results (40 min)

More information

Demystifying movie ratings 224W Project Report. Amritha Raghunath Vignesh Ganapathi Subramanian

Demystifying movie ratings 224W Project Report. Amritha Raghunath Vignesh Ganapathi Subramanian Demystifying movie ratings 224W Project Report Amritha Raghunath (amrithar@stanford.edu) Vignesh Ganapathi Subramanian (vigansub@stanford.edu) 9 December, 2014 Introduction The past decade or so has seen

More information

UMass at TREC 2017 Common Core Track

UMass at TREC 2017 Common Core Track UMass at TREC 2017 Common Core Track Qingyao Ai, Hamed Zamani, Stephen Harding, Shahrzad Naseri, James Allan and W. Bruce Croft Center for Intelligent Information Retrieval College of Information and Computer

More information

A Machine Learning Approach for Improved BM25 Retrieval

A Machine Learning Approach for Improved BM25 Retrieval A Machine Learning Approach for Improved BM25 Retrieval Krysta M. Svore and Christopher J. C. Burges Microsoft Research One Microsoft Way Redmond, WA 98052 {ksvore,cburges}@microsoft.com Microsoft Research

More information

Diversification of Query Interpretations and Search Results

Diversification of Query Interpretations and Search Results Diversification of Query Interpretations and Search Results Advanced Methods of IR Elena Demidova Materials used in the slides: Charles L.A. Clarke, Maheedhar Kolla, Gordon V. Cormack, Olga Vechtomova,

More information

LETOR: Benchmark Dataset for Research on Learning to Rank for Information Retrieval Tie-Yan Liu 1, Jun Xu 1, Tao Qin 2, Wenying Xiong 3, and Hang Li 1

LETOR: Benchmark Dataset for Research on Learning to Rank for Information Retrieval Tie-Yan Liu 1, Jun Xu 1, Tao Qin 2, Wenying Xiong 3, and Hang Li 1 LETOR: Benchmark Dataset for Research on Learning to Rank for Information Retrieval Tie-Yan Liu 1, Jun Xu 1, Tao Qin 2, Wenying Xiong 3, and Hang Li 1 1 Microsoft Research Asia, No.49 Zhichun Road, Haidian

More information

Information Retrieval

Information Retrieval Multimedia Computing: Algorithms, Systems, and Applications: Information Retrieval and Search Engine By Dr. Yu Cao Department of Computer Science The University of Massachusetts Lowell Lowell, MA 01854,

More information

CS6322: Information Retrieval Sanda Harabagiu. Lecture 13: Evaluation

CS6322: Information Retrieval Sanda Harabagiu. Lecture 13: Evaluation Sanda Harabagiu Lecture 13: Evaluation Sec. 6.2 This lecture How do we know if our results are any good? Evaluating a search engine Benchmarks Precision and recall Results summaries: Making our good results

More information

CS54701: Information Retrieval

CS54701: Information Retrieval CS54701: Information Retrieval Basic Concepts 19 January 2016 Prof. Chris Clifton 1 Text Representation: Process of Indexing Remove Stopword, Stemming, Phrase Extraction etc Document Parser Extract useful

More information

Capturing User Interests by Both Exploitation and Exploration

Capturing User Interests by Both Exploitation and Exploration Capturing User Interests by Both Exploitation and Exploration Ka Cheung Sia 1, Shenghuo Zhu 2, Yun Chi 2, Koji Hino 2, and Belle L. Tseng 2 1 University of California, Los Angeles, CA 90095, USA kcsia@cs.ucla.edu

More information

An Exploration of Query Term Deletion

An Exploration of Query Term Deletion An Exploration of Query Term Deletion Hao Wu and Hui Fang University of Delaware, Newark DE 19716, USA haowu@ece.udel.edu, hfang@ece.udel.edu Abstract. Many search users fail to formulate queries that

More information

IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, ISSN:

IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, ISSN: IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, 20131 Improve Search Engine Relevance with Filter session Addlin Shinney R 1, Saravana Kumar T

More information

CS 347 Parallel and Distributed Data Processing

CS 347 Parallel and Distributed Data Processing CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 12: Distributed Information Retrieval CS 347 Notes 12 2 CS 347 Notes 12 3 CS 347 Notes 12 4 CS 347 Notes 12 5 Web Search Engine Crawling

More information

CS 347 Parallel and Distributed Data Processing

CS 347 Parallel and Distributed Data Processing CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 12: Distributed Information Retrieval CS 347 Notes 12 2 CS 347 Notes 12 3 CS 347 Notes 12 4 Web Search Engine Crawling Indexing Computing

More information

Fighting Search Engine Amnesia: Reranking Repeated Results

Fighting Search Engine Amnesia: Reranking Repeated Results Fighting Search Engine Amnesia: Reranking Repeated Results ABSTRACT Milad Shokouhi Microsoft milads@microsoft.com Paul Bennett Microsoft Research pauben@microsoft.com Web search engines frequently show

More information

Modern Retrieval Evaluations. Hongning Wang

Modern Retrieval Evaluations. Hongning Wang Modern Retrieval Evaluations Hongning Wang CS@UVa What we have known about IR evaluations Three key elements for IR evaluation A document collection A test suite of information needs A set of relevance

More information

COMP 551 Applied Machine Learning Lecture 13: Unsupervised learning

COMP 551 Applied Machine Learning Lecture 13: Unsupervised learning COMP 551 Applied Machine Learning Lecture 13: Unsupervised learning Associate Instructor: Herke van Hoof (herke.vanhoof@mail.mcgill.ca) Slides mostly by: (jpineau@cs.mcgill.ca) Class web page: www.cs.mcgill.ca/~jpineau/comp551

More information

Detecting Multilingual and Multi-Regional Query Intent in Web Search

Detecting Multilingual and Multi-Regional Query Intent in Web Search Detecting Multilingual and Multi-Regional Query Intent in Web Search Yi Chang, Ruiqiang Zhang, Srihari Reddy Yahoo! Labs 701 First Avenue Sunnyvale, CA 94089 {yichang,ruiqiang,sriharir}@yahoo-inc.com Yan

More information

Information Retrieval. Lecture 7

Information Retrieval. Lecture 7 Information Retrieval Lecture 7 Recap of the last lecture Vector space scoring Efficiency considerations Nearest neighbors and approximations This lecture Evaluating a search engine Benchmarks Precision

More information

Combining Implicit and Explicit Topic Representations for Result Diversification

Combining Implicit and Explicit Topic Representations for Result Diversification Combining Implicit and Explicit Topic Representations for Result Diversification Jiyin He J.He@cwi.nl Vera Hollink V.Hollink@cwi.nl Centrum Wiskunde en Informatica Science Park 123, 1098XG Amsterdam, the

More information

Random projection for non-gaussian mixture models

Random projection for non-gaussian mixture models Random projection for non-gaussian mixture models Győző Gidófalvi Department of Computer Science and Engineering University of California, San Diego La Jolla, CA 92037 gyozo@cs.ucsd.edu Abstract Recently,

More information

A new click model for relevance prediction in web search

A new click model for relevance prediction in web search A new click model for relevance prediction in web search Alexander Fishkov 1 and Sergey Nikolenko 2,3 1 St. Petersburg State Polytechnical University jetsnguns@gmail.com 2 Steklov Mathematical Institute,

More information

Analysis of Trail Algorithms for User Search Behavior

Analysis of Trail Algorithms for User Search Behavior Analysis of Trail Algorithms for User Search Behavior Surabhi S. Golechha, Prof. R.R. Keole Abstract Web log data has been the basis for analyzing user query session behavior for a number of years. Web

More information

In the Mood to Click? Towards Inferring Receptiveness to Search Advertising

In the Mood to Click? Towards Inferring Receptiveness to Search Advertising In the Mood to Click? Towards Inferring Receptiveness to Search Advertising Qi Guo Eugene Agichtein Mathematics & Computer Science Department Emory University Atlanta, USA {qguo3,eugene}@mathcs.emory.edu

More information

Jan Pedersen 22 July 2010

Jan Pedersen 22 July 2010 Jan Pedersen 22 July 2010 Outline Problem Statement Best effort retrieval vs automated reformulation Query Evaluation Architecture Query Understanding Models Data Sources Standard IR Assumptions Queries

More information

Session Based Click Features for Recency Ranking

Session Based Click Features for Recency Ranking Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI-10) Session Based Click Features for Recency Ranking Yoshiyuki Inagaki and Narayanan Sadagopan and Georges Dupret and Ciya

More information

Introduction to Information Retrieval

Introduction to Information Retrieval Introduction to Information Retrieval http://informationretrieval.org IIR 6: Flat Clustering Wiltrud Kessler & Hinrich Schütze Institute for Natural Language Processing, University of Stuttgart 0-- / 83

More information

Information Retrieval May 15. Web retrieval

Information Retrieval May 15. Web retrieval Information Retrieval May 15 Web retrieval What s so special about the Web? The Web Large Changing fast Public - No control over editing or contents Spam and Advertisement How big is the Web? Practically

More information

arxiv: v1 [cs.ir] 2 Feb 2015

arxiv: v1 [cs.ir] 2 Feb 2015 Context Models For Web Search Personalization Maksims N. Volkovs University of Toronto 40 St. George Street Toronto, ON M5S 2E4 mvolkovs@cs.toronto.edu arxiv:502.00527v [cs.ir] 2 Feb 205 ABSTRACT We present

More information

Information Integration of Partially Labeled Data

Information Integration of Partially Labeled Data Information Integration of Partially Labeled Data Steffen Rendle and Lars Schmidt-Thieme Information Systems and Machine Learning Lab, University of Hildesheim srendle@ismll.uni-hildesheim.de, schmidt-thieme@ismll.uni-hildesheim.de

More information

Learn from Web Search Logs to Organize Search Results

Learn from Web Search Logs to Organize Search Results Learn from Web Search Logs to Organize Search Results Xuanhui Wang Department of Computer Science University of Illinois at Urbana-Champaign Urbana, IL 61801 xwang20@cs.uiuc.edu ChengXiang Zhai Department

More information

WCL2R: A Benchmark Collection for Learning to Rank Research with Clickthrough Data

WCL2R: A Benchmark Collection for Learning to Rank Research with Clickthrough Data WCL2R: A Benchmark Collection for Learning to Rank Research with Clickthrough Data Otávio D. A. Alcântara 1, Álvaro R. Pereira Jr. 3, Humberto M. de Almeida 1, Marcos A. Gonçalves 1, Christian Middleton

More information

BigTable: A Distributed Storage System for Structured Data (2006) Slides adapted by Tyler Davis

BigTable: A Distributed Storage System for Structured Data (2006) Slides adapted by Tyler Davis BigTable: A Distributed Storage System for Structured Data (2006) Slides adapted by Tyler Davis Motivation Lots of (semi-)structured data at Google URLs: Contents, crawl metadata, links, anchors, pagerank,

More information

A Few Things to Know about Machine Learning for Web Search

A Few Things to Know about Machine Learning for Web Search AIRS 2012 Tianjin, China Dec. 19, 2012 A Few Things to Know about Machine Learning for Web Search Hang Li Noah s Ark Lab Huawei Technologies Talk Outline My projects at MSRA Some conclusions from our research

More information