Refining searches. Refine initially: query expansion. Refining after search. Explicit user feedback

Similar documents
Information Retrieval. Techniques for Relevance Feedback

Information Retrieval and Web Search

Relevance Feedback and Query Reformulation. Lecture 10 CS 510 Information Retrieval on the Internet Thanks to Susan Price. Outline

Outline. Possible solutions. The basic problem. How? How? Relevance Feedback, Query Expansion, and Inputs to Ranking Beyond Similarity

Query reformulation CE-324: Modern Information Retrieval Sharif University of Technology

Relevance Feedback and Query Expansion. Debapriyo Majumdar Information Retrieval Spring 2015 Indian Statistical Institute Kolkata

The IR Black Box. Anomalous State of Knowledge. The Information Retrieval Cycle. Different Types of Interactions. Upcoming Topics.

Collaborative Filtering and Recommender Systems. Definitions. .. Spring 2009 CSC 466: Knowledge Discovery from Data Alexander Dekhtyar..

Chapter 6: Information Retrieval and Web Search. An introduction

CSCI 599: Applications of Natural Language Processing Information Retrieval Retrieval Models (Part 1)

Relevance Feedback & Other Query Expansion Techniques

10/10/13. Traditional database system. Information Retrieval. Information Retrieval. Information retrieval system? Information Retrieval Issues

Part 11: Collaborative Filtering. Francesco Ricci

COMP6237 Data Mining Making Recommendations. Jonathon Hare

Lecture 7: Relevance Feedback and Query Expansion

Introduction to Data Mining

Information Retrieval

Evaluation. David Kauchak cs160 Fall 2009 adapted from:

CS 124/LINGUIST 180 From Languages to Information

Unsupervised Learning. Presenter: Anil Sharma, PhD Scholar, IIIT-Delhi

Ordered Pairs. 2 f(2) = 4 (2, 4) 1 f(1) = 1 (1, 1) Outputs. Inputs

Information Retrieval. Information Retrieval and Web Search

Recommender Systems - Content, Collaborative, Hybrid

Retrieval Evaluation. Hongning Wang

Information Retrieval. Lecture 7

Instructor: Stefan Savev

Chapter 8. Evaluating Search Engine

Thanks to Jure Leskovec, Anand Rajaraman, Jeff Ullman

Retrieval by Content. Part 3: Text Retrieval Latent Semantic Indexing. Srihari: CSE 626 1

Information Retrieval and Web Search

Information Retrieval

Evaluation of Retrieval Systems

Recommender Systems - Introduction. Data Mining Lecture 2: Recommender Systems

Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University Infinite data. Filtering data streams

Data Mining Lecture 2: Recommender Systems

- Content-based Recommendation -

Preference Learning in Recommender Systems

Query Operations. Relevance Feedback Query Expansion Query interpretation

Recommender Systems (RSs)

5/13/2009. Introduction. Introduction. Introduction. Introduction. Introduction

Searching non-text information objects

Search Engines. Information Retrieval in Practice. Annotations by Michael L. Nelson

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Search Engines Chapter 8 Evaluating Search Engines Felix Naumann

Diversity in Recommender Systems Week 2: The Problems. Toni Mikkola, Andy Valjakka, Heng Gui, Wilson Poon

Information retrieval

Modern Information Retrieval

CS473: Course Review CS-473. Luo Si Department of Computer Science Purdue University

A novel supervised learning algorithm and its use for Spam Detection in Social Bookmarking Systems

CSCI 5417 Information Retrieval Systems. Jim Martin!

Gene Clustering & Classification

INF4820, Algorithms for AI and NLP: Evaluating Classifiers Clustering

James Mayfield! The Johns Hopkins University Applied Physics Laboratory The Human Language Technology Center of Excellence!

Recommender Systems. Techniques of AI

Sec. 8.7 RESULTS PRESENTATION

Recommender Systems 6CCS3WSN-7CCSMWAL

Boolean Model. Hongning Wang

Information Retrieval

CMPSCI 646, Information Retrieval (Fall 2003)

Ranked Retrieval. Evaluation in IR. One option is to average the precision scores at discrete. points on the ROC curve But which points?

Machine vision. Summary # 5: Morphological operations

Information Retrieval

Information Retrieval. CS630 Representing and Accessing Digital Information. What is a Retrieval Model? Basic IR Processes

Information Retrieval

Combining Information Retrieval and Relevance Feedback for Concept Location

Learning Ranking Functions with Implicit Feedback

Search Engine Architecture. Hongning Wang

INFSCI 2140 Information Storage and Retrieval Lecture 6: Taking User into Account. Ad-hoc IR in text-oriented DS

vector space retrieval many slides courtesy James Amherst

Semantic Search in s

Centralities (4) By: Ralucca Gera, NPS. Excellence Through Knowledge

Search Evaluation. Tao Yang CS293S Slides partially based on text book [CMS] [MRS]

BBS654 Data Mining. Pinar Duygulu

Optimal Query. Assume that the relevant set of documents C r. 1 N C r d j. d j. Where N is the total number of documents.

EPL660: INFORMATION RETRIEVAL AND SEARCH ENGINES. Slides by Manning, Raghavan, Schutze

Evaluation of Retrieval Systems

Recommendation Algorithms: Collaborative Filtering. CSE 6111 Presentation Advanced Algorithms Fall Presented by: Farzana Yasmeen

Web Information Retrieval. Exercises Evaluation in information retrieval

An Adaptive Agent for Web Exploration Based on Concept Hierarchies

Clustering. Huanle Xu. Clustering 1

THIS LECTURE. How do we know if our results are any good? Results summaries: Evaluating a search engine. Making our good results usable to a user

Optimizing Search Engines using Click-through Data

CS6375: Machine Learning Gautam Kunapuli. Mid-Term Review

Ranking and Learning. Table of Content. Weighted scoring for ranking Learning to rank: A simple example Learning to ranking as classification.

Collaborative Filtering based on User Trends

SOCIAL MEDIA MINING. Data Mining Essentials

Neighborhood-Based Collaborative Filtering

PERSONALIZED TAG RECOMMENDATION

Collaborative Filtering and Evaluation of Recommender Systems

Available online at ScienceDirect. Procedia Technology 17 (2014)

The Principle and Improvement of the Algorithm of Matrix Factorization Model based on ALS

Text Categorization (I)

This lecture: IIR Sections Ranked retrieval Scoring documents Term frequency Collection statistics Weighting schemes Vector space scoring

Keywords: geolocation, recommender system, machine learning, Haversine formula, recommendations

Evaluation of Retrieval Systems

A Review on Relevance Feedback in CBIR

Transcription:

Refining searches
- Refine initially: query expansion
  - Commonly, add synonyms
  - Improves recall; may it hurt precision?
  - Sometimes done automatically
- Modify based on prior searches
  - Not automatic
  - All prior searches vs. your prior searches
  - Example: Yahoo; Google does too

Refining after search
- Use user feedback, or approximate feedback with the first results: pseudo-feedback (example: Yahoo assist)
- Either change the ranking of the current results, or search again with the modified query

Explicit user feedback
- User must participate
- User marks (some) relevant results, or user changes the order of results
- Pros and cons?

Explicit user feedback (continued)
- Changing the order of results can be more nuanced than relevant/not relevant, but can also be less accurate
- Example: the user moves the 10th item to first. This says the 10th is better than the first 9, but does not say which, if any, of the first 9 are relevant

User feedback in the classic vector model
- User marks the top p documents for relevance (a small p is typical)
- Construct new weights for the terms in the query vector; this modifies the query
- Could also be used just on the initial results, to re-rank them
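A minimal sketch of the pseudo-feedback idea above: assume the top-k results are relevant and expand the query with frequent terms from them. The bag-of-words documents and the term-selection rule are illustrative assumptions, not the lecture's code.

```python
# Pseudo-relevance feedback sketch: expand the query using the top-k results.
from collections import Counter

def pseudo_feedback_expand(query_terms, ranked_docs, k=3, n_terms=2):
    """Assume the top-k results are relevant; add their most frequent new terms."""
    assumed_relevant = ranked_docs[:k]          # pseudo-feedback: no user input needed
    counts = Counter()
    for doc in assumed_relevant:
        counts.update(doc)
    expansion = [t for t, _ in counts.most_common() if t not in query_terms]
    return list(query_terms) + expansion[:n_terms]

docs = [["jaguar", "car", "speed"], ["jaguar", "cat", "habitat"],
        ["jaguar", "car", "engine"], ["python", "snake"]]
print(pseudo_feedback_expand(["jaguar"], docs, k=3, n_terms=2))
# ['jaguar', 'car', 'speed'] -- 'car' dominates the assumed-relevant set
```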

Deriving a new query for the vector model
- For a collection C of n documents, let C_r denote the set of all relevant docs in the collection (perfect knowledge)
- Goal: \vec{q}_{opt} = \frac{1}{|C_r|} \sum_{\vec{d}_j \in C_r} \vec{d}_j - \frac{1}{n - |C_r|} \sum_{\vec{d}_k \notin C_r} \vec{d}_k
- The two sums are centroids of the relevant and non-relevant documents

Deriving a new query for the vector model: Rocchio algorithm
- Given a query q and relevance judgments for a subset of the retrieved docs
- Let D_r denote the set of docs judged relevant and D_nr the set of docs judged not relevant
- Modified query: \vec{q}_{new} = \alpha \vec{q} + \frac{\beta}{|D_r|} \sum_{\vec{d}_j \in D_r} \vec{d}_j - \frac{\gamma}{|D_{nr}|} \sum_{\vec{d}_k \in D_{nr}} \vec{d}_k
- for tunable weights α, β, γ

Remarks on the new query
- α: importance of the original query; β: importance of the effect of terms in relevant docs; γ: importance of the effect of terms in non-relevant docs
- Usually the terms of non-relevant docs are least important; reasonable values weight them lowest (e.g., α = 1, β = .7, with γ smallest)
- Reweighting terms leads to long queries: many more non-zero elements in \vec{q}_{new}; can reweight only the most important (most frequent?) terms
- Most useful to improve recall
- Users don't like it: work + waiting for new results

Simple example of user feedback in the vector model
- Given a query vector q, two docs d_1 and d_2 judged relevant, and one doc d_3 judged not relevant, apply the Rocchio formula to get \vec{q}_{new}
- Observe: term weights change, a new term enters the query, and weights can become negative

Re-ranking
- Example status? The Google experiment covered in class, which only affects repeats of the same search, is now the SearchWiki feature for Google accounts
- Algorithms are usually based on machine learning: learn the ranking function that best matches the partial ranking given

Comparing rankings
- The Kendall tau measure compares two orderings of the same set of n objects
- Let A = # pairs whose orders agree in the two orderings; let I = # pairs whose orders disagree in the two orderings (= # inversions)
- Kendall's \tau(\text{order}_1, \text{order}_2) = (A - I)/(A + I), an ordinal correlation
- Since A + I = \frac{1}{2} n(n-1), this equals 1 - I / (\frac{1}{4} n(n-1))
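A minimal sketch of the Rocchio update and the Kendall tau measure above, using plain dicts as sparse term vectors. The toy data is invented, and the default γ = 0.2 is a filled-in assumption (the slide's exact value did not survive).

```python
# Rocchio query update and Kendall tau, as sketched on the slides.

def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.7, gamma=0.2):
    """q_new = alpha*q + beta/|Dr| * sum(Dr) - gamma/|Dnr| * sum(Dnr)."""
    q_new = {t: alpha * w for t, w in query.items()}
    for docs, coef in ((relevant, beta), (nonrelevant, -gamma)):
        if not docs:
            continue
        for doc in docs:
            for t, w in doc.items():
                q_new[t] = q_new.get(t, 0.0) + coef * w / len(docs)
    return q_new   # note: weights can go negative, as the slides observe

def kendall_tau(order1, order2):
    """tau = (A - I) / (A + I) over all pairs of the n ranked objects."""
    pos = {x: i for i, x in enumerate(order2)}
    items = list(order1)
    agree = disagree = 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if pos[items[i]] < pos[items[j]]:
                agree += 1
            else:
                disagree += 1
    return (agree - disagree) / (agree + disagree)

q = rocchio({"jaguar": 1.0}, [{"jaguar": 1.0, "car": 1.0}], [{"cat": 1.0}])
print(q)                              # 'car' gains weight, 'cat' goes negative
print(kendall_tau("abcd", "abdc"))    # 1 inversion out of 6 pairs -> 4/6
```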

Implicit user feedback
- Click-throughs: use as relevance judgments
- Use for reranking: when a user clicks a result, move it ahead of all the results they didn't click that come before it
- Problems? Better? Single-user feedback vs. group feedback; compare

Recommender systems
- Items and users: recommend items to users
- Recommend new items based on their similarity to items that:
  - the user liked in the past: content-based
  - were liked by other users similar to this user: collaborative filtering
  - (items just liked by other users are the easier case)
- Documents matching a search = items?

Recommender system attributes
- Need explicit or implicit ratings by the user; a purchase is a 0/1 rating (movie tickets, books)
- Focused-category examples: music, courses, restaurants
- Hard to cross categories with content-based methods; easier with collaborative methods; do users share tastes across categories?

Content-based recommendation
- Items must have characteristics the user values; each item carries values for its characteristics
- Model each item as a vector of weights of characteristics, much like vector-based IR
- The user can give explicit preferences for certain characteristics

Content-based example
- Characteristics: 1st person, romance, mystery, sci-fi; the user bought book 1 and book 2 (what if they were actually rated?)
- Average the vectors of the books bought; score new books A and B by their dot products with the average
- Decide a threshold for recommendation

Example with explicit user preferences
- How to use the scores of the books bought? Try a preference vector p where component k = the user's preference for characteristic k, or the average component k of the books bought when the user's preference is "don't care" (a negative component expresses dislike)
- New scores: p · A and p · B
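A minimal sketch of the content-based scoring just described: average the vectors of the books bought, then score candidates by dot product. The characteristic weights, book vectors, and the 1.0 threshold are invented for illustration.

```python
# Content-based recommendation: items as vectors of characteristic weights.

CHARS = ["1st_person", "romance", "mystery", "sci_fi"]

def average_profile(bought):
    """Profile = componentwise average of the vectors of books bought."""
    return [sum(v[k] for v in bought) / len(bought) for k in range(len(CHARS))]

def score(profile, item):
    """Dot product, as in vector-based IR."""
    return sum(p * w for p, w in zip(profile, item))

book1 = [1.0, 1.0, 0.0, 0.0]     # hypothetical purchased books
book2 = [1.0, 0.0, 1.0, 0.0]
new_a = [1.0, 1.0, 0.0, 0.0]     # hypothetical candidate books
new_b = [0.0, 0.0, 0.0, 1.0]

profile = average_profile([book1, book2])
for name, item in (("A", new_a), ("B", new_b)):
    s = score(profile, item)
    print(name, s, "recommend" if s >= 1.0 else "skip")   # threshold = 1.0 (assumed)
```

An explicit preference vector would replace `profile` directly, with a negative component for a disliked characteristic.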

Content-based: issues
- Vector-based is one alternative; the major alternatives are based on machine learning
- For vector-based: how to build a preference vector; how to combine the vectors of items rated by the user (our example had only 0/1 ratings); how to include explicit user preferences; what metric to use for similarity between new items and the preference vector; normalization; threshold?

Limitations of content-based
- Can only recommend items similar to those the user rated highly
- New users: insufficient number of rated items
- Only considers features explicitly associated with items; does not include attributes of the user

Collaborative filtering
- Recommend new items liked by other users similar to this user
- Needs items already rated by this user and the other users; doesn't need characteristics of items: each rating by an individual user becomes a characteristic
- Can combine with item characteristics: hybrid content/collaborative

Method types (see the Adomavicius and Tuzhilin paper)
- Memory-based: similar to the vector model; use the (user × item) matrix and a similarity function; prediction is based on previously rated items
- Model-based: machine-learning methods; a model of probabilities over (users × items)

Memory-based: preliminaries
- Notation: r(u,i) = rating of item i by user u; I_u = set of items rated by user u; I_{u,v} = set of items rated by both users u and v; U_{i,j} = set of users that rated both items i and j
- Adjust scales for user differences using the average rating by user u: \bar{r}_u = \frac{1}{|I_u|} \sum_{i \in I_u} r(u,i)
- Adjusted ratings: r_{adj}(u,i) = r(u,i) - \bar{r}_u

One memory-based method: user similarities
- Similarity between users u and v: the Pearson correlation coefficient
- \mathrm{sim}(u,v) = \frac{\sum_{i \in I_{u,v}} r_{adj}(u,i)\, r_{adj}(v,i)}{\left( \sum_{i \in I_{u,v}} r_{adj}(u,i)^2 \cdot \sum_{i \in I_{u,v}} r_{adj}(v,i)^2 \right)^{1/2}}
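A minimal sketch of the adjusted ratings and the Pearson user similarity above. The ratings data is invented, and `mean`, `adj`, and `sim` are illustrative names.

```python
# Memory-based CF: mean-center ratings, then Pearson correlation over co-rated items.
from math import sqrt

ratings = {                      # user -> {item: rating}
    "u1": {"b1": 5, "b2": 3, "b3": 4},
    "u2": {"b1": 2, "b2": 4},
    "u4": {"b1": 4, "b2": 2, "b3": 5},
}

def mean(u):
    r = ratings[u]
    return sum(r.values()) / len(r)

def adj(u, i):
    """r_adj(u,i) = r(u,i) - average rating of u."""
    return ratings[u][i] - mean(u)

def sim(u, v):
    """Pearson correlation over I_{u,v}, the items both users rated."""
    common = ratings[u].keys() & ratings[v].keys()
    num = sum(adj(u, i) * adj(v, i) for i in common)
    den = sqrt(sum(adj(u, i) ** 2 for i in common) *
               sum(adj(v, i) ** 2 for i in common))
    return num / den if den else 0.0

print(round(sim("u1", "u4"), 3))   # positive: u1 and u4 rank the books the same way
print(round(sim("u2", "u4"), 3))   # negative: u2 disagrees with u4 on b1 vs b2
```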

One memory-based method: item similarities
- Similarity between items i and j: vectors of the ratings by users in U_{i,j}; cosine measure using adjusted ratings
- \mathrm{sim}(i,j) = \frac{\sum_{u \in U_{i,j}} r_{adj}(u,i)\, r_{adj}(u,j)}{\left( \sum_{u \in U_{i,j}} r_{adj}(u,i)^2 \cdot \sum_{u \in U_{i,j}} r_{adj}(u,j)^2 \right)^{1/2}}

Predicting a user's rating of a new item: user-based
- For item i not rated by user u: r_{pred}(u,i) = \bar{r}_u + \frac{\sum_{v \in S} \mathrm{sim}(u,v)\, r_{adj}(v,i)}{\sum_{v \in S} |\mathrm{sim}(u,v)|}
- S can be all users or just the users most similar to u

Predicting a user's rating of a new item: item-based
- For item i not rated by user u: r_{item\text{-}pred}(u,i) = \frac{\sum_{j \in T} \mathrm{sim}(i,j)\, r(u,j)}{\sum_{j \in T} |\mathrm{sim}(i,j)|}
- T can be all items or just the items most similar to i
- The prediction uses only u's ratings, but the similarities use other users' ratings

Collaborative filtering example
- A table of ratings and adjusted ratings for users u1 through u4 over books 1 through 4, with u4's rating of book 4 unknown
- sim(u1,u4) = .894; sim(u2,u4) = -.447; sim(u3,u4) = 0
- predict r(u4, book4) = \bar{r}_{u4} + \frac{.894\, r_{adj}(u1, \text{book4}) - .447\, r_{adj}(u2, \text{book4}) + 0}{.894 + .447 + 0} = -.9

Limitations
- May not have enough ratings for new users; new items may not be rated by enough users
- Need a critical mass of users
- All similarities are based on user ratings
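A minimal sketch of the user-based prediction formula above, with absolute values of the similarities in the denominator as in the worked example. The ratings and the precomputed similarities are invented, not the slide's table.

```python
# User-based prediction: r_pred(u,i) = r_avg(u) + sum_v sim(u,v)*r_adj(v,i) / sum_v |sim(u,v)|.

ratings = {                        # user -> {item: rating}
    "u1": {"b1": 5, "b2": 1, "b4": 5},
    "u2": {"b1": 1, "b2": 5, "b4": 1},
    "u4": {"b1": 5, "b2": 1},      # u4 has not rated b4
}
sims = {"u1": 0.9, "u2": -0.8}     # precomputed sim(u4, v), assumed values

def mean(u):
    r = ratings[u]
    return sum(r.values()) / len(r)

def predict(u, i, neighbors):
    num = den = 0.0
    for v in neighbors:
        if i in ratings[v]:                           # neighbor must have rated i
            num += sims[v] * (ratings[v][i] - mean(v))
            den += abs(sims[v])                       # absolute values, per the example
    return mean(u) + (num / den if den else 0.0)

print(round(predict("u4", "b4", ["u1", "u2"]), 2))
# u1 (similar) rated b4 above their mean and u2 (dissimilar) below theirs,
# so the prediction lands above u4's own average of 3.0
```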

Recommendation techniques and search: content-based
- The analogy with content-based recommendation is query refinement with user feedback:
  - item ↔ document
  - characteristic ↔ term
  - user preferences ↔ initial query
  - rating ↔ user rating of relevance
  - ratings for previous items ↔ initial results

Recommendation techniques and search: collaborative filtering
- Analogy with product recommendation? Users' behavior on the same search, i.e. the same query:
  - item ↔ search result
  - rating ↔ clicked / not clicked on
- Predict whether a user will click on a result based on the behavior of similar users, where user similarity is based on what they have both clicked on for this search (see the sketch below)
- More generally: predictions of the best results based on notions of user similarity; hybrid content and collaborative approaches
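A minimal sketch of this collaborative-search analogy: each user's clicks for one query act as 0/1 ratings, and users are compared by the overlap of their clicks. Jaccard similarity is an assumed choice here, not from the slides, and the click data is invented.

```python
# Collaborative filtering over click data for a single query.

clicks = {                          # user -> set of results clicked for one query
    "u1": {"r1", "r3", "r5"},
    "u2": {"r1", "r3"},
    "u3": {"r2", "r4"},
}

def jaccard(a, b):
    """Overlap of two click sets: |a & b| / |a | b|."""
    return len(a & b) / len(a | b) if a | b else 0.0

def predict_click(user, result):
    """Weight each other user's click/no-click on `result` by their similarity to `user`."""
    others = [v for v in clicks if v != user]
    num = sum(jaccard(clicks[user], clicks[v]) * (result in clicks[v]) for v in others)
    den = sum(jaccard(clicks[user], clicks[v]) for v in others)
    return num / den if den else 0.0

print(round(predict_click("u2", "r5"), 2))   # boosted: u1, who clicks like u2, clicked r5
```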