Information Filtering SI650: Information Retrieval

Size: px
Start display at page:

Download "Information Filtering SI650: Information Retrieval"

Transcription

1 Information Filtering SI650: Information Retrieval Winter 2010 School of Information University of Michigan Many slides are from Prof. ChengXiang Zhai s lecture 1

2 Lecture Plan Filtering vs. Retrieval Content-based filtering (adaptive filtering) Collaborative filtering (recommender systems) 2

3 Short vs. Long Term Info Need Short-term information need (Ad hoc retrieval) Temporary need, e.g., info about used cars Information source is relatively static User pulls information Application example: library search, Web search Long-term information need (Filtering) Stable need, e.g., new data mining algorithms Information source is dynamic System pushes information to user Applications: news filter, recommender systems 3

4 Examples of Information Filtering News filtering filtering Movie/book recommenders Literature recommenders And many others 4

5 Content-based Filtering vs. Collaborative Filtering Basic filtering question: Will user U like item X? Two different ways of answering it Look at what U likes Look at who likes X Can be combined characterize X content-based filtering characterize U collaborative filtering 5

6 Content-based filtering is also called 1. Adaptive Information Filtering in TREC 2. Selective Dissemination of Information (SDI) in Library & Information Science Collaborative filtering is also called Recommender Systems 6

7 Why Bother? Why do we need recommendation? Why should amazon provide recommendation? Why does Netflix spend millions to improve their recommendation algorithm? Why can t we rely on a search engine for all the recommendation? 7

8 The Long Tail Need a way to discover items that match our interests & tastes among tens or hundreds of thousands Chris Anderson, The Long Tail, Wired, Issue October

9 Part I: Adaptive Filtering 9

10 Adaptive Information Filtering Stable & long term interest, dynamic info. source System must make a delivery decision immediately as a document arrives my interest: Filtering System 10

11 AIF vs. Retrieval & Categorization Like retrieval over a dynamic stream of docs, but (global) ranking is impossible Like online binary categorization, but with no initial training data and with limited feedback Typically evaluated with a utility function Each delivered doc gets a utility value Good doc gets a positive value Bad doc gets a negative value E.g., Utility = 3* #good - 2 *#bad (linear utility) 11

12 A Typical AIF System Initialization User profile text... Doc Source Binary Classifier User Interest Profile Accepted Docs User utility func. Accumulated Docs Learning Feedback 12

13 Three Basic Problems in AIF Making filtering decision (Binary classifier) Doc text, profile text yes/no Initialization Initialize the filter based on only the profile text or very few examples Learning from Limited relevance judgments (only on yes docs) Accumulated documents All trying to maximize the utility 13

14 Major Approaches to AIF Extended retrieval systems Reuse retrieval techniques to score documents Use a score threshold for filtering decision Learn to improve scoring with traditional feedback New approaches to threshold setting and learning Modified categorization systems Adapt to binary, unbalanced categorization New approaches to initialization Train with censored training examples 14

15 Difference between Ranking, Classification, and Recommendation What is the query? Ranking: no query Classification: labeled data (a.k.a, training data) Recommendation: user + items What is the target? Ranking: order of objects Classification: labels of objects Recommendation: Score? Label? Order? 15

16 A General Vector-Space Approach doc vector Scoring Thresholding no yes Utility Evaluation profile vector threshold Vector Learning Threshold Learning Feedback Information 16

17 Difficulties in Threshold Learning 36.5 R 33.4 N 32.1 R 29.9? 27.3?... =30.0 Censored data Little/none labeled data Scoring bias due to vector learning Exploration vs. Exploitation 17

18 Threshold Setting in Extended Retrieval Systems Utility-independent approaches (generally not working well, not covered in this lecture) Indirect (linear) utility optimization Logistic regression (score prob. of relevance) Direct utility optimization Empirical utility optimization Expected utility optimization given score distributions All try to learn the optimal threshold 18

19 Logistic Regression (Robertson & Walker. 00) General idea: convert score of D to p(r D) logo( R D) s( D) Fit the model using feedback data Linear utility is optimized with a fixed prob. cutoff But, Possibly incorrect parametric assumptions No positive examples initially Censored data and limited positive feedback Doesn t address the issue of exploration 19

20 Given Direct Utility Optimization A utility function U(C R+,C R-,C N+,C N- ) Training data D={<s i, {R,N,?}>} Formulate utility as a function of the threshold and training data: U=F(,D) Choose the threshold by optimizing F(,D), i.e., arg max F(, D) 20

21 Basic idea Empirical Utility Optimization Compute the utility on the training data for each candidate threshold (score of a training doc) Choose the threshold that gives the maximum utility Difficulty: Biased training sample! We can only get an upper bound for the true optimal threshold. Solutions: Heuristic adjustment(lowering) of threshold Lead to beta-gamma threshold learning 21

22 Score Distribution Approaches ( Aramptzis & Hameren 01; Zhang & Callan 01) Assume generative model of scores p(s R), p(s N) Estimate the model with training data Find the threshold by optimizing the expected utility under the estimated model Specific methods differ in the way of defining and estimating the scoring distributions 22

23 Gaussian-Exponential Distributions P(s R) ~ N(, 2 ) p(s-s 0 N) ~ E() ) ( ) ( ) ( ) ( s s s e N s s p e R s p (From Zhang & Callan 2001) 23

24 Score Distribution Approaches (cont.) Pros Principled approach Arbitrary utility Empirically effective Cons May be sensitive to the scoring function Exploration not addressed 24

25 Part II: Collaborative Filtering 25

26 What is Collaborative Filtering (CF)? Making filtering decisions for an individual user based on the judgments of other users Inferring individual s interest/preferences from that of other similar users General idea Given a user u, find similar users {u 1,, u m } Predict u s preferences based on the preferences of u 1,, u m 26

27 CF: Assumptions Users with a common interest will have similar preferences Users with similar preferences probably share the same interest Examples interest is IR favor SIGIR papers favor SIGIR papers interest is IR Sufficiently large number of user preferences are available 27

28 CF: Intuitions User similarity (Paul Resnick vs. Rahul Sami) If Paul liked the paper, Rahul will like the paper? If Paul liked the movie, Rahul will like the movie Suppose Paul and Rahul viewed similar movies in the past six months Item similarity Since 90% of those who liked Star Wars also liked Independence Day, and, you liked Star Wars You may also like Independence Day The content of items didn t matter! 28

29 Rating-based vs. Preference-based Rating-based: User s preferences are encoded using numerical ratings on items Complete ordering Absolute values can be meaningful But, values must be normalized to combine Preference-based: User s preferences are represented by partial ordering of items (Learning to Rank!) Partial ordering Easier to exploit implicit preferences 29

30 A Formal Framework for Rating Objects: O Users: U o 1 o 2 o j o n X ij =f(u i,o j )=? u 1 u 2 u i... u m ? Unknown function f: U x O R The task Assume known f values for some (u,o) s Predict f values for other (u,o) s Essentially function approximation, like other learning problems 30

31 Where are the intuitions? Similar users have similar preferences If u u, then for all o s, f(u,o) f(u,o) Similar objects have similar user preferences If o o, then for all u s, f(u,o) f(u,o ) In general, f is locally constant If u u and o o, then f(u,o) f(u,o ) Local smoothness makes it possible to predict unknown values by interpolation or extrapolation What does local mean? 31

32 Two Groups of Approaches Memory-based approaches f(u, o) = g(u)(o) g(u )(o) if u u (g =preference function) Find neighbors of u and combine g(u )(o) s Model-based approaches (not covered) Assume structures/model: object cluster, user cluster, f defined on clusters f(u, o) = f (c u, c o ) Estimation & Probabilistic inference 32

33 General ideas: Memory-based Approaches (Resnick et al. 94) x ij : rating of object j by user i n i : average rating of all objects by user i Normalized ratings: Memory-based prediction v k m w( a, i) k / w( a, i) dj v ij i1 m 1 xaj vaj na i1 Specific approaches differ in w(a, i) -- the distance/similarity between user a and i v ij x ij n i 33

34 User Similarity Measures Pearson correlation coefficient (sum over commonly rated items) Cosine measure Many other possibilities! j i ij j a aj j i ij a aj p n x n x n x n x i a w 2 2 ) ( ) ( ) )( ( ), ( n j ij n j aj n j ij aj c x x x x i a w ), ( 34

35 Improving User Similarity Measures (Breese et al. 98) Dealing with missing values: default ratings Inverse User Frequency (IUF): similar to IDF Case Amplification: use w(a, i) p, e.g., p=2.5 35

36 A Network Interpretation If your neighbors love it, you are likely to love it Closer neighbors make a larger impact Measure the closeness by the items you and her liked/disliked 36

37 Network based approaches No clear notion of distance now, but the scores are estimated based on some propagation process. Score propagation based on random walk Two approaches: Starting from the user, how likely/how long can I reach the object? Starting from the object, how likely/how long can I hit the user? 37

38 Random Walk and Hitting Time P = k A i P = j Hitting Time T A : the first time that the random walk is at a vertex in A Mean Hitting Time h ia : expectation of T A given that the walk starts from vertex i 38

39 Computing Hitting Time h i A = 0.7 h j A h k A i 0.7 k j h = 0 A T A : the first time that the random walk is at a vertex in A T A min{ t : X t A, t 0} h ia : expectation of T A given that the walk starting from vertex i Apparently, h i A = 0 for those ia A h i jv A p( i j) 1, for ia h j 0, for ia Iterative Computation 39

40 A i Bipartite Graph and Hitting Time V 1 k V w(i, j) = 3 w( i, j) 3 p( j i) dw j ( i, j) (3w ( 1) k, j) p( i k) w( i, j) 3 jv 2 p( di i j) d j (3 7) j d i Bipartite Graph: - Edges between V 1 and V 2 - No edge inside V 1 or V 2 - Edges are weighted - e.g., V1 = query; V2 = Url Expected proximity of query i to the query A : hitting time of i A, h i A convert to a directed graph, even collapse one group 40

41 Example: Query Suggestion Welcome to the hotel california Suggestions hotel california eagles hotel california hotel california band hotel california by the eagles hotel california song lyrics of hotel california listen hotel california eagle

42 Generating Query Suggestion using Click Query aa american airline A mexiana freq(q, url) Graph (Mei et al. 08) Url planner_main.jsp en.wikipedia.org/wiki/mexicana Construct a (knn) subgraph from the query log data (of a predefined number of queries/urls) Compute transition probabilities p(i j) Compute hitting time h i A Rank candidate queries using h i A 42

43 Query = friends Result: Query Suggestion Google friendship friends poem friendster friends episode guide friends scripts how to make friends true friends Yahoo secret friends friends reunited hide friends hi 5 friends find friends poems for friends friends quotes Hitting time wikipedia friends friends tv show wikipedia friends home page friends warner bros the friends series friends official site friends(1994) 43

44 Summary Filtering is easy The user s expectation is low Any recommendation is better than none Making it practically useful Filtering is hard Must make a binary decision, though ranking is also possible Data sparseness Cold start 44

45 Example: NetFlix 45

46 References (Adaptive Filtering) General papers on TREC filtering evaluation D. Hull, The TREC-7 Filtering Track: Description and Analysis, TREC-7 Proceedings. D. Hull and S. Robertson, The TREC-8 Filtering Track Final Report, TREC-8 Proceedings. S. Robertson and D. Hull, The TREC-9 Filtering Track Final Report, TREC-9 Proceedings. S. Robertson and I. Soboroff, The TREC 2001 Filtering Track Final Report, TREC-10 Proceedings Papers on specific adaptive filtering methods Stephen Robertson and Stephen Walker (2000), Threshold Setting in Adaptive Filtering. Journal of Documentation, 56: , 2000 Chengxiang Zhai, Peter Jansen, and David A. Evans, Exploration of a heuristic approach to threshold learning in adaptive filtering, 2000 ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'00), Poster presentation. Avi Arampatzis and Andre van Hameren The Score-Distributional Threshold Optimization for Adaptive Binary Classification Tasks, SIGIR'2001. Yi Zhang and Jamie Callan, 2001, Maximum Likelihood Estimation for Filtering Threshold, SIGIR T. Ault and Y. Yang, 2002, knn, Rocchio and Metrics for Information Filtering 46 at TREC-10, TREC-10 Proceedings.

47 References (Collaborative Filtering) Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., and Riedl, J., "GroupLens: An Open Architecture for Collaborative Filtering of Netnews," CSCW P Resnick, HR Varian, Recommender Systems, CACM 1997 Cohen, W.W., Schapire, R.E., and Singer, Y. (1999) "Learning to Order Things", Journal of AI Research, Volume 10, pages Freund, Y., Iyer, R.,Schapire, R.E., & Singer, Y. (1999). An efficient boosting algorithm for combining preferences. Machine Learning Journal Breese, J. S., Heckerman, D., and Kadie, C. (1998). Empirical Analysis of Predictive Algorithms for Collaborative Filtering. In Proceedings of the 14th Conference on Uncertainty in Articial Intelligence, pp Alexandrin Popescul and Lyle H. Ungar, Probabilistic Models for Unified Collaborative and Content-Based Recommendation in Sparse-Data Environments, UAI N. Good, J.B. Schafer, J. Konstan, A. Borchers, B. Sarwar, J. Herlocker, and J. Riedl. "Combining Collaborative Filtering with Personal Agents for Better Recommendations." Proceedings AAAI-99. pp Qiaozhu Mei, Dengyong Zhou, Kenneth Church. Query Suggestion Using Hitting Time, CIKM'08, pages

Query Sugges*ons. Debapriyo Majumdar Information Retrieval Spring 2015 Indian Statistical Institute Kolkata

Query Sugges*ons. Debapriyo Majumdar Information Retrieval Spring 2015 Indian Statistical Institute Kolkata Query Sugges*ons Debapriyo Majumdar Information Retrieval Spring 2015 Indian Statistical Institute Kolkata Search engines User needs some information search engine tries to bridge this gap ssumption: the

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining Lecture #7: Recommendation Content based & Collaborative Filtering Seoul National University In This Lecture Understand the motivation and the problem of recommendation Compare

More information

arxiv: v1 [cs.ir] 1 Jul 2016

arxiv: v1 [cs.ir] 1 Jul 2016 Memory Based Collaborative Filtering with Lucene arxiv:1607.00223v1 [cs.ir] 1 Jul 2016 Claudio Gennaro claudio.gennaro@isti.cnr.it ISTI-CNR, Pisa, Italy January 8, 2018 Abstract Memory Based Collaborative

More information

A Time-based Recommender System using Implicit Feedback

A Time-based Recommender System using Implicit Feedback A Time-based Recommender System using Implicit Feedback T. Q. Lee Department of Mobile Internet Dongyang Technical College Seoul, Korea Abstract - Recommender systems provide personalized recommendations

More information

CS473: Course Review CS-473. Luo Si Department of Computer Science Purdue University

CS473: Course Review CS-473. Luo Si Department of Computer Science Purdue University CS473: CS-473 Course Review Luo Si Department of Computer Science Purdue University Basic Concepts of IR: Outline Basic Concepts of Information Retrieval: Task definition of Ad-hoc IR Terminologies and

More information

Text Categorization (I)

Text Categorization (I) CS473 CS-473 Text Categorization (I) Luo Si Department of Computer Science Purdue University Text Categorization (I) Outline Introduction to the task of text categorization Manual v.s. automatic text categorization

More information

Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University Infinite data. Filtering data streams

Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University  Infinite data. Filtering data streams /9/7 Note to other teachers and users of these slides: We would be delighted if you found this our material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them

More information

Thanks to Jure Leskovec, Anand Rajaraman, Jeff Ullman

Thanks to Jure Leskovec, Anand Rajaraman, Jeff Ullman Thanks to Jure Leskovec, Anand Rajaraman, Jeff Ullman http://www.mmds.org Overview of Recommender Systems Content-based Systems Collaborative Filtering J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive

More information

Performance Comparison of Algorithms for Movie Rating Estimation

Performance Comparison of Algorithms for Movie Rating Estimation Performance Comparison of Algorithms for Movie Rating Estimation Alper Köse, Can Kanbak, Noyan Evirgen Research Laboratory of Electronics, Massachusetts Institute of Technology Department of Electrical

More information

BBS654 Data Mining. Pinar Duygulu

BBS654 Data Mining. Pinar Duygulu BBS6 Data Mining Pinar Duygulu Slides are adapted from J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org Mustafa Ozdal Example: Recommender Systems Customer X Buys Metallica

More information

EigenRank: A Ranking-Oriented Approach to Collaborative Filtering

EigenRank: A Ranking-Oriented Approach to Collaborative Filtering EigenRank: A Ranking-Oriented Approach to Collaborative Filtering ABSTRACT Nathan N. Liu Department of Computer Science and Engineering Hong Kong University of Science and Technology, Hong Kong, China

More information

Part 11: Collaborative Filtering. Francesco Ricci

Part 11: Collaborative Filtering. Francesco Ricci Part : Collaborative Filtering Francesco Ricci Content An example of a Collaborative Filtering system: MovieLens The collaborative filtering method n Similarity of users n Methods for building the rating

More information

Boolean Model. Hongning Wang

Boolean Model. Hongning Wang Boolean Model Hongning Wang CS@UVa Abstraction of search engine architecture Indexed corpus Crawler Ranking procedure Doc Analyzer Doc Representation Query Rep Feedback (Query) Evaluation User Indexer

More information

A Machine Learning Approach for Information Retrieval Applications. Luo Si. Department of Computer Science Purdue University

A Machine Learning Approach for Information Retrieval Applications. Luo Si. Department of Computer Science Purdue University A Machine Learning Approach for Information Retrieval Applications Luo Si Department of Computer Science Purdue University Why Information Retrieval: Information Overload: Since the introduction of digital

More information

A Constrained Spreading Activation Approach to Collaborative Filtering

A Constrained Spreading Activation Approach to Collaborative Filtering A Constrained Spreading Activation Approach to Collaborative Filtering Josephine Griffith 1, Colm O Riordan 1, and Humphrey Sorensen 2 1 Dept. of Information Technology, National University of Ireland,

More information

Automatically Building Research Reading Lists

Automatically Building Research Reading Lists Automatically Building Research Reading Lists Michael D. Ekstrand 1 Praveen Kanaan 1 James A. Stemper 2 John T. Butler 2 Joseph A. Konstan 1 John T. Riedl 1 ekstrand@cs.umn.edu 1 GroupLens Research Department

More information

Jeff Howbert Introduction to Machine Learning Winter

Jeff Howbert Introduction to Machine Learning Winter Collaborative Filtering Nearest es Neighbor Approach Jeff Howbert Introduction to Machine Learning Winter 2012 1 Bad news Netflix Prize data no longer available to public. Just after contest t ended d

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS6: Mining Massive Datasets Jure Leskovec, Stanford University http://cs6.stanford.edu Customer X Buys Metalica CD Buys Megadeth CD Customer Y Does search on Metalica Recommender system suggests Megadeth

More information

CS 124/LINGUIST 180 From Languages to Information

CS 124/LINGUIST 180 From Languages to Information CS /LINGUIST 80 From Languages to Information Dan Jurafsky Stanford University Recommender Systems & Collaborative Filtering Slides adapted from Jure Leskovec Recommender Systems Customer X Buys Metallica

More information

CS 124/LINGUIST 180 From Languages to Information

CS 124/LINGUIST 180 From Languages to Information CS /LINGUIST 80 From Languages to Information Dan Jurafsky Stanford University Recommender Systems & Collaborative Filtering Slides adapted from Jure Leskovec Recommender Systems Customer X Buys CD of

More information

Recommendation Algorithms: Collaborative Filtering. CSE 6111 Presentation Advanced Algorithms Fall Presented by: Farzana Yasmeen

Recommendation Algorithms: Collaborative Filtering. CSE 6111 Presentation Advanced Algorithms Fall Presented by: Farzana Yasmeen Recommendation Algorithms: Collaborative Filtering CSE 6111 Presentation Advanced Algorithms Fall. 2013 Presented by: Farzana Yasmeen 2013.11.29 Contents What are recommendation algorithms? Recommendations

More information

Comparison of Recommender System Algorithms focusing on the New-Item and User-Bias Problem

Comparison of Recommender System Algorithms focusing on the New-Item and User-Bias Problem Comparison of Recommender System Algorithms focusing on the New-Item and User-Bias Problem Stefan Hauger 1, Karen H. L. Tso 2, and Lars Schmidt-Thieme 2 1 Department of Computer Science, University of

More information

Collaborative Filtering using a Spreading Activation Approach

Collaborative Filtering using a Spreading Activation Approach Collaborative Filtering using a Spreading Activation Approach Josephine Griffith *, Colm O Riordan *, Humphrey Sorensen ** * Department of Information Technology, NUI, Galway ** Computer Science Department,

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS6: Mining Massive Datasets Jure Leskovec, Stanford University http://cs6.stanford.edu //8 Jure Leskovec, Stanford CS6: Mining Massive Datasets High dim. data Graph data Infinite data Machine learning

More information

CS570: Introduction to Data Mining

CS570: Introduction to Data Mining CS570: Introduction to Data Mining Classification Advanced Reading: Chapter 8 & 9 Han, Chapters 4 & 5 Tan Anca Doloc-Mihu, Ph.D. Slides courtesy of Li Xiong, Ph.D., 2011 Han, Kamber & Pei. Data Mining.

More information

Feature selection. LING 572 Fei Xia

Feature selection. LING 572 Fei Xia Feature selection LING 572 Fei Xia 1 Creating attribute-value table x 1 x 2 f 1 f 2 f K y Choose features: Define feature templates Instantiate the feature templates Dimensionality reduction: feature selection

More information

CS 124/LINGUIST 180 From Languages to Information

CS 124/LINGUIST 180 From Languages to Information CS /LINGUIST 80 From Languages to Information Dan Jurafsky Stanford University Recommender Systems & Collaborative Filtering Slides adapted from Jure Leskovec Recommender Systems Customer X Buys CD of

More information

Recommender Systems. Collaborative Filtering & Content-Based Recommending

Recommender Systems. Collaborative Filtering & Content-Based Recommending Recommender Systems Collaborative Filtering & Content-Based Recommending 1 Recommender Systems Systems for recommending items (e.g. books, movies, CD s, web pages, newsgroup messages) to users based on

More information

Towards Time-Aware Semantic enriched Recommender Systems for movies

Towards Time-Aware Semantic enriched Recommender Systems for movies Towards Time-Aware Semantic enriched Recommender Systems for movies Marko Harasic, Pierre Ahrendt,Alexandru Todor, Adrian Paschke FU Berlin Abstract. With the World Wide Web moving from passive to active,

More information

Survey on Collaborative Filtering Technique in Recommendation System

Survey on Collaborative Filtering Technique in Recommendation System Survey on Collaborative Filtering Technique in Recommendation System Omkar S. Revankar, Dr.Mrs. Y.V.Haribhakta Department of Computer Engineering, College of Engineering Pune, Pune, India ABSTRACT This

More information

Community-Based Recommendations: a Solution to the Cold Start Problem

Community-Based Recommendations: a Solution to the Cold Start Problem Community-Based Recommendations: a Solution to the Cold Start Problem Shaghayegh Sahebi Intelligent Systems Program University of Pittsburgh sahebi@cs.pitt.edu William W. Cohen Machine Learning Department

More information

Federated Search. Jaime Arguello INLS 509: Information Retrieval November 21, Thursday, November 17, 16

Federated Search. Jaime Arguello INLS 509: Information Retrieval November 21, Thursday, November 17, 16 Federated Search Jaime Arguello INLS 509: Information Retrieval jarguell@email.unc.edu November 21, 2016 Up to this point... Classic information retrieval search from a single centralized index all ueries

More information

Influence in Ratings-Based Recommender Systems: An Algorithm-Independent Approach

Influence in Ratings-Based Recommender Systems: An Algorithm-Independent Approach Influence in Ratings-Based Recommender Systems: An Algorithm-Independent Approach Al Mamunur Rashid George Karypis John Riedl Abstract Recommender systems have been shown to help users find items of interest

More information

Web Personalization & Recommender Systems

Web Personalization & Recommender Systems Web Personalization & Recommender Systems COSC 488 Slides are based on: - Bamshad Mobasher, Depaul University - Recent publications: see the last page (Reference section) Web Personalization & Recommender

More information

Knowledge Discovery and Data Mining 1 (VO) ( )

Knowledge Discovery and Data Mining 1 (VO) ( ) Knowledge Discovery and Data Mining 1 (VO) (707.003) Data Matrices and Vector Space Model Denis Helic KTI, TU Graz Nov 6, 2014 Denis Helic (KTI, TU Graz) KDDM1 Nov 6, 2014 1 / 55 Big picture: KDDM Probability

More information

Recommender Systems: Attack Types and Strategies

Recommender Systems: Attack Types and Strategies Recommender Systems: Attack Types and Strategies Michael P. O Mahony and Neil J. Hurley and Guénolé C.M. Silvestre University College Dublin Belfield, Dublin 4 Ireland michael.p.omahony@ucd.ie Abstract

More information

Part 11: Collaborative Filtering. Francesco Ricci

Part 11: Collaborative Filtering. Francesco Ricci Part : Collaborative Filtering Francesco Ricci Content An example of a Collaborative Filtering system: MovieLens The collaborative filtering method n Similarity of users n Methods for building the rating

More information

Privacy-Preserving Collaborative Filtering using Randomized Perturbation Techniques

Privacy-Preserving Collaborative Filtering using Randomized Perturbation Techniques Privacy-Preserving Collaborative Filtering using Randomized Perturbation Techniques Huseyin Polat and Wenliang Du Systems Assurance Institute Department of Electrical Engineering and Computer Science Syracuse

More information

A Formal Approach to Score Normalization for Meta-search

A Formal Approach to Score Normalization for Meta-search A Formal Approach to Score Normalization for Meta-search R. Manmatha and H. Sever Center for Intelligent Information Retrieval Computer Science Department University of Massachusetts Amherst, MA 01003

More information

Project Report. An Introduction to Collaborative Filtering

Project Report. An Introduction to Collaborative Filtering Project Report An Introduction to Collaborative Filtering Siobhán Grayson 12254530 COMP30030 School of Computer Science and Informatics College of Engineering, Mathematical & Physical Sciences University

More information

Query Likelihood with Negative Query Generation

Query Likelihood with Negative Query Generation Query Likelihood with Negative Query Generation Yuanhua Lv Department of Computer Science University of Illinois at Urbana-Champaign Urbana, IL 61801 ylv2@uiuc.edu ChengXiang Zhai Department of Computer

More information

Web Personalization & Recommender Systems

Web Personalization & Recommender Systems Web Personalization & Recommender Systems COSC 488 Slides are based on: - Bamshad Mobasher, Depaul University - Recent publications: see the last page (Reference section) Web Personalization & Recommender

More information

Semantic feedback for hybrid recommendations in Recommendz

Semantic feedback for hybrid recommendations in Recommendz Semantic feedback for hybrid recommendations in Recommendz Matthew Garden and Gregory Dudek McGill University Centre For Intelligent Machines 3480 University St, Montréal, Québec, Canada H3A 2A7 {mgarden,

More information

amount of available information and the number of visitors to Web sites in recent years

amount of available information and the number of visitors to Web sites in recent years Collaboration Filtering using K-Mean Algorithm Smrity Gupta Smrity_0501@yahoo.co.in Department of computer Science and Engineering University of RAJIV GANDHI PROUDYOGIKI SHWAVIDYALAYA, BHOPAL Abstract:

More information

Using Machine Learning to Optimize Storage Systems

Using Machine Learning to Optimize Storage Systems Using Machine Learning to Optimize Storage Systems Dr. Kiran Gunnam 1 Outline 1. Overview 2. Building Flash Models using Logistic Regression. 3. Storage Object classification 4. Storage Allocation recommendation

More information

COMP6237 Data Mining Making Recommendations. Jonathon Hare

COMP6237 Data Mining Making Recommendations. Jonathon Hare COMP6237 Data Mining Making Recommendations Jonathon Hare jsh2@ecs.soton.ac.uk Introduction Recommender systems 101 Taxonomy of recommender systems Collaborative Filtering Collecting user preferences as

More information

Recap: Project and Practicum CS276B. Recommendation Systems. Plan for Today. Sample Applications. What do RSs achieve? Given a set of users and items

Recap: Project and Practicum CS276B. Recommendation Systems. Plan for Today. Sample Applications. What do RSs achieve? Given a set of users and items CS276B Web Search and Mining Winter 2005 Lecture 5 (includes slides borrowed from Jon Herlocker) Recap: Project and Practicum We hope you ve been thinking about projects! Revised concrete project plan

More information

Ranking Algorithms For Digital Forensic String Search Hits

Ranking Algorithms For Digital Forensic String Search Hits DIGITAL FORENSIC RESEARCH CONFERENCE Ranking Algorithms For Digital Forensic String Search Hits By Nicole Beebe and Lishu Liu Presented At The Digital Forensic Research Conference DFRWS 2014 USA Denver,

More information

Author(s): Rahul Sami, 2009

Author(s): Rahul Sami, 2009 Author(s): Rahul Sami, 2009 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Noncommercial Share Alike 3.0 License: http://creativecommons.org/licenses/by-nc-sa/3.0/

More information

A Scalable, Accurate Hybrid Recommender System

A Scalable, Accurate Hybrid Recommender System A Scalable, Accurate Hybrid Recommender System Mustansar Ali Ghazanfar and Adam Prugel-Bennett School of Electronics and Computer Science University of Southampton Highfield Campus, SO17 1BJ, United Kingdom

More information

CS371R: Final Exam Dec. 18, 2017

CS371R: Final Exam Dec. 18, 2017 CS371R: Final Exam Dec. 18, 2017 NAME: This exam has 11 problems and 16 pages. Before beginning, be sure your exam is complete. In order to maximize your chance of getting partial credit, show all of your

More information

Alleviating the Sparsity Problem in Collaborative Filtering by using an Adapted Distance and a Graph-based Method

Alleviating the Sparsity Problem in Collaborative Filtering by using an Adapted Distance and a Graph-based Method Alleviating the Sparsity Problem in Collaborative Filtering by using an Adapted Distance and a Graph-based Method Beau Piccart, Jan Struyf, Hendrik Blockeel Abstract Collaborative filtering (CF) is the

More information

A probabilistic model to resolve diversity-accuracy challenge of recommendation systems

A probabilistic model to resolve diversity-accuracy challenge of recommendation systems A probabilistic model to resolve diversity-accuracy challenge of recommendation systems AMIN JAVARI MAHDI JALILI 1 Received: 17 Mar 2013 / Revised: 19 May 2014 / Accepted: 30 Jun 2014 Recommendation systems

More information

Solving the Sparsity Problem in Recommender Systems Using Association Retrieval

Solving the Sparsity Problem in Recommender Systems Using Association Retrieval 1896 JOURNAL OF COMPUTERS, VOL. 6, NO. 9, SEPTEMBER 211 Solving the Sparsity Problem in Recommender Systems Using Association Retrieval YiBo Chen Computer school of Wuhan University, Wuhan, Hubei, China

More information

CS54701: Information Retrieval

CS54701: Information Retrieval CS54701: Information Retrieval Basic Concepts 19 January 2016 Prof. Chris Clifton 1 Text Representation: Process of Indexing Remove Stopword, Stemming, Phrase Extraction etc Document Parser Extract useful

More information

Collaborative Filtering using Euclidean Distance in Recommendation Engine

Collaborative Filtering using Euclidean Distance in Recommendation Engine Indian Journal of Science and Technology, Vol 9(37), DOI: 10.17485/ijst/2016/v9i37/102074, October 2016 ISSN (Print) : 0974-6846 ISSN (Online) : 0974-5645 Collaborative Filtering using Euclidean Distance

More information

An Axiomatic Approach to IR UIUC TREC 2005 Robust Track Experiments

An Axiomatic Approach to IR UIUC TREC 2005 Robust Track Experiments An Axiomatic Approach to IR UIUC TREC 2005 Robust Track Experiments Hui Fang ChengXiang Zhai Department of Computer Science University of Illinois at Urbana-Champaign Abstract In this paper, we report

More information

Variable Selection 6.783, Biomedical Decision Support

Variable Selection 6.783, Biomedical Decision Support 6.783, Biomedical Decision Support (lrosasco@mit.edu) Department of Brain and Cognitive Science- MIT November 2, 2009 About this class Why selecting variables Approaches to variable selection Sparsity-based

More information

Data Mining Lecture 2: Recommender Systems

Data Mining Lecture 2: Recommender Systems Data Mining Lecture 2: Recommender Systems Jo Houghton ECS Southampton February 19, 2019 1 / 32 Recommender Systems - Introduction Making recommendations: Big Money 35% of Amazons income from recommendations

More information

Context based Re-ranking of Web Documents (CReWD)

Context based Re-ranking of Web Documents (CReWD) Context based Re-ranking of Web Documents (CReWD) Arijit Banerjee, Jagadish Venkatraman Graduate Students, Department of Computer Science, Stanford University arijitb@stanford.edu, jagadish@stanford.edu}

More information

INF4820 Algorithms for AI and NLP. Evaluating Classifiers Clustering

INF4820 Algorithms for AI and NLP. Evaluating Classifiers Clustering INF4820 Algorithms for AI and NLP Evaluating Classifiers Clustering Murhaf Fares & Stephan Oepen Language Technology Group (LTG) September 27, 2017 Today 2 Recap Evaluation of classifiers Unsupervised

More information

Comparing State-of-the-Art Collaborative Filtering Systems

Comparing State-of-the-Art Collaborative Filtering Systems Comparing State-of-the-Art Collaborative Filtering Systems Laurent Candillier, Frank Meyer, Marc Boullé France Telecom R&D Lannion, France lcandillier@hotmail.com Abstract. Collaborative filtering aims

More information

Mining Web Data. Lijun Zhang

Mining Web Data. Lijun Zhang Mining Web Data Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Web Crawling and Resource Discovery Search Engine Indexing and Query Processing Ranking Algorithms Recommender Systems

More information

Chapter 8. Evaluating Search Engine

Chapter 8. Evaluating Search Engine Chapter 8 Evaluating Search Engine Evaluation Evaluation is key to building effective and efficient search engines Measurement usually carried out in controlled laboratory experiments Online testing can

More information

Information Retrieval

Information Retrieval Introduction to Information Retrieval SCCS414: Information Storage and Retrieval Christopher Manning and Prabhakar Raghavan Lecture 10: Text Classification; Vector Space Classification (Rocchio) Relevance

More information

Recommender Systems - Introduction. Data Mining Lecture 2: Recommender Systems

Recommender Systems - Introduction. Data Mining Lecture 2: Recommender Systems Recommender Systems - Introduction Making recommendations: Big Money 35% of amazons income from recommendations Netflix recommendation engine worth $ Billion per year And yet, Amazon seems to be able to

More information

Data Mining Techniques

Data Mining Techniques Data Mining Techniques CS 60 - Section - Fall 06 Lecture Jan-Willem van de Meent (credit: Andrew Ng, Alex Smola, Yehuda Koren, Stanford CS6) Recommender Systems The Long Tail (from: https://www.wired.com/00/0/tail/)

More information

Risk Minimization and Language Modeling in Text Retrieval Thesis Summary

Risk Minimization and Language Modeling in Text Retrieval Thesis Summary Risk Minimization and Language Modeling in Text Retrieval Thesis Summary ChengXiang Zhai Language Technologies Institute School of Computer Science Carnegie Mellon University July 21, 2002 Abstract This

More information

second_language research_teaching sla vivian_cook language_department idl

second_language research_teaching sla vivian_cook language_department idl Using Implicit Relevance Feedback in a Web Search Assistant Maria Fasli and Udo Kruschwitz Department of Computer Science, University of Essex, Wivenhoe Park, Colchester, CO4 3SQ, United Kingdom fmfasli

More information

Preference Learning in Recommender Systems

Preference Learning in Recommender Systems Preference Learning in Recommender Systems M. de Gemmis L. Iaquinta P. Lops C. Musto F. Narducci G. Semeraro Department of Computer Science University of Bari, Italy ECML/PKDD workshop on Preference Learning,

More information

Study on Recommendation Systems and their Evaluation Metrics PRESENTATION BY : KALHAN DHAR

Study on Recommendation Systems and their Evaluation Metrics PRESENTATION BY : KALHAN DHAR Study on Recommendation Systems and their Evaluation Metrics PRESENTATION BY : KALHAN DHAR Agenda Recommendation Systems Motivation Research Problem Approach Results References Business Motivation What

More information

CAN DISSIMILAR USERS CONTRIBUTE TO ACCURACY AND DIVERSITY OF PERSONALIZED RECOMMENDATION?

CAN DISSIMILAR USERS CONTRIBUTE TO ACCURACY AND DIVERSITY OF PERSONALIZED RECOMMENDATION? International Journal of Modern Physics C Vol. 21, No. 1 (21) 1217 1227 c World Scientific Publishing Company DOI: 1.1142/S1291831115786 CAN DISSIMILAR USERS CONTRIBUTE TO ACCURACY AND DIVERSITY OF PERSONALIZED

More information

CS249: ADVANCED DATA MINING

CS249: ADVANCED DATA MINING CS249: ADVANCED DATA MINING Recommender Systems II Instructor: Yizhou Sun yzsun@cs.ucla.edu May 31, 2017 Recommender Systems Recommendation via Information Network Analysis Hybrid Collaborative Filtering

More information

Collaborative Filtering using Weighted BiPartite Graph Projection A Recommendation System for Yelp

Collaborative Filtering using Weighted BiPartite Graph Projection A Recommendation System for Yelp Collaborative Filtering using Weighted BiPartite Graph Projection A Recommendation System for Yelp Sumedh Sawant sumedh@stanford.edu Team 38 December 10, 2013 Abstract We implement a personal recommendation

More information

The Pennsylvania State University. The Graduate School. Harold and Inge Marcus Department of Industrial and Manufacturing Engineering

The Pennsylvania State University. The Graduate School. Harold and Inge Marcus Department of Industrial and Manufacturing Engineering The Pennsylvania State University The Graduate School Harold and Inge Marcus Department of Industrial and Manufacturing Engineering STUDY ON BIPARTITE NETWORK IN COLLABORATIVE FILTERING RECOMMENDER SYSTEM

More information

CS425: Algorithms for Web Scale Data

CS425: Algorithms for Web Scale Data CS425: Algorithms for Web Scale Data Most of the slides are from the Mining of Massive Datasets book. These slides have been modified for CS425. The original slides can be accessed at: www.mmds.org J.

More information

10/10/13. Traditional database system. Information Retrieval. Information Retrieval. Information retrieval system? Information Retrieval Issues

10/10/13. Traditional database system. Information Retrieval. Information Retrieval. Information retrieval system? Information Retrieval Issues COS 597A: Principles of Database and Information Systems Information Retrieval Traditional database system Large integrated collection of data Uniform access/modifcation mechanisms Model of data organization

More information

Machine Learning. Supervised Learning. Manfred Huber

Machine Learning. Supervised Learning. Manfred Huber Machine Learning Supervised Learning Manfred Huber 2015 1 Supervised Learning Supervised learning is learning where the training data contains the target output of the learning system. Training data D

More information

Chapter 27 Introduction to Information Retrieval and Web Search

Chapter 27 Introduction to Information Retrieval and Web Search Chapter 27 Introduction to Information Retrieval and Web Search Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 27 Outline Information Retrieval (IR) Concepts Retrieval

More information

Outline. Possible solutions. The basic problem. How? How? Relevance Feedback, Query Expansion, and Inputs to Ranking Beyond Similarity

Outline. Possible solutions. The basic problem. How? How? Relevance Feedback, Query Expansion, and Inputs to Ranking Beyond Similarity Outline Relevance Feedback, Query Expansion, and Inputs to Ranking Beyond Similarity Lecture 10 CS 410/510 Information Retrieval on the Internet Query reformulation Sources of relevance for feedback Using

More information

COMP6237 Data Mining Searching and Ranking

COMP6237 Data Mining Searching and Ranking COMP6237 Data Mining Searching and Ranking Jonathon Hare jsh2@ecs.soton.ac.uk Note: portions of these slides are from those by ChengXiang Cheng Zhai at UIUC https://class.coursera.org/textretrieval-001

More information

Robustness and Accuracy Tradeoffs for Recommender Systems Under Attack

Robustness and Accuracy Tradeoffs for Recommender Systems Under Attack Proceedings of the Twenty-Fifth International Florida Artificial Intelligence Research Society Conference Robustness and Accuracy Tradeoffs for Recommender Systems Under Attack Carlos E. Seminario and

More information

BordaRank: A Ranking Aggregation Based Approach to Collaborative Filtering

BordaRank: A Ranking Aggregation Based Approach to Collaborative Filtering BordaRank: A Ranking Aggregation Based Approach to Collaborative Filtering Yeming TANG Department of Computer Science and Technology Tsinghua University Beijing, China tym13@mails.tsinghua.edu.cn Qiuli

More information

Mining Web Data. Lijun Zhang

Mining Web Data. Lijun Zhang Mining Web Data Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Web Crawling and Resource Discovery Search Engine Indexing and Query Processing Ranking Algorithms Recommender Systems

More information

Slope One Predictors for Online Rating-Based Collaborative Filtering

Slope One Predictors for Online Rating-Based Collaborative Filtering Slope One Predictors for Online Rating-Based Collaborative Filtering Daniel Lemire Anna Maclachlan February 7, 2005 Abstract Rating-based collaborative filtering is the process of predicting how a user

More information

An Efficient Neighbor Searching Scheme of Distributed Collaborative Filtering on P2P Overlay Network 1

An Efficient Neighbor Searching Scheme of Distributed Collaborative Filtering on P2P Overlay Network 1 An Efficient Neighbor Searching Scheme of Distributed Collaborative Filtering on P2P Overlay Network 1 Bo Xie, Peng Han, Fan Yang, Ruimin Shen Department of Computer Science and Engineering, Shanghai Jiao

More information

Graph-based Analysis for E-commerce Recommendation

Graph-based Analysis for E-commerce Recommendation Graph-based Analysis for E-commerce Recommendation Abstract Zan Huang, The University of Arizona Recommender systems automate the process of recommending products and services to consumers based on various

More information

WebSci and Learning to Rank for IR

WebSci and Learning to Rank for IR WebSci and Learning to Rank for IR Ernesto Diaz-Aviles L3S Research Center. Hannover, Germany diaz@l3s.de Ernesto Diaz-Aviles www.l3s.de 1/16 Motivation: Information Explosion Ernesto Diaz-Aviles

More information

LECTURE 12. Web-Technology

LECTURE 12. Web-Technology LECTURE 12 Web-Technology Household issues Course evaluation on Caracal o https://caracal.uu.nl o Between 3-4-2018 and 29-4-2018 Assignment 3 deadline extension No lecture/practice on Friday 30/03/18 2

More information

The Tourism Recommendation of Jingdezhen Based on Unifying User-based and Item-based Collaborative filtering

The Tourism Recommendation of Jingdezhen Based on Unifying User-based and Item-based Collaborative filtering The Tourism Recommendation of Jingdezhen Based on Unifying User-based and Item-based Collaborative filtering Tao Liu 1, a, Mingang Wu 1, b and Donglan Ying 2, c 1 School of Information Engineering, Jingdezhen

More information

Information Retrieval (IR) Introduction to Information Retrieval. Lecture Overview. Why do we need IR? Basics of an IR system.

Information Retrieval (IR) Introduction to Information Retrieval. Lecture Overview. Why do we need IR? Basics of an IR system. Introduction to Information Retrieval Ethan Phelps-Goodman Some slides taken from http://www.cs.utexas.edu/users/mooney/ir-course/ Information Retrieval (IR) The indexing and retrieval of textual documents.

More information

Slides based on those in:

Slides based on those in: Spyros Kontogiannis & Christos Zaroliagis Slides based on those in: http://www.mmds.org A 3.3 B 38.4 C 34.3 D 3.9 E 8.1 F 3.9 1.6 1.6 1.6 1.6 1.6 2 y 0.8 ½+0.2 ⅓ M 1/2 1/2 0 0.8 1/2 0 0 + 0.2 0 1/2 1 [1/N]

More information

A Recursive Prediction Algorithm for Collaborative Filtering Recommender Systems

A Recursive Prediction Algorithm for Collaborative Filtering Recommender Systems A Recursive rediction Algorithm for Collaborative Filtering Recommender Systems ABSTRACT Jiyong Zhang Human Computer Interaction Group, Swiss Federal Institute of Technology (EFL), CH-1015, Lausanne, Switzerland

More information

Information Retrieval. CS630 Representing and Accessing Digital Information. What is a Retrieval Model? Basic IR Processes

Information Retrieval. CS630 Representing and Accessing Digital Information. What is a Retrieval Model? Basic IR Processes CS630 Representing and Accessing Digital Information Information Retrieval: Retrieval Models Information Retrieval Basics Data Structures and Access Indexing and Preprocessing Retrieval Models Thorsten

More information

Recommender Systems - Content, Collaborative, Hybrid

Recommender Systems - Content, Collaborative, Hybrid BOBBY B. LYLE SCHOOL OF ENGINEERING Department of Engineering Management, Information and Systems EMIS 8331 Advanced Data Mining Recommender Systems - Content, Collaborative, Hybrid Scott F Eisenhart 1

More information

A Survey on Various Techniques of Recommendation System in Web Mining

A Survey on Various Techniques of Recommendation System in Web Mining A Survey on Various Techniques of Recommendation System in Web Mining 1 Yagnesh G. patel, 2 Vishal P.Patel 1 Department of computer engineering 1 S.P.C.E, Visnagar, India Abstract - Today internet has

More information

CS 6320 Natural Language Processing

CS 6320 Natural Language Processing CS 6320 Natural Language Processing Information Retrieval Yang Liu Slides modified from Ray Mooney s (http://www.cs.utexas.edu/users/mooney/ir-course/slides/) 1 Introduction of IR System components, basic

More information

CompSci 1: Overview CS0. Collaborative Filtering: A Tutorial. Collaborative Filtering. Everyday Examples of Collaborative Filtering...

CompSci 1: Overview CS0. Collaborative Filtering: A Tutorial. Collaborative Filtering. Everyday Examples of Collaborative Filtering... CompSci 1: Overview CS0 Collaborative Filtering: A Tutorial! Audioscrobbler and last.fm! Collaborative filtering! What is a neighbor?! What is the network? Drawn from tutorial by William W. Cohen Center

More information

Data Mining Techniques

Data Mining Techniques Data Mining Techniques CS 6 - Section - Spring 7 Lecture Jan-Willem van de Meent (credit: Andrew Ng, Alex Smola, Yehuda Koren, Stanford CS6) Project Project Deadlines Feb: Form teams of - people 7 Feb:

More information

Personalized Web Search

Personalized Web Search Personalized Web Search Dhanraj Mavilodan (dhanrajm@stanford.edu), Kapil Jaisinghani (kjaising@stanford.edu), Radhika Bansal (radhika3@stanford.edu) Abstract: With the increase in the diversity of contents

More information