Data Mining Lecture 2: Recommender Systems

Similar documents
Recommender Systems - Introduction. Data Mining Lecture 2: Recommender Systems

COMP6237 Data Mining Making Recommendations. Jonathon Hare

Recommendation Systems

Recommender Systems. Techniques of AI

Thanks to Jure Leskovec, Anand Rajaraman, Jeff Ullman

Introduction to Data Mining

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University Infinite data. Filtering data streams

BBS654 Data Mining. Pinar Duygulu

CS 124/LINGUIST 180 From Languages to Information

CS 124/LINGUIST 180 From Languages to Information

CS 124/LINGUIST 180 From Languages to Information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Recommendation Algorithms: Collaborative Filtering. CSE 6111 Presentation Advanced Algorithms Fall Presented by: Farzana Yasmeen

Recommender Systems 6CCS3WSN-7CCSMWAL

Knowledge Discovery and Data Mining 1 (VO) ( )

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Part 11: Collaborative Filtering. Francesco Ricci

Recommender Systems New Approaches with Netflix Dataset

Distributed Itembased Collaborative Filtering with Apache Mahout. Sebastian Schelter twitter.com/sscdotopen. 7.

Data Mining Techniques

Property1 Property2. by Elvir Sabic. Recommender Systems Seminar Prof. Dr. Ulf Brefeld TU Darmstadt, WS 2013/14

Data Mining Techniques

Recommender System. What is it? How to build it? Challenges. R package: recommenderlab

Web Personalization & Recommender Systems

CS249: ADVANCED DATA MINING

Similarity and recommender systems

Hybrid Recommendation System Using Clustering and Collaborative Filtering

Introduction. Chapter Background Recommender systems Collaborative based filtering

CptS 570 Machine Learning Project: Netflix Competition. Parisa Rashidi Vikramaditya Jakkula. Team: MLSurvivors. Wednesday, December 12, 2007

Using Data Mining to Determine User-Specific Movie Ratings

Performance Comparison of Algorithms for Movie Rating Estimation

Part 12: Advanced Topics in Collaborative Filtering. Francesco Ricci

Web Personalization & Recommender Systems

Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University

Recommender Systems - Content, Collaborative, Hybrid

A probabilistic model to resolve diversity-accuracy challenge of recommendation systems

Recommender Systems. Collaborative Filtering & Content-Based Recommending

Part 11: Collaborative Filtering. Francesco Ricci

CS 5614: (Big) Data Management Systems. B. Aditya Prakash Lecture #16: Recommendation Systems

A Survey on Various Techniques of Recommendation System in Web Mining

Predicting Popular Xbox games based on Search Queries of Users

Data Mining Lecture 8: Decision Trees

Jeff Howbert Introduction to Machine Learning Winter

ECS289: Scalable Machine Learning

By Atul S. Kulkarni Graduate Student, University of Minnesota Duluth. Under The Guidance of Dr. Richard Maclin

Inf2b Learning and Data

Recommender Systems: User Experience and System Issues

Matrix-Vector Multiplication by MapReduce. From Rajaraman / Ullman- Ch.2 Part 1

Introduction to Data Mining and Data Analytics

CPSC 340: Machine Learning and Data Mining. Recommender Systems Fall 2017

Mining Web Data. Lijun Zhang

Diversity in Recommender Systems Week 2: The Problems. Toni Mikkola, Andy Valjakka, Heng Gui, Wilson Poon

Prowess Improvement of Accuracy for Moving Rating Recommendation System

Movie Recommender System - Hybrid Filtering Approach

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS435 Introduction to Big Data Spring 2018 Colorado State University. 3/21/2018 Week 10-B Sangmi Lee Pallickara. FAQs. Collaborative filtering

Recommendation Systems

CS 229 Final Project - Using machine learning to enhance a collaborative filtering recommendation system for Yelp

A Preference-based Recommender System

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Machine Learning using MapReduce

Singular Value Decomposition, and Application to Recommender Systems

COMP 465: Data Mining Recommender Systems

Topic 1 Classification Alternatives

Collaborative Filtering using Euclidean Distance in Recommendation Engine

Multi-domain Predictive AI. Correlated Cross-Occurrence with Apache Mahout and GPUs

Recommendation and Advertising. Shannon Quinn (with thanks to J. Leskovec, A. Rajaraman, and J. Ullman of Stanford University)

CSE 258. Web Mining and Recommender Systems. Advanced Recommender Systems

Movie Recommendation Using OLAP and Multidimensional Data Model

Collaborative Filtering and Recommender Systems. Definitions. .. Spring 2009 CSC 466: Knowledge Discovery from Data Alexander Dekhtyar..

Community-Based Recommendations: a Solution to the Cold Start Problem

Comparative performance of opensource recommender systems

Recap: Project and Practicum CS276B. Recommendation Systems. Plan for Today. Sample Applications. What do RSs achieve? Given a set of users and items

Supervised Random Walks

Mining Web Data. Lijun Zhang

The influence of social filtering in recommender systems

Lec 8: Adaptive Information Retrieval 2

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

A Scalable, Accurate Hybrid Recommender System

ELEC6910Q Analytics and Systems for Social Media and Big Data Applications Lecture 4. Prof. James She

Collaborative Filtering based on User Trends

The Principle and Improvement of the Algorithm of Matrix Factorization Model based on ALS

Yelp Recommendation System

Data Mining Algorithms

A Recommender System Based on Improvised K- Means Clustering Algorithm

CSCI 204 Introduction to Computer Science II

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

International Journal of Advance Engineering and Research Development. A Facebook Profile Based TV Shows and Movies Recommendation System

City, University of London Institutional Repository. This version of the publication may differ from the final published version.

A Genetic Algorithm Approach to Recommender. System Cold Start Problem. Sanjeevan Sivapalan. Bachelor of Science, Ryerson University, 2011.

Assignment 4 (Sol.) Introduction to Data Analytics Prof. Nandan Sudarsanam & Prof. B. Ravindran

CS570: Introduction to Data Mining

A PROPOSED HYBRID BOOK RECOMMENDER SYSTEM

AI Dining Suggestion App. CS 297 Report Bao Pham ( ) Advisor: Dr. Chris Pollett

Predicting user preference for movies using movie lens dataset

Neighborhood-Based Collaborative Filtering

CS 5614: (Big) Data Management Systems. B. Aditya Prakash Lecture #19: Machine Learning 1

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Transcription:

Data Mining Lecture 2: Recommender Systems. Jo Houghton, ECS Southampton, February 19, 2019

Recommender Systems - Introduction
Making recommendations: big money.
- 35% of Amazon's income comes from recommendations
- Netflix's recommendation engine is worth $1 billion per year
When you know very little about a person, suggesting:
- things to buy,
- films to watch,
- books to read,
- people to date,
..is hard. And yet, Amazon seems to be able to recommend stuff I like.

Recommender Systems - Problem Formulation
You have a set of films and a set of users, and the users have rated some of the films.
Users: Alice, Bob, Carol and Dave. Ratings given so far:
- Love Really: 4, 1, 4
- Deadly Weapon: 1, 4, 5
- Fast and Cross: 5, 5, 4
- Star Battles: 1, 5
The remaining user/film cells are blank. How can you predict what should go in the blanks?

Recommender Systems - Algorithms
Recommender systems use one of three different types of algorithm:
Content based filtering: e.g. IMDb, Rotten Tomatoes
- Creates a profile of features for each item, and a profile for each user, with a weighted vector of features for the item.
- Can use an average value of the rated item vectors, or Bayesian classifiers, ANNs, etc. to decide.
Collaborative filtering: e.g. Facebook, Amazon
- Does not rely on understanding the film, the book, or the user.
- Given a large amount of information on user preferences, predicts what users will like based on their similarity to other users, e.g. Amazon's "people who buy x also buy y".
Hybrid recommender systems: e.g. Netflix
- Uses combinations of the different approaches.
We will examine Collaborative Filtering (CF) in this lecture.

Recommender Systems - Content based approach
Can use a vector of features for each film, e.g. $x_1$ = romance, $x_2$ = action:
- Love Really (ratings 4, 1, 4): $x_1 = 1$, $x_2 = 0.1$
- Deadly Weapon (ratings 1, 4, 5): $x_1 = 0.1$, $x_2 = 1$
- Fast and Cross (ratings 5, 5, 4): $x_1 = 0.2$, $x_2 = 0.9$
- Star Fight (ratings 1, 5): $x_1 = 0.1$, $x_2 = 1$
Each film can be described by the vector $X = [1\ x_1\ x_2]$ (the 1 is for the bias term).
Learn a parameter vector $\theta$ for each user, where $\theta^T X$ gives the number of stars that user would give the film, e.g. $\theta = [0\ 5\ 0]$ for someone who really likes romance films.

Recommender Systems - Content based approach
Use linear regression to find the user's parameter vector $\theta$, where $m$ is the number of films rated by that user, $X_i$ is the feature vector of the $i$-th rated film and $y_i$ is the rating the user gave it:
$$\min_{\theta} \frac{1}{m} \sum_{i=1}^{m} \left(\theta^T X_i - y_i\right)^2 \qquad (1)$$
Can also use Bayesian classifiers, MLPs, etc.
Problems?
- Requires hand-coded knowledge of each film
- Not easy to scale up
- A user may not have rated many films
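A minimal sketch of this content based fit using ordinary least squares in numpy, with the film feature vectors from the slide. The ratings vector y is hypothetical (chosen to look like a romance fan), since the table above does not say which user left which film unrated; none of the names below come from the lecture's notebooks.

    import numpy as np

    # Feature vectors X = [1, x1, x2] for the four films on the slide
    # (bias term, romance score, action score).
    X = np.array([
        [1.0, 1.0, 0.1],   # Love Really
        [1.0, 0.1, 1.0],   # Deadly Weapon
        [1.0, 0.2, 0.9],   # Fast and Cross
        [1.0, 0.1, 1.0],   # Star Fight
    ])

    # Hypothetical star ratings from one user who prefers romance films
    # (illustrative values only, not taken from the lecture's table).
    y = np.array([5.0, 1.0, 2.0, 1.0])

    # Least-squares fit of theta, minimising (1/m) * sum_i (theta^T X_i - y_i)^2
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)

    # theta^T X predicts stars for every film, including unrated ones.
    print(theta, X @ theta)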

Recommender Systems - Collaborative Filtering
Collaborative filtering example: Alice likes Dr Who, Star Wars and Star Trek; Bob likes Dr Who and Star Trek. A recommender system would correlate the likes, and suggest that Bob might like Star Wars too. Personal preferences can be correlated.
Task: discover patterns in observed behaviour across a community of users:
- Purchase history
- Item ratings
- Click counts
Then predict new preferences based on those patterns.

Recommender Systems - Collaborative Filtering
Collaborative filtering uses a range of approaches to accomplish this task:
- Neighbourhood based approach
- Model based approach
- Hybrid (neighbourhood and model) based approach
This lecture will cover the neighbourhood based approach.

Collaborative Filtering
Measure user preferences, e.g. film recommendation: users rate films between 0 and 5 stars (films as rows, users as columns; blank cells are unrated).

    Film                  Lisa  Gene  Mike  Jane  Mick  Jill  Toby
    Lady in the Water     2.5   3.0   2.5         3.0   3.0
    Snakes on a Plane     3.5   3.5   3.0   3.5   4.0   4.0   4.5
    Just my Luck          3.0   1.5         3.0   2.0
    Superman Returns      3.5   5.0   3.5   4.0   3.0   5.0   4.0
    The Night Listener    3.0   3.0   4.0   4.5   3.0   3.0
    You, Me and Dupree    2.5   3.5         2.5   2.0   3.5   1.0

The data is sparse: there are missing values.
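A minimal sketch of how this ratings table could be held in code for the examples that follow: a nested Python dict keyed by user and then by film, so a missing rating is simply an absent key. The variable name ratings is mine, not from the lecture's notebooks; the later sketches reuse it.

    ratings = {
        'Lisa': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5, 'Just my Luck': 3.0,
                 'Superman Returns': 3.5, 'The Night Listener': 3.0, 'You, Me and Dupree': 2.5},
        'Gene': {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5, 'Just my Luck': 1.5,
                 'Superman Returns': 5.0, 'The Night Listener': 3.0, 'You, Me and Dupree': 3.5},
        'Mike': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.0,
                 'Superman Returns': 3.5, 'The Night Listener': 4.0},
        'Jane': {'Snakes on a Plane': 3.5, 'Just my Luck': 3.0, 'Superman Returns': 4.0,
                 'The Night Listener': 4.5, 'You, Me and Dupree': 2.5},
        'Mick': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0, 'Just my Luck': 2.0,
                 'Superman Returns': 3.0, 'The Night Listener': 3.0, 'You, Me and Dupree': 2.0},
        'Jill': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0, 'Superman Returns': 5.0,
                 'The Night Listener': 3.0, 'You, Me and Dupree': 3.5},
        'Toby': {'Snakes on a Plane': 4.5, 'Superman Returns': 4.0, 'You, Me and Dupree': 1.0},
    }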

Collaborative Filtering - Sparsity
Sparsity can be exploited to speed up computation.
- Most libraries that do matrix algebra are based on LAPACK, written in Fortran 90.
- Computation is done by calls to the Basic Linear Algebra Subprograms (BLAS).
- This is how the Python numpy library does its linear algebra.

Collaborative Filtering - Sparsity
Compressed Row Storage (CRS) [1]: the matrix is specified by three arrays: val, col_ind and row_ptr.
- val stores the non-zero values
- col_ind stores the column index of each element in val
- row_ptr stores the index of the element in val that starts each row (the last row runs to the end of val)
E.g. what matrix would this give?
val = [1, 2, 9, 8, 2, -1, 4, 5, 2, 7]
col_ind = [1, 2, 4, 3, 4, 1, 2, 4, 2, 3]
row_ptr = [1, 4, 6, 9]
Answer:
$$\begin{pmatrix} 1 & 2 & 0 & 9 \\ 0 & 0 & 8 & 2 \\ -1 & 4 & 0 & 5 \\ 0 & 2 & 7 & 0 \end{pmatrix}$$
[1] Harwell-Boeing sparse matrix format, Duff et al., ACM Trans. Math. Soft., 15 (1989), pp. 1-14.
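For comparison, scipy's CSR format is the same idea with 0-based indices and an explicit end pointer; a small sketch building the slide's example matrix (assuming scipy is available):

    import numpy as np
    from scipy.sparse import csr_matrix

    val = np.array([1, 2, 9, 8, 2, -1, 4, 5, 2, 7])
    col_ind = np.array([1, 2, 4, 3, 4, 1, 2, 4, 2, 3]) - 1   # scipy is 0-based
    row_ptr = np.array([1, 4, 6, 9, 11]) - 1                 # explicit end pointer added

    A = csr_matrix((val, col_ind, row_ptr), shape=(4, 4))
    print(A.toarray())
    # [[ 1  2  0  9]
    #  [ 0  0  8  2]
    #  [-1  4  0  5]
    #  [ 0  2  7  0]]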

Collaborative Filtering - Feature Extraction
Features are stored in a feature vector: a fixed-length list of numbers. The length of this vector is its number of dimensions. Each vector represents a point (and a direction) in feature space, and every vector must have the same dimensionality. A projection of encoded word vectors shows that similar words are close to each other in feature space. We say two things are similar if they have similar feature vectors, i.e. they are close to each other in feature space.

Collaborative Filtering - Distance
There are three ways to measure distance in feature space.
Euclidean distance: [Figure: points p and q in a 2D feature space]
p and q are N-dimensional feature vectors, $p = [p_1, p_2, \ldots, p_N]$, $q = [q_1, q_2, \ldots, q_N]$.
$$\|p - q\| = \sqrt{\sum_{i=1}^{N} (q_i - p_i)^2}$$

Collaborative Filtering - Distance
Manhattan distance: [Figure: points p and q in a 2D feature space]
p and q are N-dimensional feature vectors, $p = [p_1, p_2, \ldots, p_N]$, $q = [q_1, q_2, \ldots, q_N]$.
$$\|p - q\|_1 = \sum_{i=1}^{N} |q_i - p_i|$$

Collaborative Filtering - Distance
Cosine similarity: [Figure: vectors p and q with angle θ between them] Only measures the direction of the vectors, not their magnitude.
p and q are N-dimensional feature vectors, $p = [p_1, p_2, \ldots, p_N]$, $q = [q_1, q_2, \ldots, q_N]$.
$$\cos(\theta) = \frac{p \cdot q}{\|p\|\,\|q\|} = \frac{\sum_{i=1}^{N} p_i q_i}{\sqrt{\sum_{i=1}^{N} p_i^2}\,\sqrt{\sum_{i=1}^{N} q_i^2}}$$
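A minimal sketch of the three measures in numpy (function names are mine):

    import numpy as np

    def euclidean(p, q):
        # square root of the sum of squared differences
        return np.sqrt(np.sum((q - p) ** 2))

    def manhattan(p, q):
        # sum of absolute differences
        return np.sum(np.abs(q - p))

    def cosine_similarity(p, q):
        # dot product divided by the product of the magnitudes
        return np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q))

    p = np.array([1.0, 4.0])
    q = np.array([4.0, 1.0])
    print(euclidean(p, q), manhattan(p, q), cosine_similarity(p, q))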

Collaborative Filtering - User Similarity
We need to define a similarity score, based on the idea that similar users have similar tastes, i.e. like the same movies. It needs to take into account sparsity: not all users have seen all movies. It is typically between 0 and 1, where 1 means the same and 0 means totally different. We can visualise the users in feature space, using two dimensions at a time. (Demo: visualisation of users in film space, ipynb)

Collaborative Filtering - User Similarity
There are many ways to compute similarity. Based on Euclidean distance we could choose:
$$\mathrm{sim}_{L2}(x, y) = \frac{1}{1 + \sqrt{\sum_{i \in I_{xy}} (r_{x,i} - r_{y,i})^2}}$$
where $r_{x,i}$ is the rating from user $x$ for item $i$, and $I_{xy}$ is the set of items rated by both $x$ and $y$.
I.e. when the distance is 0 the similarity is 1, and when the distance is large the similarity tends to 0.
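A sketch of this Euclidean-based similarity computed over the shared items only, reusing the ratings dict defined earlier (the function name is mine):

    import math

    def sim_l2(ratings, x, y):
        # similarity derived from Euclidean distance over the items rated by both users
        shared = set(ratings[x]) & set(ratings[y])
        if not shared:
            return 0.0
        dist = math.sqrt(sum((ratings[x][i] - ratings[y][i]) ** 2 for i in shared))
        return 1.0 / (1.0 + dist)

    print(sim_l2(ratings, 'Lisa', 'Gene'))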

Collaborative Filtering - User Similarity
Alternatively, calculate the correlation between users based on the ratings they share, using Pearson's correlation: a standard measure of dependence between two related variables.
$$\mathrm{sim}_{Pearson}(x, y) = \frac{\sum_{i \in I_{xy}} (r_{x,i} - \bar{r}_x)(r_{y,i} - \bar{r}_y)}{\sqrt{\sum_{i \in I_{xy}} (r_{x,i} - \bar{r}_x)^2}\,\sqrt{\sum_{i \in I_{xy}} (r_{y,i} - \bar{r}_y)^2}}$$
where $\bar{r}_x$ is the average rating user $x$ gave to all items in $I_{xy}$. (Demo: correlation between users, ipynb)
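A corresponding sketch of the Pearson similarity over the shared items, again reusing the ratings dict from earlier (the function name is mine; returning 0 when fewer than two items are shared is one common convention, not something the lecture specifies):

    import math

    def sim_pearson(ratings, x, y):
        # Pearson correlation over the items rated by both users
        shared = sorted(set(ratings[x]) & set(ratings[y]))
        if len(shared) < 2:
            return 0.0
        rx = [ratings[x][i] for i in shared]
        ry = [ratings[y][i] for i in shared]
        mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
        num = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
        den = math.sqrt(sum((a - mean_x) ** 2 for a in rx)) * \
              math.sqrt(sum((b - mean_y) ** 2 for b in ry))
        return num / den if den != 0 else 0.0

    print(sim_pearson(ratings, 'Lisa', 'Gene'))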

Collaborative Filtering - User Similarity
Can also use cosine similarity:
$$\mathrm{sim}_{cos}(x, y) = \frac{\sum_{i \in I_{xy}} r_{x,i}\, r_{y,i}}{\sqrt{\sum_{i \in I_{xy}} r_{x,i}^2}\,\sqrt{\sum_{i \in I_{xy}} r_{y,i}^2}}$$
This is only computed over the items which are rated by both users.

Collaborative Filtering - User Similarity
Users are inconsistent: some users always give out 5s to films they like, whereas some are more picky. For example, look at Lisa and Jill: both rank the films in a similar order, but give different ratings. Pearson correlation corrects for this, but Euclidean similarity doesn't. Data normalisation and mean centering can overcome this. (Demo: data standardisation)

Collaborative Filtering - User Filtering
We now have a set of measures for computing the similarity between users, so we can produce a ranked list of the best matches to a target user:
- Typically we want the top-N users
- We may only want to consider a subset of users, e.g. those who rated a particular item
(Demo: ranking users by similarity, ipynb)
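A small sketch of ranking users by similarity, reusing ratings and sim_pearson from the earlier sketches (names are mine):

    def top_matches(ratings, target, n=3, similarity=sim_pearson):
        # score every other user against the target and keep the n best
        scores = [(similarity(ratings, target, other), other)
                  for other in ratings if other != target]
        scores.sort(reverse=True)
        return scores[:n]

    print(top_matches(ratings, 'Toby'))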

Collaborative Filtering - Recommending
Now we have a list of similar users, how can we recommend items? Predict the rating $r_{u,i}$ of item $i$ by user $u$ as an aggregation of the ratings of item $i$ by the users similar to $u$:
$$r_{u,i} = \mathrm{aggr}_{\hat{u} \in U}(r_{\hat{u},i})$$
where $U$ is the set of top users most similar to $u$ that rated item $i$.
Multiply each score by the similarity of the user, and normalise by the sum of similarities (otherwise items rated more often will dominate):
$$r_{u,i} = \frac{\sum_{\hat{u} \in U} \mathrm{sim}(u, \hat{u})\, r_{\hat{u},i}}{\sum_{\hat{u} \in U} \mathrm{sim}(u, \hat{u})}$$
This is User Based Filtering. (Demo: user based recommendation)
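A sketch of user based recommendation with this similarity-weighted, normalised aggregation, reusing ratings and sim_pearson from above. For simplicity it treats every user with positive similarity as the neighbourhood rather than only the top-N:

    def recommend_user_based(ratings, user, similarity=sim_pearson):
        totals, sim_sums = {}, {}
        for other in ratings:
            if other == user:
                continue
            sim = similarity(ratings, user, other)
            if sim <= 0:
                continue
            for item, rating in ratings[other].items():
                if item in ratings[user]:
                    continue  # only predict for items the user has not rated
                totals[item] = totals.get(item, 0.0) + sim * rating
                sim_sums[item] = sim_sums.get(item, 0.0) + sim
        # similarity-weighted average, normalised by the sum of similarities
        predictions = {item: totals[item] / sim_sums[item] for item in totals}
        return sorted(predictions.items(), key=lambda kv: kv[1], reverse=True)

    print(recommend_user_based(ratings, 'Toby'))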

Collaborative Filtering - User Based Filtering
Can also aggregate by computing the average over the similar users:
$$r_{u,i} = \frac{1}{N} \sum_{\hat{u} \in U} r_{\hat{u},i}$$
Or by subtracting each user's average rating over all the items they scored, to compensate for people who judge generously or meanly:
$$r_{u,i} = \bar{r}_u + \frac{\sum_{\hat{u} \in U} \mathrm{sim}(u, \hat{u})\,(r_{\hat{u},i} - \bar{r}_{\hat{u}})}{\sum_{\hat{u} \in U} \mathrm{sim}(u, \hat{u})}$$
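A sketch of the mean-centred variant for a single item, again reusing ratings and sim_pearson; each user's mean is taken over all the items they have rated, as the slide describes (names are mine):

    def predict_mean_centred(ratings, user, item, similarity=sim_pearson):
        user_mean = sum(ratings[user].values()) / len(ratings[user])
        num, den = 0.0, 0.0
        for other in ratings:
            if other == user or item not in ratings[other]:
                continue
            sim = similarity(ratings, user, other)
            if sim <= 0:
                continue
            other_mean = sum(ratings[other].values()) / len(ratings[other])
            num += sim * (ratings[other][item] - other_mean)
            den += sim
        # fall back to the user's own mean if no similar user rated the item
        return user_mean if den == 0 else user_mean + num / den

    print(predict_mean_centred(ratings, 'Toby', 'The Night Listener'))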

Collaborative Filtering - User Based Filtering
We can also compute the similarity between items, using the same methods. This provides a fuzzy basis for recommending alternative items. There are more structured ways of identifying which products people buy together, using Market Basket Analysis. (Demo: item-item similarity)

Collaborative Filtering - User Based Filtering
Problems?
- Need to compute the similarity against every user: doesn't scale up to millions of users, computationally hard.
- With many items there may be little overlap between users, making the similarity calculation hard.

Collaborative Filtering - Item Based Filtering
The comparisons between items will not change as frequently as the comparisons between users. So: precompute and store the most similar items for each item. To make a recommendation for a user:
- Look at their top rated items
- Aggregate similar items using the precomputed similarities
These similarities will change with new ratings, but will change slowly. (Demo: precomputing item similarity)

Collaborative Filtering - Item Based Filtering
To compute recommendations using this approach, estimate the rating for an unrated item $\hat{i}$ that has a top-N similarity to rated items:
$$r_{u,\hat{i}} = \frac{\sum_{i \in I} \mathrm{sim}(\hat{i}, i)\, r_{u,i}}{\sum_{i \in I} \mathrm{sim}(\hat{i}, i)}$$
where $I$ is the subset of the $N$ items similar to $\hat{i}$ that user $u$ has rated. (Demo: item based recommendation)
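A sketch of the item based approach, reusing ratings and sim_pearson from above: item-item similarities are precomputed by transposing the dict so items play the role of users, then unseen items are scored from the user's own ratings of similar items (all names are mine):

    def transpose(ratings):
        # flip {user: {item: rating}} into {item: {user: rating}}
        by_item = {}
        for user, prefs in ratings.items():
            for item, rating in prefs.items():
                by_item.setdefault(item, {})[user] = rating
        return by_item

    def item_similarities(ratings, similarity=sim_pearson):
        # precompute similarity between every pair of items
        by_item = transpose(ratings)
        items = list(by_item)
        return {a: {b: similarity(by_item, a, b) for b in items if b != a}
                for a in items}

    def recommend_item_based(ratings, sims, user):
        scores, sim_sums = {}, {}
        for item, rating in ratings[user].items():
            for other_item, sim in sims[item].items():
                if other_item in ratings[user] or sim <= 0:
                    continue
                scores[other_item] = scores.get(other_item, 0.0) + sim * rating
                sim_sums[other_item] = sim_sums.get(other_item, 0.0) + sim
        return sorted(((scores[i] / sim_sums[i], i) for i in scores), reverse=True)

    sims = item_similarities(ratings)
    print(recommend_item_based(ratings, sims, 'Toby'))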

Collaborative Filtering - Comparing Item and User based Filtering
User Based Filtering:
- Easier to implement
- No maintenance of comparisons required
- Deals well with datasets that frequently change
- Deals well with small, dense datasets
Item Based Filtering:
- Maintenance of the comparison data is necessary
- Deals well with small, dense datasets
- Also deals well with larger, sparse datasets
- Deals well with frequently changing users

Collaborative Filtering - Problems
Problems? The cold start problem: collaborative filtering will not work for a new user or a new item.

Collaborative Filtering - Solutions
For new items: a hybrid approach.
- Use content based features to find similar items
- Bootstrap ratings for the new item by averaging the ratings users gave to similar items

Collaborative Filtering - Solutions
For new users: harder.
- Bootstrap a user profile from cookies
- Ask new users questions

Collaborative Filtering - Summary
Recommender systems are worth millions.
Collaborative Filtering:
- Uses people's behaviour to gather information
- Doesn't need content based features
User based neighbourhood approach:
- Computes similarities between users
- Predicts ratings for unseen items using the ratings of similar users
Item based neighbourhood approach:
- Precomputes similarities between items
- Predicts unseen ratings using the user's ratings of similar items