COMP 465: Data Mining Recommender Systems

Similar documents
CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Data Mining Techniques

Recommendation and Advertising. Shannon Quinn (with thanks to J. Leskovec, A. Rajaraman, and J. Ullman of Stanford University)

CS 124/LINGUIST 180 From Languages to Information

CS 5614: (Big) Data Management Systems. B. Aditya Prakash Lecture #16: Recommenda2on Systems

Machine Learning and Data Mining. Collaborative Filtering & Recommender Systems. Kalev Kask

Recommender Systems Collabora2ve Filtering and Matrix Factoriza2on

Thanks to Jure Leskovec, Anand Rajaraman, Jeff Ullman

CS 572: Information Retrieval

Introduction to Data Mining

Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University Infinite data. Filtering data streams

Recommendation Systems

CSE 258 Lecture 8. Web Mining and Recommender Systems. Extensions of latent-factor models, (and more on the Netflix prize)

CSE 158 Lecture 8. Web Mining and Recommender Systems. Extensions of latent-factor models, (and more on the Netflix prize)

Collaborative Filtering Applied to Educational Data Mining

Real-time Recommendations on Spark. Jan Neumann, Sridhar Alla (Comcast Labs) DC Spark Interactive Meetup East May

CptS 570 Machine Learning Project: Netflix Competition. Parisa Rashidi Vikramaditya Jakkula. Team: MLSurvivors. Wednesday, December 12, 2007

BBS654 Data Mining. Pinar Duygulu

Recommender Systems New Approaches with Netflix Dataset

Additive Regression Applied to a Large-Scale Collaborative Filtering Problem

Yelp Recommendation System

General Instructions. Questions

An Empirical Comparison of Collaborative Filtering Approaches on Netflix Data

Use of KNN for the Netflix Prize Ted Hong, Dimitris Tsamis Stanford University

Reddit Recommendation System Daniel Poon, Yu Wu, David (Qifan) Zhang CS229, Stanford University December 11 th, 2011

CS224W Project: Recommendation System Models in Product Rating Predictions

Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University

By Atul S. Kulkarni Graduate Student, University of Minnesota Duluth. Under The Guidance of Dr. Richard Maclin

Performance Comparison of Algorithms for Movie Rating Estimation

Using Social Networks to Improve Movie Rating Predictions

Computational Intelligence Meets the NetFlix Prize

Web Personalisation and Recommender Systems

Singular Value Decomposition, and Application to Recommender Systems

Progress Report: Collaborative Filtering Using Bregman Co-clustering

Non-negative Matrix Factorization for Multimodal Image Retrieval

Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model

Recommender System. What is it? How to build it? Challenges. R package: recommenderlab

CPSC 340: Machine Learning and Data Mining. Recommender Systems Fall 2017

Collaborative Filtering for Netflix

CSE 5243 INTRO. TO DATA MINING

CPSC 340: Machine Learning and Data Mining. Probabilistic Classification Fall 2017

Weighted Alternating Least Squares (WALS) for Movie Recommendations) Drew Hodun SCPD. Abstract

Towards a hybrid approach to Netflix Challenge

Seminar Collaborative Filtering. KDD Cup. Ziawasch Abedjan, Arvid Heise, Felix Naumann

Know your neighbours: Machine Learning on Graphs

Achieving Better Predictions with Collaborative Neighborhood

Sampling PCA, enhancing recovered missing values in large scale matrices. Luis Gabriel De Alba Rivera 80555S

Variational Bayesian PCA versus k-nn on a Very Sparse Reddit Voting Dataset

Handling Ties. Analysis of Ties in Input and Output Data of Rankings

Recommender System Optimization through Collaborative Filtering

Jeff Howbert Introduction to Machine Learning Winter

Data Mining Lecture 2: Recommender Systems

Recommender Systems - Introduction. Data Mining Lecture 2: Recommender Systems

Recommender Systems: User Experience and System Issues

Collaborative Filtering with Temporal Dynamics

Factor in the Neighbors: Scalable and Accurate Collaborative Filtering

Recommender Systems 6CCS3WSN-7CCSMWAL

CPSC 340: Machine Learning and Data Mining

Extension Study on Item-Based P-Tree Collaborative Filtering Algorithm for Netflix Prize

Comparison of Variational Bayes and Gibbs Sampling in Reconstruction of Missing Values with Probabilistic Principal Component Analysis

CS249: ADVANCED DATA MINING

CSE 547: Machine Learning for Big Data Spring Problem Set 2. Please read the homework submission policies.

Feature Selection Using Modified-MCA Based Scoring Metric for Classification

A probabilistic model to resolve diversity-accuracy challenge of recommendation systems

Advances in Collaborative Filtering

THE goal of a recommender system is to make predictions

Assignment 5: Collaborative Filtering

Matrix-Vector Multiplication by MapReduce. From Rajaraman / Ullman- Ch.2 Part 1

Recommendation Algorithms: Collaborative Filtering. CSE 6111 Presentation Advanced Algorithms Fall Presented by: Farzana Yasmeen

Predicting Popular Xbox games based on Search Queries of Users

Part 11: Collaborative Filtering. Francesco Ricci

On hybrid modular recommendation systems for video streaming

Personalize Movie Recommendation System CS 229 Project Final Writeup

Chapter 2 Basic Structure of High-Dimensional Spaces

Lecture on Modeling Tools for Clustering & Regression

COSC6376 Cloud Computing Homework 1 Tutorial

Recommender Systems. Master in Computer Engineering Sapienza University of Rome. Carlos Castillo

PSS718 - Data Mining

Inf2b Learning and Data

Multiple-Choice Questionnaire Group C

Sparse Estimation of Movie Preferences via Constrained Optimization

Orange3 Data Fusion Documentation. Biolab

Dimension Reduction CS534

Using Data Mining to Determine User-Specific Movie Ratings

BBS654 Data Mining. Pinar Duygulu. Slides are adapted from Nazli Ikizler

311 Predictions on Kaggle Austin Lee. Project Description

Cluster Analysis. Prof. Thomas B. Fomby Department of Economics Southern Methodist University Dallas, TX April 2008 April 2010

Chapter 6: DESCRIPTIVE STATISTICS

Transcription:

COMP 465: Data Mining Recommender Systems
Slides adapted from: www.mmds.org (Mining of Massive Datasets)

Evaluating predictions: compare predictions with the known ratings in a withheld test set T.
Root-mean-square error (RMSE):

    RMSE = \sqrt{ \frac{1}{N} \sum_{(x,i) \in T} (\hat{r}_{xi} - r_{xi})^2 },  where N = |T|

\hat{r}_{xi} is the predicted rating and r_{xi} is the actual rating of user x on item i.
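The RMSE defined above is straightforward to compute. A minimal sketch in Python (the function name and sample ratings are illustrative):

```python
import numpy as np

def rmse(predicted, actual):
    """Root-mean-square error over a test set T of (user, item) pairs.

    predicted, actual: parallel sequences of predicted ratings r_hat_xi
    and true ratings r_xi for the N = |T| held-out pairs."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return np.sqrt(np.mean((predicted - actual) ** 2))

# Example: three held-out ratings
print(rmse([3.5, 4.0, 2.0], [4, 4, 3]))  # sqrt((0.25 + 0 + 1)/3), approx. 0.6455
```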

Problems with error measures: a narrow focus on accuracy sometimes misses the point:
- Prediction diversity
- Prediction context
- Order of predictions

In practice, we mostly care about predicting high ratings: RMSE might penalize a method that does well for high ratings and badly for the others. Alternative: precision at top k, the percentage of predictions that fall in the user's top k withheld ratings.

The Netflix Prize
Training data: 100 million ratings; 480,000 users; 17,770 movies; 6 years of data (2000-2005).
Test data: the last few ratings of each user (2.8 million).
Evaluation criterion: RMSE = \sqrt{ \frac{1}{|R|} \sum_{(i,x) \in R} (\hat{r}_{xi} - r_{xi})^2 }.
Netflix's own system scored RMSE = 0.9514. The competition drew 2,700+ teams, with a $1 million prize for a 10% improvement over Netflix's result.
The utility matrix R: 480,000 users by 17,770 movies.
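The precision-at-top-k alternative can be sketched as follows (function and variable names are illustrative; `withheld_ratings` stands for a user's held-out test ratings):

```python
def precision_at_k(predicted_scores, withheld_ratings, k=10):
    """Fraction of the k items ranked highest by predicted score that
    also appear among the user's k highest withheld ratings."""
    top_pred = sorted(predicted_scores, key=predicted_scores.get, reverse=True)[:k]
    top_true = sorted(withheld_ratings, key=withheld_ratings.get, reverse=True)[:k]
    return len(set(top_pred) & set(top_true)) / k

preds = {"A": 4.8, "B": 3.1, "C": 4.2, "D": 2.0}
truth = {"A": 5, "B": 2, "C": 3, "D": 4}
# Top-2 predicted = {A, C}; top-2 withheld = {A, D}; overlap = {A}
print(precision_at_k(preds, truth, k=2))  # 0.5
```

Note that unlike RMSE, this metric ignores how far off the low-rated predictions are, which matches the "we mostly care about high ratings" observation above.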

The matrix R is split into a training data set (known ratings) and a test data set: for pairs (i,x) in the test set, we compare the predicted rating \hat{r}_{xi} with the true rating r_{xi} of user x on item i.

The winner of the Netflix Challenge used multi-scale modeling of the data, combining top-level, regional modeling with a refined, local view:
- Global: overall deviations of users/movies
- Factorization: addressing regional effects
- Collaborative filtering: extract local patterns

Example of global and local effects:
Global: the mean movie rating is 3.7 stars; The Sixth Sense is 0.5 stars above average; Joe rates 0.2 stars below average. Baseline estimation: Joe will rate The Sixth Sense 4 stars.
Local neighborhood (CF/NN): Joe didn't like the related movie Signs. Final estimate: Joe will rate The Sixth Sense 3.8 stars.

Collaborative filtering is the earliest and most popular recommendation method: derive unknown ratings from those of similar movies (the item-item variant).
- Define a similarity measure s_{ij} of items i and j.
- Let N(i;x) be the set of k items most similar to i that were rated by x.
- Estimate the rating as a similarity-weighted average:

    \hat{r}_{xi} = \frac{ \sum_{j \in N(i;x)} s_{ij} r_{xj} }{ \sum_{j \in N(i;x)} s_{ij} }

where s_{ij} is the similarity of items i and j, and r_{xj} is the rating of user x on item j.

In practice we get better estimates if we model deviations from a baseline:

    \hat{r}_{xi} = b_{xi} + \frac{ \sum_{j \in N(i;x)} s_{ij} (r_{xj} - b_{xj}) }{ \sum_{j \in N(i;x)} s_{ij} }

where b_{xi} = \mu + b_x + b_i is the baseline estimate for r_{xi}; \mu is the overall mean rating; b_x = (avg. rating of user x) - \mu is the rating deviation of user x; and b_i = (avg. rating of movie i) - \mu.

Problems/issues:
1) Similarity measures are arbitrary.
2) Pairwise similarities neglect interdependencies among users.
3) Taking a weighted average can be restricting.
Solution: instead of s_{ij}, use weights w_{ij} that we estimate directly from the data.

Performance of various methods (RMSE): Global average: 1.1296; User average: 1.0651; Movie average: 1.0533; Netflix: 0.9514; Basic collaborative filtering: 0.94; CF+Biases+learned weights: 0.91; Grand Prize target: 0.8563.

Goal: make good recommendations. Quantify goodness using RMSE: lower RMSE means better recommendations. We really want to make good recommendations on items the user has not yet seen, but we can't evaluate that directly! So we build a system that works well on known (user, item) ratings, and hope it will also predict the unknown ratings well.
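The baseline-corrected item-item prediction described above can be sketched in Python (a toy sketch: the rating matrix `R`, the precomputed similarity matrix `sim`, and all names are illustrative, not from the original slides):

```python
import numpy as np

def predict_item_item(R, sim, x, i, k=2):
    """Item-item CF with baseline deviations.

    R   : users x items rating matrix, np.nan marks missing ratings
    sim : items x items similarity matrix s_ij (e.g. Pearson), precomputed
    Returns r_hat_xi = b_xi + sum_j s_ij (r_xj - b_xj) / sum_j s_ij
    over N(i;x), the k items most similar to i that user x has rated."""
    mu = np.nanmean(R)                      # overall mean rating
    b_user = np.nanmean(R, axis=1) - mu     # user deviations b_x
    b_item = np.nanmean(R, axis=0) - mu     # item deviations b_i

    rated = [j for j in range(R.shape[1]) if j != i and not np.isnan(R[x, j])]
    neighbors = sorted(rated, key=lambda j: sim[i, j], reverse=True)[:k]

    b_xi = mu + b_user[x] + b_item[i]
    num = sum(sim[i, j] * (R[x, j] - (mu + b_user[x] + b_item[j])) for j in neighbors)
    den = sum(sim[i, j] for j in neighbors)
    return b_xi if den == 0 else b_xi + num / den

R = np.array([[4, np.nan, 5],
              [3, 2, 4]])
sim = np.ones((3, 3))   # trivially uniform similarities, just for the demo
print(round(predict_item_item(R, sim, x=0, i=1), 3))  # 2.5
```

With zero similarities the prediction falls back to the baseline b_xi, which is exactly the behavior the deviation model intends.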

SVD on the Netflix data: R \approx Q \cdot P^T.
For now, let's assume we can approximate the rating matrix R as a product of "thin" matrices Q and P^T. R has missing entries, but let's ignore that for now: we want the reconstruction error to be small on the known ratings, and we don't care about the values reconstructed for the missing ones.

SVD: A = U \Sigma V^T.

[Figure: movies plotted in a two-factor latent space. One axis runs from "serious" (The Color Purple, Amadeus, Braveheart) to "funny" (The Princess Diaries, The Lion King, Dumb and Dumber); the other from female-leaning (Sense and Sensibility) to male-leaning (Lethal Weapon, Ocean's 11, Independence Day).]

How to estimate the missing rating of user x for item i?

    \hat{r}_{xi} = q_i \cdot p_x = \sum_f q_{if} \, p_{xf}

where q_i is row i of Q and p_x is column x of P^T.

In the SVD A = U \Sigma V^T: A is the input data matrix, U holds the left singular vectors, V the right singular vectors, and \Sigma the singular values. In our case A = R, with Q = U \Sigma and P^T = V^T, so again \hat{r}_{xi} = q_i \cdot p_x.
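The factorization above can be reproduced with NumPy on a small, fully observed toy matrix (the values are made up for illustration; real SVD needs no missing entries, as discussed next):

```python
import numpy as np

# Toy fully-observed rating matrix: two "taste blocks" of users/movies.
A = np.array([[5., 4., 1., 1.],
              [4., 5., 1., 2.],
              [1., 1., 5., 4.],
              [2., 1., 4., 5.]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)  # A = U @ diag(s) @ Vt

k = 2                                         # keep 2 latent factors
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # best rank-k approximation in SSE

sse_k = np.sum((A - A_k) ** 2)
# The dropped error equals the sum of squared discarded singular values:
print(np.allclose(sse_k, np.sum(s[k:] ** 2)))  # True
```

This is the sense in which truncated SVD gives the best low-rank reconstruction of a complete matrix.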

SVD gives the minimum reconstruction error (sum of squared errors):

    \min_{U,\Sigma,V} \sum_{ij \in A} (A_{ij} - [U \Sigma V^T]_{ij})^2

Note two things:
- SSE and RMSE are monotonically related: RMSE = \frac{1}{\sqrt{c}} \sqrt{SSE} for c data points. Great news: SVD is minimizing RMSE!
- Complication: the sum in the SVD error term is over all entries (no rating is interpreted as zero rating), but our R has missing entries! SVD isn't defined when entries are missing.

Instead, use specialized methods to find P, Q directly:

    \min_{P,Q} \sum_{(i,x) \in R} (r_{xi} - q_i \cdot p_x)^2

Note: we don't require the columns of P, Q to be orthogonal or of unit length. P, Q map users/movies to a latent space. This was the most popular model among Netflix contestants.

Temporal effects: a sudden rise in the average movie rating (early 2004), due to improvements in the Netflix GUI and a change in the meaning of a rating. Movie age also matters; possible explanations: users prefer new movies without any particular reason, or older movies are just inherently better than newer ones. [Y. Koren, Collaborative filtering with temporal dynamics, KDD '09]
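The objective over only the known ratings is commonly optimized by stochastic gradient descent. A minimal sketch under toy data, with no regularization or bias terms (which the full Netflix-era models add):

```python
import numpy as np

# Known ratings as (user x, item i, r_xi) triples; values are made up.
ratings = [(0, 0, 5.0), (0, 1, 4.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 1.0), (2, 2, 5.0)]
n_users, n_items, n_factors = 3, 3, 2

rng = np.random.default_rng(0)
P = 0.1 * rng.standard_normal((n_users, n_factors))   # p_x = row x of P
Q = 0.1 * rng.standard_normal((n_items, n_factors))   # q_i = row i of Q

lr = 0.05
for epoch in range(1000):
    for x, i, r in ratings:
        qi, px = Q[i].copy(), P[x].copy()
        err = r - qi @ px          # e_xi = r_xi - q_i . p_x
        Q[i] += lr * err * px      # gradient step on q_i
        P[x] += lr * err * qi      # gradient step on p_x

sse = sum((r - Q[i] @ P[x]) ** 2 for x, i, r in ratings)
print(f"SSE on known ratings: {sse:.4f}")
```

The missing entries of R never enter the loop, which is exactly how the "R has missing entries" complication is sidestepped.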

[Figure: RMSE versus number of model parameters (millions, log scale) for successively richer models: CF (no time bias), basic latent factors, CF (time bias), latent factors with biases, + linear time factors, + per-day user biases, + CF.]

Performance of various methods (RMSE): Basic collaborative filtering: 0.94; Collaborative filtering++: 0.91; Latent factors: 0.90; Latent factors + Biases: 0.89; Latent factors + Biases + Time: 0.876. (For reference: Global average: 1.1296; User average: 1.0651; Movie average: 1.0533; Netflix: 0.9514; Grand Prize target: 0.8563.) Still no prize! Getting desperate... try a kitchen-sink approach!

The last 30 days of the competition:
- A June 26th submission triggers the 30-day "last call" period.
- The Ensemble team forms: a group of other teams on the leaderboard joins forces, relying on combining their models, and quickly also reaches a qualifying score over 10%.
- BellKor continues to make small improvements in their scores and realizes they are in direct competition with Ensemble.
- Strategy: both teams carefully monitor the leaderboard. The only sure way to check for improvement is to submit a set of predictions, but this alerts the other team to your latest score.

- Submissions were limited to one a day, and only one final submission could be made in the last 24 hours before the deadline.
- A BellKor team member in Austria notices (by chance) that Ensemble has posted a score slightly better than BellKor's.
- Frantic last 24 hours for both teams: much computer time spent on final optimization, carefully calibrated to end about an hour before the deadline.
- Final submissions: BellKor submits a little early (on purpose), 40 minutes before the deadline; Ensemble submits their final entry 20 minutes later.
...and everyone waits.