Data Mining Techniques

Similar documents
Data Mining Techniques

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

COMP 465: Data Mining Recommender Systems

CS 124/LINGUIST 180 From Languages to Information

Recommender Systems: Collaborative Filtering and Matrix Factorization

Machine Learning and Data Mining. Collaborative Filtering & Recommender Systems. Kalev Kask

Recommendation Systems

Recommendation and Advertising. Shannon Quinn (with thanks to J. Leskovec, A. Rajaraman, and J. Ullman of Stanford University)

CS 5614: (Big) Data Management Systems. B. Aditya Prakash Lecture #16: Recommendation Systems

CS 572: Information Retrieval

Performance Comparison of Algorithms for Movie Rating Estimation

CSE 158 Lecture 8. Web Mining and Recommender Systems. Extensions of latent-factor models, (and more on the Netflix prize)

CSE 258 Lecture 8. Web Mining and Recommender Systems. Extensions of latent-factor models, (and more on the Netflix prize)

Real-time Recommendations on Spark. Jan Neumann, Sridhar Alla (Comcast Labs) DC Spark Interactive Meetup East May

Collaborative Filtering Applied to Educational Data Mining

Using Social Networks to Improve Movie Rating Predictions

Recommender Systems New Approaches with Netflix Dataset

Reddit Recommendation System Daniel Poon, Yu Wu, David (Qifan) Zhang CS229, Stanford University December 11th, 2011

Web Personalisation and Recommender Systems

An Empirical Comparison of Collaborative Filtering Approaches on Netflix Data

Recommender System. What is it? How to build it? Challenges. R package: recommenderlab

Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University Infinite data. Filtering data streams

Recommendation Algorithms: Collaborative Filtering. CSE 6111 Presentation Advanced Algorithms Fall Presented by: Farzana Yasmeen

Introduction to Data Mining

CS224W Project: Recommendation System Models in Product Rating Predictions

Thanks to Jure Leskovec, Anand Rajaraman, Jeff Ullman

Use of KNN for the Netflix Prize Ted Hong, Dimitris Tsamis Stanford University

Data Mining Lecture 2: Recommender Systems

Recommender Systems - Introduction. Data Mining Lecture 2: Recommender Systems

THE goal of a recommender system is to make predictions

CPSC 340: Machine Learning and Data Mining. Recommender Systems Fall 2017

Deep Learning for Recommender Systems

CptS 570 Machine Learning Project: Netflix Competition. Parisa Rashidi Vikramaditya Jakkula. Team: MLSurvivors. Wednesday, December 12, 2007

Factor in the Neighbors: Scalable and Accurate Collaborative Filtering

Non-negative Matrix Factorization for Multimodal Image Retrieval

Part 11: Collaborative Filtering. Francesco Ricci

Recommender System Optimization through Collaborative Filtering

Extension Study on Item-Based P-Tree Collaborative Filtering Algorithm for Netflix Prize

Rating Prediction Using Preference Relations Based Matrix Factorization

Recommender Systems. Techniques of AI

ECS289: Scalable Machine Learning

Sparse Estimation of Movie Preferences via Constrained Optimization

HMC CS 158, Fall 2017 Problem Set 3 Programming: Regularized Polynomial Regression

Advances in Collaborative Filtering

Variational Bayesian PCA versus k-nn on a Very Sparse Reddit Voting Dataset

Music Recommendation with Implicit Feedback and Side Information

Clustering-Based Personalization

Collaborative Filtering for Netflix

Additive Regression Applied to a Large-Scale Collaborative Filtering Problem

CS249: ADVANCED DATA MINING

Parallel learning of content recommendations using map-reduce

Hybrid Recommendation Models for Binary User Preference Prediction Problem

BordaRank: A Ranking Aggregation Based Approach to Collaborative Filtering

Scalable Network Analysis

CS294-1 Assignment 2 Report

BBS654 Data Mining. Pinar Duygulu

Collaborative Filtering using Weighted BiPartite Graph Projection A Recommendation System for Yelp

Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model

General Instructions. Questions

Yelp Recommendation System

CSE 258. Web Mining and Recommender Systems. Advanced Recommender Systems

Performance of Recommender Algorithms on Top-N Recommendation Tasks

arxiv: v4 [cs.ir] 28 Jul 2016

COMP6237 Data Mining Making Recommendations. Jonathon Hare

Recommender Systems. Master in Computer Engineering Sapienza University of Rome. Carlos Castillo

By Atul S. Kulkarni Graduate Student, University of Minnesota Duluth. Under The Guidance of Dr. Richard Maclin

Recommender Systems (RSs)

Towards a hybrid approach to Netflix Challenge

Singular Value Decomposition, and Application to Recommender Systems

CS570: Introduction to Data Mining

Seminar Collaborative Filtering. KDD Cup. Ziawasch Abedjan, Arvid Heise, Felix Naumann

Predicting User Ratings Using Status Models on Amazon.com

Neighborhood-Based Collaborative Filtering

DS Machine Learning and Data Mining I. Alina Oprea Associate Professor, CCIS Northeastern University

Know your neighbours: Machine Learning on Graphs

CS535 Big Data Fall 2017 Colorado State University 10/10/2017 Sangmi Lee Pallickara Week 8- A.

Machine Learning Methods for Recommender Systems

Recommender Systems - Content, Collaborative, Hybrid

Recent Advances in Recommender Systems and Future Directions

10-701/15-781, Fall 2006, Final

CSE 547: Machine Learning for Big Data Spring Problem Set 2. Please read the homework submission policies.

Introduction. Chapter Background Recommender systems Collaborative based filtering

CS 229 Final Project - Using machine learning to enhance a collaborative filtering recommendation system for Yelp

STREAMING RANKING BASED RECOMMENDER SYSTEMS

Recommender Systems 6CCS3WSN-7CCSMWAL

Introduction to Data Science Lecture 8 Unsupervised Learning. CS 194 Fall 2015 John Canny

TriRank: Review-aware Explainable Recommendation by Modeling Aspects

Transcription:

Data Mining Techniques CS 6220 - Fall 2016 Lecture. Jan-Willem van de Meent (credit: Andrew Ng, Alex Smola, Yehuda Koren, Stanford CS246)

Recommender Systems

The Long Tail (from: https://www.wired.com/00/0/tail/)

Problem Setting Task: Predict user preferences for unseen items

Content-based Filtering [Figure: movies arranged on two axes, serious vs. escapist and geared towards females vs. males: The Color Purple, Amadeus, Braveheart, Sense and Sensibility, Ocean's 11, Lethal Weapon, Dave, The Princess Diaries, The Lion King, Independence Day, Gus, Dumb and Dumber] Idea: Predict rating using item features on a per-user basis

Content-based Filtering [Figure: movies arranged on serious vs. escapist and female vs. male axes] Idea: Predict rating using user features on a per-item basis

Collaborative Filtering [Figure: ratings matrix; Joe's ratings matched against those of similar users] Idea: Predict rating based on similarity to other users

Problem Setting Task: Predict user preferences for unseen items. Content-based filtering: model user/item features. Collaborative filtering: implicit similarity of users / items

Recommender Systems Movie recommendation (Netflix) Related product recommendation (Amazon) Web page ranking (Google) Social recommendation (Facebook) News content recommendation (Yahoo) Priority inbox & spam filtering (Google) Online dating (OK Cupid) Computational Advertising (Everyone)

Challenges Scalability: millions of objects, 100s of millions of users. Cold start: changing user base, changing inventory. Imbalanced dataset: user activity / item reviews are power-law distributed; ratings are not missing at random

Running Example: Netflix Data [Table: training data and test data as (user, movie, date, score) tuples; test scores are withheld] Released as part of the $1M competition by Netflix in 2006. Prize awarded to BellKor's Pragmatic Chaos in 2009

Running Yardstick: RMSE $\mathrm{rmse}(S) = \sqrt{\tfrac{1}{|S|}\sum_{(i,u)\in S}(\hat{r}_{ui} - r_{ui})^2}$ (doesn't tell you how to actually do recommendation)
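
The yardstick above is a one-liner in code. A minimal sketch (the function name and the pair representation are my own choices):

```python
import math

def rmse(pairs):
    """Root mean squared error over a set S of (r_hat, r) prediction/rating pairs."""
    return math.sqrt(sum((r_hat - r) ** 2 for r_hat, r in pairs) / len(pairs))
```

An off-by-one prediction on every item gives an RMSE of exactly 1.0, which is a handy sanity check.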

Ratings aren't everything [Figures: Netflix then, Netflix now]

Content-based Filtering

Item-based Features

Per-user Regression Learn a set of regression coefficients for each user: $w_u = \operatorname{argmin}_w \lVert r_u - Xw \rVert^2$
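
The per-user fit is an ordinary least-squares problem on the item-feature matrix. A sketch via the normal equations (function name is mine; a tiny L2 term is added only to keep the solve well-conditioned):

```python
import numpy as np

def fit_user_weights(X, r_u, lam=1e-6):
    """Fit one user's coefficients w_u = argmin_w ||r_u - X w||^2.
    X: one row of item features per item the user has rated.
    r_u: that user's observed ratings. lam: small ridge term for stability."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ r_u)
```

Each user gets their own w_u, so the model scales linearly in the number of users.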

Bias [Figure: example item, Moonrise Kingdom] Problem: Some movies are universally loved / hated; some users are more picky than others. Solution: Introduce a per-movie and per-user bias
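
The resulting baseline predictor is b_ui = μ + b_u + b_i. A minimal sketch (names and the dict representation are my own; unseen users or items fall back to the global mean):

```python
def baseline(mu, b_user, b_item, u, i):
    """Baseline estimate b_ui = mu + b_u + b_i.
    mu: global mean rating; b_user, b_item: dicts of learned biases.
    Unknown users/items contribute a bias of 0 (cold start)."""
    return mu + b_user.get(u, 0.0) + b_item.get(i, 0.0)
```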

Temporal Effects

Changes in user behavior Netflix changed rating labels (2004)

Movies get better with time?

Temporal Effects Solution: Model temporal effects in the biases, not the weights

Neighborhood Methods

Neighborhood Based Methods [Figure: ratings matrix for Joe and other users] Users and items form a bipartite graph (edges are ratings)

Neighborhood Based Methods (user, user) similarity: predict rating based on the average from the k-nearest users; good if the item base is smaller than the user base; good if the item base changes rapidly. (item, item) similarity: predict rating based on the average from the k-nearest items; good if the user base is small; good if the user base changes rapidly

Parzen-Window Style CF $\hat{r}_{ui} = b_{ui} + \frac{\sum_{j \in s_k(i,u)} s_{ij}(r_{uj} - b_{uj})}{\sum_{j \in s_k(i,u)} s_{ij}}$ with $b_{ui} = \mu + b_u + b_i$. Define a similarity $s_{ij}$ between items. Find the set $s_k(i,u)$ of k-nearest neighbors to i that were rated by user u. Predict the rating using a weighted average over that set. How should we define $s_{ij}$?
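
The weighted average above, given a precomputed neighbor set, can be sketched as follows (function name and the triple representation are my own):

```python
def predict_rating(b_ui, neighbors):
    """Neighborhood prediction: b_ui + sum_j s_ij (r_uj - b_uj) / sum_j s_ij.
    neighbors: (s_ij, r_uj, b_uj) triples for the k nearest items j rated by u."""
    den = sum(s for s, _, _ in neighbors)
    if den == 0:
        return b_ui  # no usable neighbors: fall back to the baseline
    num = sum(s * (r - b) for s, r, b in neighbors)
    return b_ui + num / den
```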

Pearson Correlation Coefficient User ratings for item i and item j: $s_{ij} = \frac{\mathrm{Cov}[r_{ui}, r_{uj}]}{\mathrm{Std}[r_{ui}]\,\mathrm{Std}[r_{uj}]}$

(item, item) similarity Empirical estimate of the Pearson correlation coefficient: $\hat{\rho}_{ij} = \frac{\sum_{u \in U(i,j)} (r_{ui} - b_{ui})(r_{uj} - b_{uj})}{\sqrt{\sum_{u \in U(i,j)} (r_{ui} - b_{ui})^2 \sum_{u \in U(i,j)} (r_{uj} - b_{uj})^2}}$ Regularize towards 0 for small support: $s_{ij} = \frac{|U(i,j)|}{|U(i,j)| + \lambda}\,\hat{\rho}_{ij}$ Regularize towards the baseline for a small neighborhood: $\hat{r}_{ui} = b_{ui} + \frac{\sum_{j \in s_k(i,u)} s_{ij}(r_{uj} - b_{uj})}{\lambda + \sum_{j \in s_k(i,u)} s_{ij}}$
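
The shrunk empirical correlation can be sketched directly from baseline residuals (my own naming; residuals are passed as user-to-value dicts, and lam plays the role of the shrinkage constant):

```python
def shrunk_similarity(res_i, res_j, lam=100.0):
    """Empirical Pearson correlation between two items on baseline residuals
    r_ui - b_ui, shrunk towards 0 when the common support U(i, j) is small.
    res_i, res_j: dicts mapping user -> residual for each item."""
    common = res_i.keys() & res_j.keys()
    if not common:
        return 0.0
    num = sum(res_i[u] * res_j[u] for u in common)
    den = (sum(res_i[u] ** 2 for u in common) *
           sum(res_j[u] ** 2 for u in common)) ** 0.5
    if den == 0:
        return 0.0
    return len(common) / (len(common) + lam) * (num / den)
```

With only a handful of co-raters, the |U(i,j)| / (|U(i,j)| + λ) factor keeps a spuriously perfect correlation from dominating the neighbor set.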

Similarity for binary labels Pearson correlation is not meaningful for binary labels (e.g. views, purchases, clicks). Jaccard similarity: $s_{ij} = \frac{m_{ij}}{m_i + m_j - m_{ij}}$ Observed / expected ratio: $s_{ij} = \frac{\text{observed}}{\text{expected}} = \frac{m_{ij}}{m_i m_j / m}$ where $m_i$ = users acting on i, $m_{ij}$ = users acting on both i and j, $m$ = total number of users
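
Both binary-label similarities are simple set computations. A sketch (function names are mine; items are represented by the sets of users who acted on them):

```python
def jaccard(users_i, users_j):
    """Jaccard similarity m_ij / (m_i + m_j - m_ij) between two items' user sets."""
    m_ij = len(users_i & users_j)
    union = len(users_i) + len(users_j) - m_ij
    return m_ij / union if union else 0.0

def observed_expected(users_i, users_j, m):
    """Observed co-occurrence count m_ij over the m_i * m_j / m co-occurrences
    expected if the two items were acted on independently by the m total users."""
    expected = len(users_i) * len(users_j) / m
    return len(users_i & users_j) / expected if expected else 0.0
```

A ratio above 1 in the second measure means the items co-occur more often than independence would predict.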

Matrix Factorization Methods

Matrix Factorization [Figure: example item, Moonrise Kingdom] Idea: pose as a (biased) matrix factorization problem

Matrix Factorization [Figure: the users × items rating matrix approximated as the product of a user-factor matrix and an item-factor matrix: a low-rank SVD approximation]
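
For a fully observed matrix, the low-rank picture is exactly a truncated SVD (a sketch; the function name is mine, and real rating matrices are mostly missing, which the following slides address):

```python
import numpy as np

def rank_k_approx(R, k):
    """Best rank-k approximation of a fully observed matrix R in the
    Frobenius norm, via truncated SVD."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]
```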

Prediction [Figure: a missing rating is predicted from the dot product of the corresponding user-factor and item-factor rows of the low-rank approximation]

SVD with missing values [Figure: factorizing the partially observed rating matrix] Pose as a regression problem. Regularize using the Frobenius norm

Alternating Least Squares [Figure: factorizing the rating matrix] (regress $w_u$ given X)

Alternating Least Squares (regress $w_u$ given X) L2: closed-form solution $w = (X^\top X + \lambda I)^{-1} X^\top y$ Remember ridge regression?

Alternating Least Squares [Figure: factorizing the rating matrix] (regress $x_i$ given W) (regress $w_u$ given X)
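
The two alternating regressions can be sketched as follows (a minimal illustration under my own naming; each inner step is the ridge closed form from the slide, applied only to the observed entries):

```python
import numpy as np

def als(R, M, k=2, lam=0.1, iters=20, seed=0):
    """Alternating least squares for R ~ W X^T on observed entries only
    (M[u, i] = 1 where r_ui is known)."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    W = rng.normal(scale=0.1, size=(n_users, k))  # user factors
    X = rng.normal(scale=0.1, size=(n_items, k))  # item factors
    reg = lam * np.eye(k)
    for _ in range(iters):
        for u in range(n_users):                  # regress w_u given X
            obs = M[u] > 0
            A, y = X[obs], R[u, obs]
            W[u] = np.linalg.solve(A.T @ A + reg, A.T @ y)
        for i in range(n_items):                  # regress x_i given W
            obs = M[:, i] > 0
            A, y = W[obs], R[obs, i]
            X[i] = np.linalg.solve(A.T @ A + reg, A.T @ y)
    return W, X
```

Each half-step solves a convex problem exactly, so the overall objective decreases monotonically even though the joint problem is non-convex.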

Stochastic Gradient Descent [Figure: factorizing the rating matrix] No need for locking: multicore updates asynchronously (Recht, Ré, Wright, 2011 - Hogwild!)
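
A sequential sketch of those updates (my own naming; Hogwild! simply runs this same inner loop from many threads without locks, betting that sparse updates rarely collide):

```python
import numpy as np

def sgd_mf(ratings, n_users, n_items, k=2, lr=0.05, lam=0.0, epochs=2000, seed=0):
    """SGD on the regularized factorization objective: for each observed
    (u, i, r), step on (r - w_u . x_i)^2 + lam (|w_u|^2 + |x_i|^2)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(n_users, k))  # user factors
    X = rng.normal(scale=0.1, size=(n_items, k))  # item factors
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - W[u] @ X[i]
            w_u = W[u].copy()  # use the pre-update value in both gradient steps
            W[u] += lr * (err * X[i] - lam * W[u])
            X[i] += lr * (err * w_u - lam * X[i])
    return W, X
```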

Netflix Prize

Netflix Prize Training data: 100 million ratings, 480,000 users, 17,770 movies; 6 years of data: 2000-2005. Test data: last few ratings of each user (2.8 million). Evaluation criterion: Root Mean Square Error (RMSE). Competition: 2,700+ teams. Netflix's system RMSE: 0.9514. $1 million prize for a 10% improvement over Netflix

Improvements [Figure: factor models, RMSE vs. millions of parameters, for NMF, BiasSVD, SVD++ and SVD variants] Add biases: do SGD, but also learn the biases μ, b_u and b_i

Improvements [Figure: RMSE vs. millions of parameters] Who rated what: account for the fact that ratings are not missing at random

Improvements [Figure: RMSE vs. millions of parameters] Temporal effects: account for drift in the user and item biases

Improvements [Figure: RMSE vs. millions of parameters] Still pretty far from the 0.8563 grand prize target

Winning Solution from BellKor

Last 30 days A June 26th submission triggers the 30-day last call

BellKor fends off competitors by a hair
