Recommender System. What is it? How to build it? Challenges. R package: recommenderlab

Similar documents
Recommendation Algorithms: Collaborative Filtering. CSE 6111 Presentation Advanced Algorithms Fall Presented by: Farzana Yasmeen

Thanks to Jure Leskovec, Anand Rajaraman, Jeff Ullman

Property1 Property2. by Elvir Sabic. Recommender Systems Seminar Prof. Dr. Ulf Brefeld TU Darmstadt, WS 2013/14

Introduction to Data Mining

Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University Infinite data. Filtering data streams

Performance Comparison of Algorithms for Movie Rating Estimation

CS 124/LINGUIST 180 From Languages to Information

CS 124/LINGUIST 180 From Languages to Information

Mining Web Data. Lijun Zhang

Hybrid Recommendation System Using Clustering and Collaborative Filtering

CS 124/LINGUIST 180 From Languages to Information

BBS654 Data Mining. Pinar Duygulu

Data Mining Techniques

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Antonio Fernández Anta

A Recommender System Based on Improvised K- Means Clustering Algorithm

CSE 158 Lecture 8. Web Mining and Recommender Systems. Extensions of latent-factor models, (and more on the Netflix prize)

Mining Web Data. Lijun Zhang

Distributed Itembased Collaborative Filtering with Apache Mahout. Sebastian Schelter twitter.com/sscdotopen. 7.

Collaborative Filtering Applied to Educational Data Mining

Clustering and Dimensionality Reduction. Stony Brook University CSE545, Fall 2017

Content-based Dimensionality Reduction for Recommender Systems

Collaborative Filtering using Euclidean Distance in Recommendation Engine

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CPSC 340: Machine Learning and Data Mining. Recommender Systems Fall 2017

Data Mining Techniques

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CSE 258 Lecture 8. Web Mining and Recommender Systems. Extensions of latent-factor models, (and more on the Netflix prize)

Recommender Systems. Techniques of AI

Recommendation Systems

Matrix-Vector Multiplication by MapReduce. From Rajaraman / Ullman- Ch.2 Part 1

Recommender Systems New Approaches with Netflix Dataset

COMP6237 Data Mining Making Recommendations. Jonathon Hare

Part 11: Collaborative Filtering. Francesco Ricci

Data Mining Lecture 2: Recommender Systems

Recommender Systems - Introduction. Data Mining Lecture 2: Recommender Systems

CONTENTS. 1.1 Recommender System Collaborative filtering 1 2. LITERATURE REVIEW 5 3. NARRATIVE Software Requirements Specification 11

Using Social Networks to Improve Movie Rating Predictions

Web Personalization & Recommender Systems

Improving Results and Performance of Collaborative Filtering-based Recommender Systems using Cuckoo Optimization Algorithm

Recommender System using Collaborative Filtering and Demographic Characteristics of Users

Web Personalization & Recommender Systems

CSE 158. Web Mining and Recommender Systems. Midterm recap

Hotel Recommendation Based on Hybrid Model

Vector Space Models: Theory and Applications

CSE 6242 A / CS 4803 DVA. Feb 12, Dimension Reduction. Guest Lecturer: Jaegul Choo

An Empirical Comparison of Collaborative Filtering Approaches on Netflix Data

Achieving Better Predictions with Collaborative Neighborhood

Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University

Tag based Recommender System using Pareto Principle

Recommender Systems. Collaborative Filtering & Content-Based Recommending

Extension Study on Item-Based P-Tree Collaborative Filtering Algorithm for Netflix Prize

Section 7.12: Similarity. By: Ralucca Gera, NPS

The influence of social filtering in recommender systems

Part 12: Advanced Topics in Collaborative Filtering. Francesco Ricci

Clustering-Based Personalization

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

University of Florida CISE department Gator Engineering. Clustering Part 2

Dimension reduction : PCA and Clustering

COMP 465: Data Mining Recommender Systems

Recommender system techniques applied to Netflix movie data

Author(s): Rahul Sami, 2009

CS435 Introduction to Big Data Spring 2018 Colorado State University. 3/21/2018 Week 10-B Sangmi Lee Pallickara. FAQs. Collaborative filtering

New user profile learning for extremely sparse data sets

Singular Value Decomposition, and Application to Recommender Systems

Recommender Systems - Content, Collaborative, Hybrid

What I Do After Clicking Publish

By Atul S. Kulkarni Graduate Student, University of Minnesota Duluth. Under The Guidance of Dr. Richard Maclin

Neighborhood-Based Collaborative Filtering

Music Recommendation with Implicit Feedback and Side Information

Towards a hybrid approach to Netflix Challenge

STREAMING RANKING BASED RECOMMENDER SYSTEMS

CSE 454 Final Report TasteCliq

LECTURE 12. Web-Technology

Machine Learning using MapReduce

AMAZON.COM RECOMMENDATIONS ITEM-TO-ITEM COLLABORATIVE FILTERING PAPER BY GREG LINDEN, BRENT SMITH, AND JEREMY YORK

Recommender Systems (RSs)

CS 5614: (Big) Data Management Systems. B. Aditya Prakash Lecture #16: Recommenda2on Systems

CptS 570 Machine Learning Project: Netflix Competition. Parisa Rashidi Vikramaditya Jakkula. Team: MLSurvivors. Wednesday, December 12, 2007

ELEC6910Q Analytics and Systems for Social Media and Big Data Applications Lecture 4. Prof. James She

CS249: ADVANCED DATA MINING

Latent Semantic Indexing

Using Data Mining to Determine User-Specific Movie Ratings

Rating Prediction Using Preference Relations Based Matrix Factorization

Part 11: Collaborative Filtering. Francesco Ricci

Collaborative Filtering based on User Trends

CS 229 Final Project - Using machine learning to enhance a collaborative filtering recommendation system for Yelp

Data Analytics Using Hadoop Framework for Effective Recommendation in E-commerce Based on Social Network Knowledge

Real-time Recommendations on Spark. Jan Neumann, Sridhar Alla (Comcast Labs) DC Spark Interactive Meetup East May

Information Retrieval and Web Search Engines

Seminar Collaborative Filtering. KDD Cup. Ziawasch Abedjan, Arvid Heise, Felix Naumann

José Miguel Hernández Lobato Zoubin Ghahramani Computational and Biological Learning Laboratory Cambridge University

Knowledge Discovery and Data Mining 1 (VO) ( )

Recommender Systems. Nivio Ziviani. Junho de Departamento de Ciência da Computação da UFMG

What is Unsupervised Learning?

Community-Based Recommendations: a Solution to the Cold Start Problem

An Item/User Representation for Recommender Systems based on Bloom Filters

Cross-validation. Cross-validation is a resampling method.

Comparison of Recommender System Algorithms focusing on the New-Item and User-Bias Problem

Variational Bayesian PCA versus k-nn on a Very Sparse Reddit Voting Dataset

Transcription:

Recommender System What is it? How to build it? Challenges R package: recommenderlab 1

What is a recommender system Wiki definition: A recommender system or a recommendation system (sometimes replacing system with a synonym such as platform or engine) is a subclass of information filtering system that seeks to predict the rating or preference that a user would give to an item. A system that automatically recommend items to users, which likely to be of interest to the users, by utilizing historical information. Daily recommender systems: Netflix, Amazon, Online Ads, Google news, Twitter, Facebook, Pandora, Pinterest,... 2

Data Representation User i = 1 : m Item j = 1 : n Utility matrix R ij (aka the User-Item Ratings Matrix): user i s preference for item j, which could be binary (click or not click), numerical (movie ratings), or NA. Goal: predict the missing entries in R, which are likely to take high values. So not exactly a matrix completion problem, although MSE or its variants are often used as evaluation criteria. 3

A Naive Recommender System Aggregate the ratings for each item Find the top K highly liked items, and then recommend them to every user Drawback: not personalized. How to tailor the recommendations for a user? Key ideas: users tend to like similar items; items tend to be liked by similar users. 4

How to Build a Recommender System Content-based method Collaborative filtering (CF) method Item-based CF User-based CF Hybrid method Dimension reduction via SVD A key ingredient: similarity measure 5

Content-Based Recommendation Item profile: represent each item by a d-dim feature vector. For example, how to characterize a movie by a feature vector? User profile: represent each user by a d-dim feature vector by aggregating the feature vectors of items this user like. So we embed the m users and n items in a Euclidean space R d. Then we can recommend items that are close to user i to user i. 6

Measuring Similarity Jaccard similarity: useful for binary ratings A B A B, where A, B are two sets. Cosine similarity: useful for numerical ratings u t v u v, where u, v are two vectors Other similarity measures: Pearson correlation 7

Item-Based CF Instead of using feature vectors to measure the similarity between items, we use similarity between their ratings. That is, we use the j-th column of the user-item Utility matrix R as the feature vector for user i, and compute similarity between items using one of the similarity measures. Then recommend item j to users who like items similar to j. 8

User-Based CF Similar to what we have done in Item-Based CF, we use the i-th row of the user-item Utility matrix to compute similarity between users. For user i, find his/her K-nearest users, aggregate their item ratings and then recommend highly liked items to user i. 9

Singular Value Decomposition Approximate R m n U m d V t d n R ij NA by minimizing (R ij u t iv j ) 2 + λ 1 Pen(U) + λ 2 Pen(V ), where u i is the i-th row of matrix U and v j is the j-th row of matrix V. Then we can predict any missing entries in R by the corresponding inner product of u i and v j. I ve also seen systems that use u i s and v i s as the new reduced dimension representation of users/items, and then apply content-based or CF methods. 10

Some Practical Issues Correct bias by normalization Cluster users and items Combine multiple recommender systems Consider content (location, time, device) and interface (mobile, laptop) Serendipity/Diversity versus accuracy Incorporate user feedback 11

Challenges Scalability: large amount of items and users Imbalance: data sparsity Utility matrix: how to construct it based on the problem at hand? Cold-start: how to recommend a new item or make recommendations to a new user, since they do not have historical information? 12

R package: recommenderlab Recommender systems 101 [link] Recommendation system in R [link] Building a movie recommendation engine with R [link] 13