Seminar Collaborative Filtering. KDD Cup. Ziawasch Abedjan, Arvid Heise, Felix Naumann

Similar documents
Property1 Property2. by Elvir Sabic. Recommender Systems Seminar Prof. Dr. Ulf Brefeld TU Darmstadt, WS 2013/14

Additive Regression Applied to a Large-Scale Collaborative Filtering Problem

Recommender Systems New Approaches with Netflix Dataset

Large-Scale Duplicate Detection

Extension Study on Item-Based P-Tree Collaborative Filtering Algorithm for Netflix Prize

Music Recommendation with Implicit Feedback and Side Information

Collaborative Filtering Applied to Educational Data Mining

Performance Comparison of Algorithms for Movie Rating Estimation

Recommender Systems - Content, Collaborative, Hybrid

Music Recommendation Engine

Assignment 5: Collaborative Filtering

Progress Report: Collaborative Filtering Using Bregman Co-clustering

An Empirical Comparison of Collaborative Filtering Approaches on Netflix Data

Music Recommendation at Spotify

Recommender Systems: User Experience and System Issues

Data Mining Applications of Singular Value Decomposition

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CSE 158 Lecture 8. Web Mining and Recommender Systems. Extensions of latent-factor models, (and more on the Netflix prize)

Data Mining Techniques

CSE 258 Lecture 8. Web Mining and Recommender Systems. Extensions of latent-factor models, (and more on the Netflix prize)

A Scalable, Accurate Hybrid Recommender System

José Miguel Hernández Lobato Zoubin Ghahramani Computational and Biological Learning Laboratory Cambridge University

Collaborative Filtering based on User Trends

Robustness and Accuracy Tradeoffs for Recommender Systems Under Attack

Collaborative Filtering via Euclidean Embedding

Using Data Mining to Determine User-Specific Movie Ratings

Performance of Recommender Algorithms on Top-N Recommendation Tasks

Recommendation with Differential Context Weighting

CS224W Project: Recommendation System Models in Product Rating Predictions

Survey on Collaborative Filtering Technique in Recommendation System

COMP 465: Data Mining Recommender Systems

Recommendation Algorithms: Collaborative Filtering. CSE 6111 Presentation Advanced Algorithms Fall Presented by: Farzana Yasmeen

A PROPOSED HYBRID BOOK RECOMMENDER SYSTEM

Overview. Data-mining. Commercial & Scientific Applications. Ongoing Research Activities. From Research to Technology Transfer

Comparison of Recommender System Algorithms focusing on the New-Item and User-Bias Problem

Recommender System. What is it? How to build it? Challenges. R package: recommenderlab

Data Mining Techniques

Collaborative filtering models for recommendations systems

Content-based Dimensionality Reduction for Recommender Systems

New user profile learning for extremely sparse data sets

Proposing a New Metric for Collaborative Filtering

Use of KNN for the Netflix Prize Ted Hong, Dimitris Tsamis Stanford University

RECOMMENDATION SYSTEM USING COLLABORATIVE FILTERING

Hybrid algorithms for recommending new items

Predicting user preference for movies using movie lens dataset

Recommender Systems: User Experience and System Issues. About me. Scope of Recommenders. A Quick Introduction. Wide Range of Algorithms

Recommender Systems. Techniques of AI

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CptS 570 Machine Learning Project: Netflix Competition. Parisa Rashidi Vikramaditya Jakkula. Team: MLSurvivors. Wednesday, December 12, 2007

Clustering and Recommending Services based on ClubCF approach for Big Data Application

Alleviating the Sparsity Problem in Collaborative Filtering by using an Adapted Distance and a Graph-based Method

ECS289: Scalable Machine Learning

Towards Time-Aware Semantic enriched Recommender Systems for movies

Neural Content-Collaborative Filtering for News Recommendation

KNOW At The Social Book Search Lab 2016 Suggestion Track

Module: CLUTO Toolkit. Draft: 10/21/2010

RAPIDMINER FREE SOFTWARE FOR DATA MINING, ANALYTICS AND BUSINESS INTELLIGENCE

Summary. RapidMiner Project 12/13/2011 RAPIDMINER FREE SOFTWARE FOR DATA MINING, ANALYTICS AND BUSINESS INTELLIGENCE

CS249: ADVANCED DATA MINING

Using Existing Numerical Libraries on Spark

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

GENETIC ALGORITHM BASED COLLABORATIVE FILTERING MODEL FOR PERSONALIZED RECOMMENDER SYSTEM

Feature Engineering in User s Music Preference Prediction

BordaRank: A Ranking Aggregation Based Approach to Collaborative Filtering

Automatically Building Research Reading Lists

Improving Top-N Recommendation with Heterogeneous Loss

Recommender System Optimization through Collaborative Filtering

Matrix Decomposition for Recommendation System

Application of Dimensionality Reduction in Recommender System -- A Case Study

Recommender Systems: Practical Aspects, Case Studies. Radek Pelánek

Distributed Itembased Collaborative Filtering with Apache Mahout. Sebastian Schelter twitter.com/sscdotopen. 7.

A Recommender System Based on Improvised K- Means Clustering Algorithm

How to Get Endorsements? Predicting Facebook Likes Using Post Content and User Engagement

A Random Walk Method for Alleviating the Sparsity Problem in Collaborative Filtering

Efficient Top-N Recommendation for Very Large Scale Binary Rated Datasets

CSE 701: LARGE-SCALE GRAPH MINING. A. Erdem Sariyuce

Review on Techniques of Collaborative Tagging

A Novel Non-Negative Matrix Factorization Method for Recommender Systems

Collaborative Filtering Based on Iterative Principal Component Analysis. Dohyun Kim and Bong-Jin Yum*

By Atul S. Kulkarni Graduate Student, University of Minnesota Duluth. Under The Guidance of Dr. Richard Maclin

Hybrid Recommendation Models for Binary User Preference Prediction Problem

COMP6237 Data Mining Making Recommendations. Jonathon Hare

CS 229 Final Project - Using machine learning to enhance a collaborative filtering recommendation system for Yelp

Hybrid Weighting Schemes For Collaborative Filtering

Matching Users and Items Across Domains to Improve the Recommendation Quality

Data Mining Lecture 2: Recommender Systems

Experiences from Implementing Collaborative Filtering in a Web 2.0 Application

Recommender Systems - Introduction. Data Mining Lecture 2: Recommender Systems

ExcUseMe: Asking Users to Help in Item Cold-Start Recommendations

Scope of Recommenders. Recommender Systems: User Experience and System Issues. Wide Range of Algorithms. About me. Classic Collaborative Filtering

Towards a hybrid approach to Netflix Challenge

Question Bank. 4) It is the source of information later delivered to data marts.

Influence in Ratings-Based Recommender Systems: An Algorithm-Independent Approach

Weighted Alternating Least Squares (WALS) for Movie Recommendations) Drew Hodun SCPD. Abstract

Semantically Enhanced Collaborative Filtering on the Web

Literature Survey on Various Recommendation Techniques in Collaborative Filtering

Recommendation System for Sports Videos

Sparse Estimation of Movie Preferences via Constrained Optimization

CONTENTS. 1.1 Recommender System Collaborative filtering 1 2. LITERATURE REVIEW 5 3. NARRATIVE Software Requirements Specification 11

Transcription:

Seminar Collaborative Filtering KDD Cup Ziawasch Abedjan, Arvid Heise, Felix Naumann

2 Collaborative Filtering

Recommendation systems 3

Recommendation systems 4

Recommendation systems 5

Recommendation systems 6 Content-based recommendations Relevance is based on content Similar items Uses also metadata Collaborative filtering Recommend items suiting one s taste Based on community ratings Serendipity

Collaborative Filtering Algorithms 7 Memory-based CF Item/user-based top N New items and user can be added easily Scales well on dense data sets Limited scalability on sparse data sets Model-based CF Singular Value Decomposition (SVD) Better scalability on sparse data sets Little evidence for the model Hybrid recommenders Content-boosted CF Mixture of memory-based and model-based techniques

Neighbor-based CF 8 Process: filtering -> prediction -> recommendation User-based approach Create cluster of similar users Recommendations depend on ratings of similar users Problem: missing ratings Item-based approach Create item-item matrix based on rating similarities Retrieve top K similar items and aggregating their ratings by the user

Singular Value Decomposition 9 Reduce dimensions of user/item matrix by factorization Use overlapping features (e.g. gender, genre) items features items user r 11 r ij r 1n r m1 r mn = user r i1 f 11 f 12 f m1 f m2 X feature g 11 g 12 g 21 g 22 g 1n g 2n Source: Koren, Yehuda, Bell, Robert & Volinsky, Chris: "Matrix Factorization Techniques for Recommender Systems

10 KDD Cup

Overview 11 ACM SIGKDD Knowledge Discovery and Data Mining Variety of topics 2010 - Student performance evaluation 2008 - Breast cancer 2007 - Consumer recommendations 2004 - Particle physics; plus protein homology prediction 2001 - Molecular bioactivity; plus protein locale prediction 1997 - Direct marketing for lift curve optimization 2 tracks with three places

KDD Cup 2011 12 On Yahoo Music Dataset Artists, Albums, Songs, Genres Track 1: predict user rating Track 2: decide whether a user rates a song or not Album * Song * Genre Artist

Track 1 13 Typical collaboration filtering task Predict the rating of a specific user for unrated songs Includes hierarchy of items User might have rated other songs of same album/artist No user might have rated the song but the same album/artist Includes time stamp of rating Rating behavior might have changed over time Older songs rated differently than newer songs? #Users #Items #Ratings #Train Ratings #Validation Ratings #Test Ratings 1,000,990 624,961 262,810,175 252,800,275 4,003,960 6,005,940

Track 2 14 Predict if a user would rate a given song highly or not at all Need a model for rate behavior Machine learning? Ratings on songs only No timestamps Given six songs, which three will most likely not be rated #Users #Items #Ratings #Train Ratings #Test Ratings #Users 249,012 296,111 62,551,438 61,944,406 607,032 249,012

15 Seminar

Approaches Track 1 16 Koren, Y., Bell, R. & Volinsky, C.: "Matrix Factorization Techniques for Recommender Systems Similar to Singular Value Decomposition Create small abstracted matrix Soft clustering Sarwar, B., Karypis, G., Konstan, J. & Reidl, J.: "Item-based collaborative filtering recommendation algorithms Classic approach Improvements in Robert M. Bell & Yehuda Koren: "Improved Neighborhood-based Collaborative Filtering" Need to be adjusted to include time stamp and hierarchy

Approaches Track 2 17 Kurucz, M., Benczúr, A.A., Kiss, T., Nagy, I., Szabó, A. & Torma, B.: "Who Rated What: a Combination of SVD, Correlation and Frequent Sequence Mining" Combination of four predictions (only two relevant) SVD and base prediction Needs estimation on how many ratings Sueiras, J.: "A classical predictive modeling approach for Task Who rated what? of the KDD CUP 2007" Machine learning (logistic regression) User-specific, movie-related, and user-movie pair features SVD for movie-related

Application 18 Send mail to ziawasch.abedjan@hpi.uni-potsdam.de Subject: [CF Seminar] Deadline: April 18 th Notification: April 19 th Limited to 8 participants = 4 teams Random selection if more applicants Send top 3 wishes on tasks and approaches We are open to other approaches

Time Schedule 19 April 14th: first seminar, topic presentations April 16th: application deadline, team/paper preferences April 17th: team/paper notification April 21th: mandatory consultation April 28th: paper presentation May 12th: initial implementation/idea presentation June 9th: intermediate presentation/project consolidation June 23th: final presentations June 30th: KDD cup submission deadline (results only) July 9th: hand in short paper (6 pages)

Grading 20 3 LP (half semester, project seminar) Paper and final presentations Participation during intermediate presentations / discussions Implementation strategies/proposed extensions Short paper Bonus: good results in KDD cup

Options 21 Mahout or stand-alone Single or combined repository We recommend cooperation Many similar issues

References 22 Official KDD Cup: http://kddcup.yahoo.com/ Su, Xiaoyuan & Khoshgoftaar, Taghi M.: "A survey of collaborative filtering techniques" Koren, Yehuda; Bell, Robert & Volinsky, Chris: "Matrix Factorization Techniques for Recommender Systems" Sarwar, Badrul; Karypis, George, Konstan, Joseph & Reidl, John: "Itembased collaborative filtering recommendation algorithms Improvements in Robert M. Bell & Yehuda Koren: "Improved Neighborhood-based Collaborative Filtering Miklós Kurucz, András A. Benczúr, Tamás Kiss, István Nagy, Adrienn Szabó & Balázs Torma: "Who Rated What: a Combination of SVD, Correlation and Frequent Sequence Mining" Jorge Sueiras: "A classical predictive modeling approach for Task Who rated what? of the KDD CUP 2007