Web Personalization & Recommender Systems

Similar documents
Web Personalization & Recommender Systems

CS249: ADVANCED DATA MINING

Part 11: Collaborative Filtering. Francesco Ricci

Introduction to Data Mining

Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University Infinite data. Filtering data streams

Part 11: Collaborative Filtering. Francesco Ricci

Thanks to Jure Leskovec, Anand Rajaraman, Jeff Ullman

Recommender Systems. Collaborative Filtering & Content-Based Recommending

Recommender Systems. Techniques of AI

BBS654 Data Mining. Pinar Duygulu

Recommender Systems (RSs)

Data Mining Lecture 2: Recommender Systems

Recommender Systems - Introduction. Data Mining Lecture 2: Recommender Systems

Recommendation Algorithms: Collaborative Filtering. CSE 6111 Presentation Advanced Algorithms Fall Presented by: Farzana Yasmeen

CS 124/LINGUIST 180 From Languages to Information

Recommender System. What is it? How to build it? Challenges. R package: recommenderlab

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Hybrid Recommendation System Using Clustering and Collaborative Filtering

COMP6237 Data Mining Making Recommendations. Jonathon Hare

Recap: Project and Practicum CS276B. Recommendation Systems. Plan for Today. Sample Applications. What do RSs achieve? Given a set of users and items

Recommender Systems - Content, Collaborative, Hybrid

Movie Recommender System - Hybrid Filtering Approach

Collaborative Filtering using Euclidean Distance in Recommendation Engine

Available online at ScienceDirect. Procedia Technology 17 (2014 )

Distributed Itembased Collaborative Filtering with Apache Mahout. Sebastian Schelter twitter.com/sscdotopen. 7.

5/13/2009. Introduction. Introduction. Introduction. Introduction. Introduction

CS 124/LINGUIST 180 From Languages to Information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

COMMUNITY DETECTION IN THE COLLABORATIVE WEB

Intelligent Techniques for Web Personalization

Recommender Systems. Nivio Ziviani. Junho de Departamento de Ciência da Computação da UFMG

Part 12: Advanced Topics in Collaborative Filtering. Francesco Ricci

12 Web Usage Mining. With Bamshad Mobasher and Olfa Nasraoui

Using Data Mining to Determine User-Specific Movie Ratings

Combining Review Text Content and Reviewer-Item Rating Matrix to Predict Review Rating

CS 124/LINGUIST 180 From Languages to Information

Recommender Systems New Approaches with Netflix Dataset

Knowledge Discovery and Data Mining 1 (VO) ( )

Improving Results and Performance of Collaborative Filtering-based Recommender Systems using Cuckoo Optimization Algorithm

Use of KNN for the Netflix Prize Ted Hong, Dimitris Tsamis Stanford University

Community-Based Recommendations: a Solution to the Cold Start Problem

Orange3 Data Fusion Documentation. Biolab

Content-based Dimensionality Reduction for Recommender Systems

arxiv: v4 [cs.ir] 28 Jul 2016

Recommender Systems: User Experience and System Issues

The influence of social filtering in recommender systems

Recommender Systems: Practical Aspects, Case Studies. Radek Pelánek

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Performance Comparison of Algorithms for Movie Rating Estimation

A PROPOSED HYBRID BOOK RECOMMENDER SYSTEM

Collaborative Filtering Applied to Educational Data Mining

amount of available information and the number of visitors to Web sites in recent years

Using Social Networks to Improve Movie Rating Predictions

TRUST METRICS IN RECOMMENDER SYSTEMS: A SURVEY

ELEC6910Q Analytics and Systems for Social Media and Big Data Applications Lecture 4. Prof. James She

A Hybrid Peer-to-Peer Recommendation System Architecture Based on Locality-Sensitive Hashing

Machine Learning and Data Mining. Collaborative Filtering & Recommender Systems. Kalev Kask

Mining Web Data. Lijun Zhang

Internet Search. (COSC 488) Nazli Goharian Nazli Goharian, 2005, Outline

Property1 Property2. by Elvir Sabic. Recommender Systems Seminar Prof. Dr. Ulf Brefeld TU Darmstadt, WS 2013/14

Tag based Recommender System using Pareto Principle

DATA MINING - 1DL105, 1DL111

Web Usage Mining from Bing Liu. Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, Springer Chapter written by Bamshad Mobasher

Repositorio Institucional de la Universidad Autónoma de Madrid.

Collaborative Filtering and Recommender Systems. Definitions. .. Spring 2009 CSC 466: Knowledge Discovery from Data Alexander Dekhtyar..

A Time-based Recommender System using Implicit Feedback

A probabilistic model to resolve diversity-accuracy challenge of recommendation systems

PERSONALIZED TAG RECOMMENDATION

A Tag And Social Network Based Recommender System

Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University

Ontology-based Web Recommendation from Tags

Matrix-Vector Multiplication by MapReduce. From Rajaraman / Ullman- Ch.2 Part 1

Structural trust inference for social recommendation

Link Analysis in Weibo

Experiences from Implementing Collaborative Filtering in a Web 2.0 Application

Assignment 5: Collaborative Filtering

Jeff Howbert Introduction to Machine Learning Winter

Personalized News Recommender using Twitter

City, University of London Institutional Repository. This version of the publication may differ from the final published version.

Building a Recommendation System for EverQuest Landmark s Marketplace

Extension Study on Item-Based P-Tree Collaborative Filtering Algorithm for Netflix Prize

Music Recommendation with Implicit Feedback and Side Information

Data Mining Techniques

ACM MM Dong Liu, Shuicheng Yan, Yong Rui and Hong-Jiang Zhang

Browser-Oriented Universal Cross-Site Recommendation and Explanation based on User Browsing Logs

Mining Web Data. Lijun Zhang

Machine Learning using MapReduce

Personalized Recommendations using Knowledge Graphs. Rose Catherine Kanjirathinkal & Prof. William Cohen Carnegie Mellon University

Semantically Enhanced Collaborative Filtering on the Web

Recommendation Systems

Reddit Recommendation System Daniel Poon, Yu Wu, David (Qifan) Zhang CS229, Stanford University December 11 th, 2011

IMPROVING SCALABILITY ISSUES USING GIM IN COLLABORATIVE FILTERING BASED ON TAGGING

CSCI 599: Applications of Natural Language Processing Information Retrieval Retrieval Models (Part 1)"

By Atul S. Kulkarni Graduate Student, University of Minnesota Duluth. Under The Guidance of Dr. Richard Maclin

Supervised and Unsupervised Learning (II)

A Recommender System Based on Improvised K- Means Clustering Algorithm

Making Recommendations by Integrating Information from Multiple Social Networks

Comparative Analysis of Collaborative Filtering Technique

Web Service Recommendation Using Hybrid Approach

AMAZON.COM RECOMMENDATIONS ITEM-TO-ITEM COLLABORATIVE FILTERING PAPER BY GREG LINDEN, BRENT SMITH, AND JEREMY YORK

Transcription:

Web Personalization & Recommender Systems COSC 488 Slides are based on: - Bamshad Mobasher, Depaul University - Recent publications: see the last page (Reference section) Web Personalization & Recommender Systems i Most common type of personalization: Recommender systems User profile Recommendation algorithm 2 1

Recommender Systems Recommender systems are information filtering systems where users are recommended "relevant information items (products, content, services) or social items (friends, events) at the right context at the right time with the goal of pleasing the user and generating revenue for the system. Recommender systems are typically discussed under the umbrella of "People who performed action X also performed action Y" where the action X and Y might be search, view or purchase of product, or seek a friend or connection. Neel Sundaresan ebay Research Labs RecSys 11 3 RecSys 11- ebay i ebay Example: 4 Over 10 million items listed for sale daily 4 Items are listed in explicitly defined hierarchy of categories 4 Over 30,000 nodes in this category tree. Only a fraction of the items are cataloged. 4 Hundreds of millions of searches are done on a daily basis. 4 Language gap between buyers and sellers in search 4 Recommender system tries to fill-in the language gap using knowledge mined from buyer/seller hunlike a typical Web search, context from user behavior is used (user query, history of past queries, ) hexample: Identifying query relationships (within a session) Q1: Apple ipod mp3 player Q2: creative mp3 player» Using co-occurrence: apple ipod & crative are related BUT apple ipod and apple dishes are not! 4 2

The Recommendation Task ibasic formulation as a prediction problem Given a profile P u for a user u, and a target item i t, predict the preference score of user u on item i t i Typically, the profile P u contains preference scores by u on some other items, {i 1,, i k } different from i t 4 preference scores on i 1,, i k may have been obtained explicitly (e.g., movie ratings) or implicitly (e.g., time spent on a product page or a news article) 5 Notes on User Profiling i Utilizing user profiles for personalization assumes 4 1) past behavior is a useful predictor of the future behavior 4 2) wide variety of behaviors amongst users i Basic task in user profiling: Preference elicitation 4 May be based on explicit judgments from users (e.g. ratings) 4 May be based on implicit measures of user interest i Automatic user profiling 4 Use machine learning techniques to learn models of user behavior, preferences 4 May include keywords, categories, 4 May build a model for each specific user or build group profiles Similarity of profile(s) to incoming documents, news, advertisements are measured by comparing the document vector to the profile s indicies 6 3

Common Recommendation Techniques irule-based (Knowledge-Based) Filtering 4 Provides recommendations to users based on predefined (or learned) rules 4 age(x, 25-35) and income(x, 70-100K) and children(x, >=3) recommend(x, Minivan) icontent-based Filtering 4 Gives recommendations to a user based on items with similar content in the user s profile icollaborative Filtering 4 Gives recommendations to a user based on preferences of similar users 4 Preferences on items may be explicit or implicit 7 Content-Based Recommenders i Predictions for unseen (target) items are computed based on their similarity (in terms of content) to items in the user profile. i E.g., user profile P u contains recommend highly: and recommend mildly : 8 4

Content-Based Recommender Systems 9 Content-Based Recommenders: Personalized Search i How can the search engine determine the user s context?? Query: Madonna and Child? i Need to learn the user profile: 4 User is an art historian? 4 User is a pop music fan? 10 5

Content-Based Recommenders i Similarity of user profile to each item Example: 4 User profile: vector of terms from user high ranked documents 4 Use cosine similarity to calculate similarity between profile & documents i Disadvantage: 4 Unable to recommend new items (different to profile) 11 Collaborative Recommender Systems 12 6

Basic Collaborative Filtering Process Current User Record <user, item1, item2, > Neighborhood Formation Nearest Neighbors Combination Function Recommendation Engine Historical User Records user item rating Recommendations Neighborhood Formation Phase Recommendation Phase Both of the Neighborhood formation and the recommendation phases are real-time components Approaches: knn, AR mining, clustering, matrix factorization, probabilistic 13 Collaborative Systems: Approaches User-based & Item-based are most commonly used: i User-based (neighborhood-based): [Resnick et al. 1994; Shardanand 1994], 1) Calculate the similarity between the active user and the rest of the users. Pearson correlation, cosine vector space, (2) Select a subset of the users (neighborhood) according to their similarity with the active user. hsimilarity threshold, or N most similar (3) Compute the prediction using the neighbor ratings. Can Cluster the users to reduce the sparsity & improve scalability i Item-based: similar to the user-based but, instead of looking for neighbors among users, they look for similar items. 4 Advantage over user-based: more static, thus can be calculated off-line 14 7

Collaborative System (User-based) Item1 Item 2 Item 3 Item 4 Item 5 Item 6 Correlation with Alice Alice 5 2 3 3? User 1 2 4 4 1-1.00 User 2 2 1 3 1 2 0.33 User 3 4 2 3 2 1.90 User 4 3 3 2 3 1 0.19 User 5 3 2 2 2-1.00 Prediction Best match User 6 5 3 1 3 2 0.65 User 7 5 1 5 1-1.00 Compare target user with all user records (Using k-nearest neighbor with k = 1) 15 Using Clusters for Personalization Original Session/user data Result of Clustering A.html B.html C.html D.html E.html F.html user0 1 1 0 0 0 1 user1 0 0 1 1 0 0 user2 1 0 0 1 1 0 user3 1 1 0 0 0 1 user4 0 0 1 1 0 0 user5 1 0 0 1 1 0 user6 1 1 0 0 0 1 user7 0 0 1 1 0 0 user8 1 0 1 1 1 0 user9 0 1 1 0 0 1 Given an active session A B, the best matching profile is Profile 1. This may result in a recommendation for page F.html, since it appears with high weight in that profile. PROFILE 0 (Cluster Size = 3) -------------------------------------- 1.00 C.html 1.00 D.html A.html B.html C.html D.html E.html F.html Cluster 0 user 1 0 0 1 1 0 0 user 4 0 0 1 1 0 0 user 7 0 0 1 1 0 0 Cluster 1 user 0 1 1 0 0 0 1 user 3 1 1 0 0 0 1 user 6 1 1 0 0 0 1 user 9 0 1 1 0 0 1 Cluster 2 user 2 1 0 0 1 1 0 user 5 1 0 0 1 1 0 user 8 1 0 1 1 1 0 PROFILE 1 (Cluster Size = 4) -------------------------------------- 1.00 B.html 1.00 F.html 0.75 A.html 0.25 C.html PROFILE 2 (Cluster Size = 3) -------------------------------------- 1.00 A.html 1.00 D.html 1.00 E.html 0.33 C.html 16 8

Item-based Collaborative Filtering i Find similarities among the items based on ratings across users 4 Often measured based on a variation of Cosine measure i Prediction of item I for user a is based on the past ratings of user a on items similar to i. Star Wars Jurassic Park Terminator 2 Indep. Day Sally 7 6 3 7 Bob 7 4 4 6 Chris 3 7 7 2 Lynn 4 4 6 2 Karen 7 4 3? i Suppose: sim(star Wars, Indep. Day) > sim(jur. Park, Indep. Day) > sim(termin., Indep. Day) i Predicted rating for Karen on Indep. Day will be 7, because she rated Star Wars 7 4 That is if we only use the most similar item 4 Otherwise, we can use the k-most similar items and again use a weighted average 17 Collaborative Filtering (Item-based) Pair-wise comparison of items across users Prediction Item1 Item 2 Item 3 Item 4 Item 5 Item 6 Alice 5 2 3 3? User 1 2 4 4 1 User 2 2 1 3 1 2 User 3 4 2 3 2 1 User 4 3 3 2 3 1 User 5 3 2 2 2 User 6 5 3 1 3 2 User 7 5 1 5 1 Item similarity 0.76 0.79 0.60 Best 0.71 0.75 match Calculate pair-wise item similarities: ( r u U u, i ru )( ru, j ru ) sim( i, j) = 2 2 ( r u U u, i ru ) ( r u U u, j ru ) Calculate rating based on N best neighbors: r j J u, j. sim( i, j) Predicted Rating: P( u, i) = sim( i, j) j J 18 9

Collaborative Systems (Cont d) i Hybrid: combines the item ratings of similar users to the active user, the ratings of the active user on similar items, the ratings of similar items by similar users, semantic information. i SVD-based: Matrix factorization techniques (variations exist): 4 each item is represented as a set of features (aspects) 4 each user as a set of values indicating his/her preference for the various aspects of the items. 4 The number of features to consider, K, is a model parameter. 4 The rating prediction is: P = ij x T i y j i Tendency-based: calculates tendency of users / items [ACM Transactions on the Web, 2011] 4 tendency of a user: the average difference between his/her ratings and the item mean. t u = i I u v I ui u v i t i = u U i v U ui u v u 19 Semantically Enhanced Hybrid Recommendation i Sample extension of the item-based algorithm 4 Use a combined similarity measure to compute item similarities: 4 where, hsemsim is the similarity of items i p and i q based on semantic features (e.g., keywords, attributes, etc.); and hratesim is the similarity of items i p and i q based on user ratings (as in the standard item-based CF) 4 α is the semantic combination parameter: hα = 1 only user ratings; no semantic similarity hα = 0 only semantic features; no collaborative similarity 20 10

Semantically Enhanced CF i Movie data set 4 Movie ratings from the movielens data set 4 Semantic info. extracted from IMDB based on the following ontology Movie Name Year Genre Actor Director Action Genre-All Romance Comedy Actor Name Movie Nationality Director Romantic Comedy Black Comedy Kids & Family Name Movie Nationality 21 Hybrid Recommender Systems 22 11

A.html B.html C.html D.html E.html user1 1 0 1 0 1 user2 1 1 0 0 1 user3 0 1 1 1 0 user4 1 0 1 1 1 user5 1 1 0 0 1 user6 1 0 1 1 1 User Pageview matrix UP Feature-Pageview Matrix FP A.html B.html C.html D.html E.html web 0 0 1 1 1 data 0 1 1 1 0 mining 0 1 1 1 0 business 1 1 0 0 0 intelligence 1 1 0 0 1 marketing 1 1 0 0 1 ecommerce 0 1 1 0 0 search 1 0 1 0 0 information 1 0 1 1 1 retrieval 1 0 1 1 1 23 User-Feature Matrix Note that: UF = UP x FP T web data mining business intelligence marketing ecommerce search information retrieval user1 2 1 1 1 2 2 1 2 3 3 user2 1 1 1 2 3 3 1 1 2 2 user3 2 3 3 1 1 1 2 1 2 2 user4 3 2 2 1 2 2 1 2 4 4 user5 1 1 1 2 3 3 1 1 2 2 user6 3 2 2 1 2 2 1 2 4 4 Example: users 4 and 6 are more interested in concepts related to Web information retrieval, while user 3 is more interested in data mining. 24 12

Some facts & observations to motivate Trust-based approaches [Golbeck, ACM Transactions on the Web, Sept 2009] icollaborative filtering (CF) 4 Effective when users have a common set of rated items 4 Difficult to compute similarity between users, if ratings are sparse 4 Use overall similarity of user profiles to make recommendations ionline social networks 4 A limited snapshot of the users and their interactions 4 Typically, a list of friends for each user 4 Few show the last date that a given user was active, and none show more detailed information about a user s history of interaction with the Web site. 4 No networks publicly share a history of interactions between users 4 Communications are kept private, and the formation or removal of relationships are not shown or recorded in a user s profile. 25 Collaborative Systems (Cont d) itrust-based approaches: consider social trust relationships between users [Andersen et al. 2008; Bedi et al. 2007; Ma et al. 2008; Massa and Avesani 2004; 2007], [Ziegler and Golbeck, 2006], 4Explicit trust specified by users 4Implicit trust calculated via Pearson, friendship factors, 26 13

Trust & Profile Similarity i Strong and significant correlation between trust and user similarity (the more similar two people, the greater the trust between them). [Ziegler and Golbeck, 2006] i Methods to infer trust on social network: 4 Explicitly defined trust 4 A function of corrupt vs. valid files that a peer provides (in P2P). EigenTrust algorithm [Kamvar et al. 2004] 4 The perspective of authoritative nodes. i Example -- Recommended Rating that is personalized for each user [ACM Transactions on the Web, 2011] - Alice trusts Bob 9. - Alice trusts Chuck 3. - Bob rates the movie Jaws with 4 stars. - Chuck rates the movie Jaws with 2 stars. Alice s recommended rating for Jaws is calculated as follows: talice Bob rbob Jaws + talice Chuck rchuck Jaws / talice Bob + talice Chuck = ((9 4) + (3 2)) / (9 + 3) = 3.5. 27 Collaborative Filtering (Tag-based) isocial Tagging 4Social exchange goes beyond collaborative filtering 4people add free-text tags to their content hdel.icio.us, Flickr, Last.fm idata record: <user, item, tag> Tag: user-item interaction can be annotated by multiple tags, indicating the reason of user s interest 28 14

Folksonomies 29 References (UNDER CONSTRUCTION!!!!) i ACM Transactions on the Web, 2011 i Golbeck, ACM Transactions on the Web, Sept 2009 i Andersen et al. 2008; i Bedi et al. 2007; i Ma et al. 2008; i Massa and Avesani 2004; 2007 i Ziegler and Golbeck, 2006, i Resnick et al. 1994; i Shardanand 1994 30 15