TriRank: Review-aware Explainable Recommendation by Modeling Aspects

Similar documents
Combining Review Text Content and Reviewer-Item Rating Matrix to Predict Review Rating

Browser-Oriented Universal Cross-Site Recommendation and Explanation based on User Browsing Logs

Incorporating Phrase-level Sentiment Analysis on Textual Reviews for Personalized Recommendation

CS249: ADVANCED DATA MINING

ECS289: Scalable Machine Learning

Learning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li

Recommender Systems: Practical Aspects, Case Studies. Radek Pelánek

Explainable Recommendation: A Survey and New Perspectives

Mining Web Data. Lijun Zhang

Latent Space Model for Road Networks to Predict Time-Varying Traffic. Presented by: Rob Fitzgerald Spring 2017

TGNet: Learning to Rank Nodes in Temporal Graphs. Qi Song 1 Bo Zong 2 Yinghui Wu 1,3 Lu-An Tang 2 Hui Zhang 2 Guofei Jiang 2 Haifeng Chen 2

Pseudo-Implicit Feedback for Alleviating Data Sparsity in Top-K Recommendation

A Brief Review of Representation Learning in Recommender 赵鑫 RUC

Recommender Systems (RSs)

Survey on Recommendation of Personalized Travel Sequence

Music Recommendation with Implicit Feedback and Side Information

A Few Things to Know about Machine Learning for Web Search

POI2Vec: Geographical Latent Representation for Predicting Future Visitors

Diversity in Recommender Systems Week 2: The Problems. Toni Mikkola, Andy Valjakka, Heng Gui, Wilson Poon

Recommender Systems - Content, Collaborative, Hybrid

Part I: Data Mining Foundations

STREAMING RANKING BASED RECOMMENDER SYSTEMS

CIRGDISCO at RepLab2012 Filtering Task: A Two-Pass Approach for Company Name Disambiguation in Tweets

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p.

CPSC 340: Machine Learning and Data Mining. Recommender Systems Fall 2017

Part 11: Collaborative Filtering. Francesco Ricci

Real-time Collaborative Filtering Recommender Systems

Master Project. Various Aspects of Recommender Systems. Prof. Dr. Georg Lausen Dr. Michael Färber Anas Alzoghbi Victor Anthony Arrascue Ayala

Query Independent Scholarly Article Ranking

Community-Based Recommendations: a Solution to the Cold Start Problem

A Privacy-Preserving QoS Prediction Framework for Web Service Recommendation

Mining Web Data. Lijun Zhang

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. Springer

Entity and Knowledge Base-oriented Information Retrieval

Personalized Entity Recommendation: A Heterogeneous Information Network Approach

PERSONALIZED TAG RECOMMENDATION

Effective Latent Space Graph-based Re-ranking Model with Global Consistency

Personal Web API Recommendation Using Network-based Inference

Social Interaction Based Video Recommendation: Recommending YouTube Videos to Facebook Users

Location-Aware Web Service Recommendation Using Personalized Collaborative Filtering

Ranking Web Pages by Associating Keywords with Locations

Pouya Kousha Fall 2018 CSE 5194 Prof. DK Panda

Performance Comparison of Algorithms for Movie Rating Estimation

Tag Based Image Search by Social Re-ranking

Cold-Start Web Service Recommendation Using Implicit Feedback

Recommender Systems. Collaborative Filtering & Content-Based Recommending

GraphGAN: Graph Representation Learning with Generative Adversarial Nets

Introduction to Text Mining. Hongning Wang

Data Mining Techniques

Part 12: Advanced Topics in Collaborative Filtering. Francesco Ricci

LINK GRAPH ANALYSIS FOR ADULT IMAGES CLASSIFICATION

Recommendation System Using Yelp Data CS 229 Machine Learning Jia Le Xu, Yingran Xu

WebSci and Learning to Rank for IR

ACM MM Dong Liu, Shuicheng Yan, Yong Rui and Hong-Jiang Zhang

IMPROVING INFORMATION RETRIEVAL BASED ON QUERY CLASSIFICATION ALGORITHM

Automatically Building Research Reading Lists

Content-based Dimensionality Reduction for Recommender Systems

Implementation of Parallel CASINO Algorithm Based on MapReduce. Li Zhang a, Yijie Shi b

CSE 258. Web Mining and Recommender Systems. Advanced Recommender Systems

Inferring User Search for Feedback Sessions

Top-N Recommendations from Implicit Feedback Leveraging Linked Open Data

Feature Selection. CE-725: Statistical Pattern Recognition Sharif University of Technology Spring Soleymani

CS224W Project: Recommendation System Models in Product Rating Predictions

for Searching Social Media Posts

Separating Objects and Clutter in Indoor Scenes

Recommender Systems: User Experience and System Issues

Semi-supervised Data Representation via Affinity Graph Learning

Incorporating Satellite Documents into Co-citation Networks for Scientific Paper Searches

by the customer who is going to purchase the product.

COLLABORATIVE LOCATION AND ACTIVITY RECOMMENDATIONS WITH GPS HISTORY DATA

arxiv: v1 [cs.ir] 19 Dec 2018

Detecting and Analyzing Communities in Social Network Graphs for Targeted Marketing

R. R. Badre Associate Professor Department of Computer Engineering MIT Academy of Engineering, Pune, Maharashtra, India

Opportunities and challenges in personalization of online hotel search

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Automatic Domain Partitioning for Multi-Domain Learning

Federated Search. Jaime Arguello INLS 509: Information Retrieval November 21, Thursday, November 17, 16

NUSIS at TREC 2011 Microblog Track: Refining Query Results with Hashtags

Learning to Rank Features for Recommendation over Multiple Categories

Recommendation System for Location-based Social Network CS224W Project Report

Adaptive Dropout Training for SVMs

CS435 Introduction to Big Data Spring 2018 Colorado State University. 3/21/2018 Week 10-B Sangmi Lee Pallickara. FAQs. Collaborative filtering

Amazon Review Rating Prediction with Text-Mining, Latent-Factor Model and Restricted Boltzmann Machine

CS 229 Final Project - Using machine learning to enhance a collaborative filtering recommendation system for Yelp

Personalized Recommendations using Knowledge Graphs. Rose Catherine Kanjirathinkal & Prof. William Cohen Carnegie Mellon University

Demystifying movie ratings 224W Project Report. Amritha Raghunath Vignesh Ganapathi Subramanian

Chrome based Keyword Visualizer (under sparse text constraint) SANGHO SUH MOONSHIK KANG HOONHEE CHO

Sentiment Classification of Food Reviews

node2vec: Scalable Feature Learning for Networks

A Novel Categorized Search Strategy using Distributional Clustering Neenu Joseph. M 1, Sudheep Elayidom 2

Joint Representation Learning for Top-N Recommendation with Heterogeneous Information Sources

NYU CSCI-GA Fall 2016

Survey on Community Question Answering Systems

Data Mining Techniques

Towards Better User Preference Learning for Recommender Systems

Mining Social Media Users Interest

Heterogeneous Graph-Based Intent Learning with Queries, Web Pages and Wikipedia Concepts

Addressing the Challenges of Underspecification in Web Search. Michael Welch

Chapter 27 Introduction to Information Retrieval and Web Search

System For Product Recommendation In E-Commerce Applications

Transcription:

TriRank: Review-aware Explainable Recommendation by Modeling Aspects Xiangnan He, Tao Chen, Min-Yen Kan, Xiao Chen National University of Singapore Presented by Xiangnan He CIKM 15, Melbourne, Australia Oct 22 2015

2 Recommender System Multifaceted Accuracy Scalability Explainability Transparency Scrutability Online learning Privacy Diversity Increase Users Trust & Satisfaction - Collaborative Filtering - Model-based - Memory-based - Graph-based - Content Filtering - Context-aware - Social - Temporal - Reviews - Hybrid

3 Recap: Collaborative Filtering Predict the preference of a user by the similar users. Focus on the user-item feedback matrix. E.g. matrix factorization model for CF: Input: Given a sparse useritem feedback matrix: User 'u' bought item 'i' Learn latent vector for each user, item: Affinity between user 'u' and item 'i':

4 Main Limitation of CF Hard to infer the actual rationale from the rating score only! Items

5 Example: Dilemma of CF Inputs: <u1, p1, 5> <u2, p1, 5> <u2, p2, 4> <u3, p1, 5> <u3, p3, 4> <u4, p3, 4> <u4, p4, 5> Inputs (aspects): <u1, p1, 5, seafood> <u2, p1, 5, chicken> <u2, p2, 4, chicken> <u3, p1, 5, seafood> <u3, p3, 4, seafood> <u4, p3, 4, seafood> <u4, p4, 5, seafood> p1 p2 p3 p4 u1 5 0 0 0 u2 5 4 0 0 u3 5 0 4 0 u4 0 0 4 5 Neighbors u2 and u3 have equal preference on p2 and p3 CF can not choose between p2 and p3!

6 Review-aware Recommendation Reviews justify a user s rating: by discussing the specific properties of items (aspects); by revealing which aspects the user is most interested in. Service Price aspects Starters Noodle Space Place

7 Existing Works Topic models on words + item latent factors: McAuley and Leskovec, Recsys 13: LDA + MF Ling etc, Recsys 14: LDA + PMF (full Bayesian treatment) Xu etc, CIKM 14: LDA + PMF + user clusters (full Bayesian) Bao etc, AAAI 14: NMF + MF Joint modeling of aspects and ratings: Diao etc, KDD 14: graphical model Zhang etc, SIGIR 14: collective NMF Musat etc, IJCAI 13: build user topical profiles

8 Limitations of previous works Focused on rating prediction. Top-K recommendation is more practical. Lack explainability and transparency. Well-known drawback of latent factor model. Do not support online learning (instant personalization). New data comes in (retraining is expensive). User updates his/her preference (scrutability). Historical data New data Training Recommendation Time

9 Our Solution - TriRank ü Review-aware recommendation. ü Graph-based method. - Top-K recommendation è Vertex ranking. ü Good accuracy. ü Explainable. ü Transparent. ü Offline training + online learning. - Provide instant personalization without retraining.

10 Basic Idea: Graph Propagation Inputs: <u1, p1, 1> <u2, p1, 1> <u2, p2, 1> <u3, p1, 1> <u3, p3, 1> <u4, p3, 1> <u4, p4, 1> Target user: u1 u1 u2 u3 u4 p1 p2 p3 p4 Item ranking: p2 p3 > p4 User ranking: u2 u3 > u4 Label propagation from the target user s historical item nodes captures the collaborative filtering. How to encode that mathematically?

11 u1 u2 Machine Learning for Graph Propagation (Graph Regularization) p1 p2 [He etc, SIGIR 2014] Smoothness kernel (propagation): - Nearby vertices should not vary too much: u3 u4 p3 p4 Fitting constraint (initial labels): - Ranking scores should adhere to the initial labels: Input: - Graph structure (matrix Y) - Initial labels to propagate (vectors p 0 ) Output: - Scores for each vertex (vectors u, p) Optimization (coordinate descent):, which exactly mimic the propagation process! X. He, M. Gao, M.-Y. Kan, Y. Liu, and K. Sugiyama. Predicting the popularity of web 2.0 items based on user comments. In Proc. SIGIR 14

12 Connection to CF models Recap: ranking loss function (for a target user): Prediction loss Traditional machine learning-based CF models: 1. Prediction model: E.g., matrix factorization: 2. Loss function: Regularizations Prediction loss on all items (include imputations). (important for top-k recommendation)

13 TriRank Solution Graph propagation in the tripartite graph: Inputs: <u1, p1, a1> <u2, p1, a1> <u2, p2, a1> <u3, p1, a2> <u3, p3, a2> Initial labels should encode: - Target user s preference on aspects/items/users: a 0 : reviewed aspects. p 0 : ratings on items. u 0 : similarity with other users (friendship).

14 Online Learning Offline Training: 1. Extract aspects from user reviews 2. Build the tripartite graph model (edge weights) 3. Label propagation from each vertex and save the scores. - i.e. store a V V matrix f(v i, v j ). (to save space, we can save top scores for each vertex) Online Learning (new data and updated preference applies): 1. Build user profile (i.e., L u vertices to propagate from). 2. Average the scores of the L u vertices: Complexity: O(L u ), almost constant!

15 Explainability Transparency: Collaborative filtering + Aspect filtering è (Similar users also choose the item) (Reviewed aspects match with the item) An example of reasoned recommendation: Item Ranking Aspect Ranking User Ranking

16 Experimental Settings Public datasets (filtering threshold at 10): Yelp Challenge Amazon electronics Sort reviews in chronological order for each user: Split: 80% training + 10% validation + 10% test Top-K evaluation: For each test user, we output K items as a ranking list: Recall-based measure: Ranking-based measure:

17 Aspect Extraction A well studied task in review mining [survey: Zhang and Liu, 2014]: Unsupervised rule-based methods: [Hu and Liu, KDD 04; Zhang etc. COLING 10]: phrase/sentence patterns. Supervised sequence labeling methods: [Jin and Ho, ICML 09; Jakob etc. EMNLP 10]: HMM, CRF We adopt a tool developed by Tsinghua IR group [Zhang etc. SIGIR 14]: rule-based system: Dataset #Aspect Density (U-A) Density (I-A) Top aspects (good examples) Yelp 6,025 3.05% 2.29% bar, salad, chicken, sauce, cheese, fries, bread, sandwich Amazon 1,617 3.80% 1.44% camera, quality, sound, price, battery, screen, size, lens Noisy aspects restaurants, food, ive (I ve), 150 product, features, picturemy

18 Item Popularity (ItemPop) ItemKNN [Sarwar etc. 2001] Item-based collaborative filtering Baselines PureSVD [Cremonesi etc. 2010] Matrix factorization with imputations Best factor number is 30. Large factors lead to overfitting. PageRank [Haveliwala etc. 2002] Personalized with user preference vector ItemRank [Gori etc. 2007] Personalized PageRank on item-item correlation graph TagRW [Zhang etc. 2013] Integrate tags by converting to user-user and item-item graph.

19 Yelp Results Hit Ratio@K NDCG@K Hit Ratio (recall): TriRank > PageRank > ItemKNN > TagRW > PureSVD > ItemRank NDCG (ranking): TriRank > PageRank > ItemKNN > PureSVD > ItemRank > TagRW

20 Amazon Results Hit Ratio@K NDCG@K The discrepancy between HR and NDCG is more obvious: - TagRW is strong for HR, but weak for NDCG;

21 Yelp Hit Ratio Yelp VS Amazon Amazon Hit Ratio 1. ItemKNN is strong for Yelp, but weak for Amazon - Amazon dataset is more sparse (#reviews/item: 28 vs 4) 2. PageRank performs better than ItemRank (both are Personalized PageRank) - Converting user-item graph to item-item graph leads to signal loss.

22 Utility of Aspects Dataset Yelp Amazon Settings (@50) HR NDCG HR NDCG All Set 18.58 7.69 18.44 12.36 No item-aspect 17.05 6.91 16.23 11.31 No user-aspect 18.52 7.68 18.40 12.36 No aspects 17.00 6.90 15.97 11.16 No user-item 11.67 4.84 10.32 5.08 1. Item-aspect relation is more important than user-aspect relation. 2. Aspects filtering is complementary to collaborative filtering. 3. User-item relation is still fundamental to model and most important!

23 Aspect Filtering How does the noisy aspects impact the performance? Ranking aspects by their TF-IDF score in item-aspect matrix. Insensitivity to noisy aspects: - Filtering out low TF-IDF aspects (e.g. stop words or quirks) do not improve. High TF-IDF aspects carry more useful signal for recommendation. - Filtering out high TF-IDF aspects hurt performance significantly.

24 Case Study Training reviews of a sampled Yelp user. Rank list by TriRank: 3 rd : Red Lobster 6 th : Chick-Fil-A Although the test set doesn t contain Red Lobster, we found she actually reviewed it later. (outside of the Yelp dataset)

25 Conclusion Tripartite graph ranking solution for review-aware recommendation: Explainable and transparent Robust to noisy aspects Online learning and instant personalization without retraining. Future work: Combine with factorization model (more effective to sparse data) Personalized (regularization) parameter settings More contexts to model: temporal, taxonomy and sentiment. Thank you! Thank SIGIR Student Travel Grant!

26 Reference X. He, M. Gao, M.-Y. Kan, Y. Liu, and K. Sugiyama. Predicting the popularity of web 2.0 items based on user comments. In Proc. SIGIR 14, pages 233 242, 2014. J. McAuley and J. Leskovec. Hidden factors and hidden topics: Understanding rating dimensions with review text. In Proc. of RecSys 13, pages 165 172, 2013. G. Ling, M. R. Lyu, and I. King. Ratings meet reviews, a combined approach to recommend. In Proc. of RecSys 14, pages 105 112, 2014. Y. Xu, W. Lam, and T. Lin. Collaborative filtering incorporating review text and co-clusters of hidden user communities and item groups. In Proc. of CIKM 14, pages 251 260, 2014. Q. Diao, M. Qiu, C.-Y. Wu, A. J. Smola, J. Jiang, and C. Wang. Jointly modeling aspects, ratings and sentiments for movie recommendation (jmars). In Proc. of KDD 14, pages 193 202, 2014. Y. Zhang, M. Zhang, Y. Zhang, Y. Liu, and S. Ma. Explicit factor models for explainable recommendation based on phrase-level sentiment analysis. In Proc. of SIGIR 14, pages 83 92, 2014. C.-C. Musat, Y. Liang, and B. Faltings. Recommendation using textual opinions. In Proc. of IJCAI 13, pages 2684 2690, 2013.