A Brief Review of Representation Learning in Recommender Systems. 赵鑫 (Wayne Xin Zhao), RUC

Similar documents
Mining Human Trajectory Data: A Study on Check-in Sequences. Xin Zhao Renmin University of China,

A Novel deep learning models for Cold Start Product Recommendation using Micro blogging Information

Collaborative Metric Learning

Sequential Recommender System based on Hierarchical Attention Network

E-Commerce for Cold Start Product Suggestion Using Micro Blogging Data Through Connecting Social Media

Content-Aware Hierarchical Point-of-Interest Embedding Model for Successive POI Recommendation

GraphGAN: Graph Representation Learning with Generative Adversarial Nets

Network embedding. Cheng Zheng

POI2Vec: Geographical Latent Representation for Predicting Future Visitors

TriRank: Review-aware Explainable Recommendation by Modeling Aspects

Bridging Semantic Gaps between Natural Languages and APIs with Word Embedding

CS249: ADVANCED DATA MINING

Survey on Recommendation of Personalized Travel Sequence

CSE 258. Web Mining and Recommender Systems. Advanced Recommender Systems

Outline. Morning program Preliminaries Semantic matching Learning to rank Entities

Ph.D. in Computer Science & Technology, Tsinghua University, Beijing, China, 2007

FastText. Jon Koss, Abhishek Jindal

Decomposing Fit Semantics for Product Size Recommendation in Metric Spaces

A Study of MatchPyramid Models on Ad hoc Retrieval

PTE : Predictive Text Embedding through Large-scale Heterogeneous Text Networks

STREAMING RANKING BASED RECOMMENDER SYSTEMS

Knowledge Graph Embedding with Numeric Attributes of Entities

3 : Representation of Undirected GMs

Learning Graph-based POI Embedding for Location-based Recommendation

A Few Things to Know about Machine Learning for Web Search

Understanding and Recommending Podcast Content

First Place Solution for NLPCC 2018 Shared Task User Profiling and Recommendation

E-Commerce - Bookstore with Recommendation System using Prediction

Pseudo-Implicit Feedback for Alleviating Data Sparsity in Top-K Recommendation

International Journal of Scientific Research and Modern Education (IJSRME) Impact Factor: 6.225, ISSN (Online): (

Combining Review Text Content and Reviewer-Item Rating Matrix to Predict Review Rating

Non-negative Matrix Factorization for Multimodal Image Retrieval

Non-negative Matrix Factorization for Multimodal Image Retrieval

Location-Aware Web Service Recommendation Using Personalized Collaborative Filtering

Translation-based Recommendation: A Scalable Method for Modeling Sequential Behavior

Recommendation System for Location-based Social Network CS224W Project Report

Music Recommendation with Implicit Feedback and Side Information

A United Approach to Learning Sparse Attributed Network Embedding

Mining Web Data. Lijun Zhang

A Deep Relevance Matching Model for Ad-hoc Retrieval

Classifying a specific image region using convolutional nets with an ROI mask as input

LINE: Large-scale Information Network Embedding

arxiv: v1 [cs.ir] 27 Jun 2016

Intelligent Search Engine and Recommender Systems based on Knowledge Graph

arxiv: v1 [cs.lg] 12 Mar 2015

A Bayesian Approach to Hybrid Image Retrieval

A Privacy-Preserving QoS Prediction Framework for Web Service Recommendation

Online Cross-Modal Hashing for Web Image Retrieval

Non Overlapping Communities

Supervised Models for Multimodal Image Retrieval based on Visual, Semantic and Geographic Information

CPSC 340: Machine Learning and Data Mining. Multi-Dimensional Scaling Fall 2017

arxiv: v1 [cs.mm] 12 Jan 2016

COLD-START PRODUCT RECOMMENDATION THROUGH SOCIAL NETWORKING SITES USING WEB SERVICE INFORMATION

Mining Web Data. Lijun Zhang

Where Next? Data Mining Techniques and Challenges for Trajectory Prediction. Slides credit: Layla Pournajaf

Semi-supervised Data Representation via Affinity Graph Learning

L1-graph based community detection in online social networks

AUTOMATIC VISUAL CONCEPT DETECTION IN VIDEOS

This Talk. Map nodes to low-dimensional embeddings. 2) Graph neural networks. Deep learning architectures for graph-structured

Collaborative Filtering using a Spreading Activation Approach

An efficient face recognition algorithm based on multi-kernel regularization learning

Trajectory analysis. Ivan Kukanov

Leveraging Click Completion for Graph-based Image Ranking

QUINT: On Query-Specific Optimal Networks

An improved PageRank algorithm for Social Network User's Influence research Peng Wang, Xue Bo*, Huamin Yang, Shuangzi Sun, Songjiang Li

arxiv: v2 [cs.si] 10 Jul 2018

Cold-Start Web Service Recommendation Using Implicit Feedback

DRN: A Deep Reinforcement Learning Framework for News Recommendation

Word2vec and beyond. presented by Eleni Triantafillou. March 1, 2016

Query Suggestions. Debapriyo Majumdar Information Retrieval Spring 2015 Indian Statistical Institute Kolkata

A New Evaluation Method of Node Importance in Directed Weighted Complex Networks

Hidden Markov Models. Slides adapted from Joyce Ho, David Sontag, Geoffrey Hinton, Eric Xing, and Nicholas Ruozzi

Liangjie Hong*, Dawei Yin*, Jian Guo, Brian D. Davison*

Effective Latent Space Graph-based Re-ranking Model with Global Consistency

Personalized Ranking Metric Embedding for Next New POI Recommendation

CSE 158 Lecture 6. Web Mining and Recommender Systems. Community Detection

Relation Structure-Aware Heterogeneous Information Network Embedding

Link Prediction for Social Network

Learning Compact and Effective Distance Metrics with Diversity Regularization. Pengtao Xie. Carnegie Mellon University

Learning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li

Point-of-Interest Recommendation in Location- Based Social Networks with Personalized Geo-Social Influence

On Exploiting Transient Contact Patterns for Data Forwarding in Delay Tolerant Networks

Entity and Knowledge Base-oriented Information Retrieval

Canonical Image Selection for Large-scale Flickr Photos using Hadoop

An Efficient Methodology for Image Rich Information Retrieval

Ruslan Salakhutdinov and Geoffrey Hinton. University of Toronto, Machine Learning Group IRGM Workshop July 2007

Rongrong Ji (Columbia), Yu Gang Jiang (Fudan), June, 2012

Deep Attributed Network Embedding

Community-Based Recommendations: a Solution to the Cold Start Problem

A Review of Person Re-identification with Deep Learning. Xi Li, College of Computer Science, Zhejiang University

Probabilistic Graphical Models Part III: Example Applications

Multimodal Information Spaces for Content-based Image Retrieval

Improving Implicit Recommender Systems with View Data

Multimodal Medical Image Retrieval based on Latent Topic Modeling

Real-time Collaborative Filtering Recommender Systems

Learning a Hierarchical Embedding Model for Personalized Product Search

Network Representation Learning with Rich Text Information

Analysis of Website for Improvement of Quality and User Experience

Transcription:

A Brief Review of Representation Learning in Recommender Systems. 赵鑫 (Wayne Xin Zhao), RUC. batmanfly@qq.com

Representation learning

Overview of recommender systems. Tasks: rating prediction, item recommendation. Basic models: MF, LibFM.

Rating Prediction. User-item matrix (ratings; '?' = unknown):
      i1   i2
u1     1    ?
u2     ?    5
u3     3    ?
u4     ?    2
Online test / offline test.

Item Recommendation. User-item matrix (implicit feedback; '?' = unknown):
      i1    i2
u1    yes    ?
u2     ?    yes
u3    no     ?
u4    yes   yes
Online test / offline test. Retrieval-based metrics, e.g., P@k, R@k, MAP.
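
As a concrete reference for the retrieval-based metrics named above, here is a minimal sketch (my own illustration, not from the slides) of how P@k and R@k can be computed for a single user's ranked list; the function name and toy item IDs are hypothetical.

```python
def precision_recall_at_k(ranked_items, relevant_items, k):
    """Compute P@k and R@k for one user's ranked recommendation list.

    ranked_items:   items ordered by predicted score (best first)
    relevant_items: set of items the user actually interacted with (held-out)
    """
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in relevant_items)
    precision = hits / k
    recall = hits / len(relevant_items) if relevant_items else 0.0
    return precision, recall

# Toy example: ranked list for one user, with {i2, i5} held out as relevant.
print(precision_recall_at_k(["i2", "i7", "i5", "i1"], {"i2", "i5"}, k=3))
# -> (0.666..., 1.0)
```

MAP extends this by averaging, for each user, the precision at the ranks of that user's relevant items, and then averaging over users.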

Context-Aware Recommendation: when you know more information about users and items.

More complicated tasks

Practical Considerations

Rating Prediction. User-item matrix (ratings; '?' = unknown):
      i1   i2
u1     1    ?
u2     ?    5
u3     3    ?
u4     ?    2
Online test / offline test.

Latent Factor Models

Matrix factorization

A Basic Model

A Basic Model: another formulation.

Probabilistic Matrix Factorization

Probabilistic Matrix Factorization
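
A minimal sketch of the basic latent factor model these slides build on: predict $r_{ui} \approx p_u^{\top} q_i$ and fit the factors by SGD on squared error with L2 regularization (the MAP view of PMF with Gaussian priors reduces to exactly this kind of regularized objective). Function name, hyperparameters, and toy data are illustrative only, not the slides' exact formulation.

```python
import numpy as np

def mf_sgd(ratings, n_users, n_items, k=16, lr=0.01, reg=0.05, epochs=100, seed=0):
    """Basic matrix factorization: r_ui ~ p_u . q_i, trained by SGD."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, k))   # user factors
    Q = 0.1 * rng.standard_normal((n_items, k))   # item factors
    for _ in range(epochs):
        for u, i, r in ratings:                   # observed (user, item, rating) triples
            err = r - P[u] @ Q[i]
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * P[u] - reg * Q[i])
    return P, Q

# Toy data matching the 4x2 user-item matrix above (0-indexed users/items).
ratings = [(0, 0, 1.0), (1, 1, 5.0), (2, 0, 3.0), (3, 1, 2.0)]
P, Q = mf_sgd(ratings, n_users=4, n_items=2)
print(P @ Q.T)   # reconstructed rating matrix; the '?' entries are now predictions
```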

Context-Aware Recommendation: when you know more information about users and items.

LibFM
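
For context, LibFM implements factorization machines; below is a sketch of the second-order FM prediction, $\hat{y}(x) = w_0 + \sum_i w_i x_i + \sum_{i<j} \langle v_i, v_j \rangle x_i x_j$, computed with the usual O(kn) reformulation. This is a generic illustration of the model equation, not LibFM's code, and the toy inputs are made up.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order factorization machine prediction.

    x:  feature vector (n,)   e.g. one-hot user + one-hot item + context features
    w0: global bias (scalar)
    w:  linear weights (n,)
    V:  factor matrix (n, k), one k-dimensional factor per feature
    """
    linear = w0 + w @ x
    # Pairwise interactions via the identity:
    # sum_{i<j} <v_i, v_j> x_i x_j = 0.5 * sum_f [ (sum_i V_if x_i)^2 - sum_i V_if^2 x_i^2 ]
    xv = x @ V                                   # shape (k,)
    pairwise = 0.5 * np.sum(xv ** 2 - (x ** 2) @ (V ** 2))
    return linear + pairwise

# Toy example: 3 features, 2 latent factors.
x = np.array([1.0, 0.0, 1.0])
w0, w = 0.1, np.array([0.2, 0.0, -0.1])
V = np.array([[0.3, 0.1], [0.0, 0.2], [0.4, -0.2]])
print(fm_predict(x, w0, w, V))
```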

Outline of the approaches: recommendation by network embedding; recommendation by word embedding; embedding as regularization; recommendation by TransE; recommendation by metric learning; recommendation by multi-modality fusion.

Outline of the approaches: recommendation by network embedding; recommendation by word embedding; embedding as regularization; recommendation by TransE; recommendation by metric learning; recommendation by multi-modality fusion.

What is network embedding? We map each node in a network into a low-dimensional space, i.e., a distributed representation for nodes in which similarity between nodes indicates the link strength. This encodes the network information and generates node representations.

Example: Zachary's Karate Network.

Framework

LINE: First-order Proximity. [Example graph over vertices 1-10.] Vertices 6 and 7 have a large first-order proximity. The first-order proximity is the local pairwise proximity between two vertices, determined by the observed links. However, many links between vertices are missing, so it is not sufficient for preserving the entire network structure. (From Jian Tang's slides)

LINE: Second-order Proximity. [Same example graph.] Vertices 5 and 6 have a large second-order proximity: $p_5 = (1,1,1,1,0,0,0,0,0,0)$, $p_6 = (1,1,1,1,0,0,5,0,0,0)$. The second-order proximity is the proximity between the neighborhood structures of two vertices. Mathematically, the second-order proximity between a pair of vertices $(u, v)$ is determined by $p_u = (w_{u1}, w_{u2}, \dots, w_{u|V|})$ and $p_v = (w_{v1}, w_{v2}, \dots, w_{v|V|})$. (From Jian Tang's slides)

LINE: Preserving the First-order Proximity. Given an undirected edge $(v_i, v_j)$, the joint probability of $(v_i, v_j)$ is
$p_1(v_i, v_j) = \frac{1}{1 + \exp(-\vec{u}_i^{\top} \vec{u}_j)}$,
where $\vec{u}_i$ is the embedding of vertex $v_i$. The corresponding empirical distribution is
$\hat{p}_1(v_i, v_j) = \frac{w_{ij}}{\sum_{(i',j') \in E} w_{i'j'}}$.
Objective (with $d(\cdot,\cdot)$ the KL-divergence):
$O_1 = d(\hat{p}_1(\cdot,\cdot), p_1(\cdot,\cdot)) \propto -\sum_{(i,j) \in E} w_{ij} \log p_1(v_i, v_j)$.
(From Jian Tang's slides)

LINE: Preserving the Second-order Proximity. Given a directed edge $(v_i, v_j)$, the conditional probability of $v_j$ given $v_i$ is
$p_2(v_j \mid v_i) = \frac{\exp(\vec{u}_j'^{\top} \vec{u}_i)}{\sum_{k=1}^{|V|} \exp(\vec{u}_k'^{\top} \vec{u}_i)}$,
where $\vec{u}_i$ is the embedding of vertex $i$ when $i$ is a source node and $\vec{u}_i'$ is its embedding when $i$ is a target node. The corresponding empirical distribution is
$\hat{p}_2(v_j \mid v_i) = \frac{w_{ij}}{\sum_{k} w_{ik}}$.
Objective:
$O_2 = \sum_{i \in V} \lambda_i \, d(\hat{p}_2(\cdot \mid v_i), p_2(\cdot \mid v_i)) \propto -\sum_{(i,j) \in E} w_{ij} \log p_2(v_j \mid v_i)$,
where $\lambda_i = \sum_j w_{ij}$ is the prestige of vertex $i$ in the network. (From Jian Tang's slides)
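
To make these two objectives concrete, here is a small numpy sketch (my own illustration, not from Tang's slides) that evaluates $O_1$ and $O_2$ in their $-\sum_{(i,j) \in E} w_{ij} \log p(\cdot)$ forms for a toy weighted graph, given candidate embeddings. In practice LINE optimizes $O_2$ with negative sampling rather than the full softmax used here.

```python
import numpy as np

def line_objectives(edges, U, Uc):
    """Evaluate the LINE objectives for given embeddings.

    edges: list of (i, j, w_ij) -- treated as undirected for O1, directed for O2
    U:     vertex embeddings, shape (|V|, d)
    Uc:    context ("target-role") embeddings, shape (|V|, d), used only by O2
    """
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    # First order: O1 = - sum_{(i,j) in E} w_ij * log sigmoid(u_i . u_j)
    o1 = -sum(w * np.log(sigmoid(U[i] @ U[j])) for i, j, w in edges)
    # Second order: O2 = - sum_{(i,j) in E} w_ij * log softmax_j(u'_j . u_i)
    scores = U @ Uc.T                                        # source-target scores
    log_p2 = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    o2 = -sum(w * log_p2[i, j] for i, j, w in edges)
    return o1, o2

# Toy graph: a heavy edge between vertices 0 and 1, lighter edges elsewhere.
edges = [(0, 1, 5.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0)]
rng = np.random.default_rng(0)
U, Uc = rng.standard_normal((4, 8)), rng.standard_normal((4, 8))
print(line_objectives(edges, U, Uc))
```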

LINE: Preserving Both Proximities. Concatenate the embeddings learned individually under the two proximities (first-order and second-order). (From Jian Tang's slides)

Recommendation by network embedding: Learning Distributed Representations for Recommender Systems with a Network Embedding Approach (Zhao et al., AIRS 2016). Motivation.

Recommendation by network embedding Given any edge in the network

Recommendation by network embedding User-item recommendation

Recommendation by network embedding User-item-tag recommendation

Outline of the approaches: recommendation by network embedding; recommendation by word embedding; embedding as regularization; recommendation by TransE; recommendation by metric learning; recommendation by multi-modality fusion.

Recommendation by word embedding. Recall word2vec. Input: a sequence of words from a vocabulary V. Output: a fixed-length vector $v_w$ for each term $w$ in the vocabulary. It implements the idea of distributional semantics using a shallow neural network model.

Recommendation by word embedding. Generalized token2vec. Input: a sequence of symbol tokens from a vocabulary V. Output: a fixed-length vector $v_w$ for each symbol $w$ in the vocabulary. Any sequence whose tokens are sensitive to their surrounding context can potentially be modeled with word2vec.

Recommendation by word embedding. POI data modeling. Check-in information: user ID, location ID, check-in time, category label/name, GPS information; plus user connections.

A sequential way to model POI data. Given a user u, a trajectory is a sequence of check-in records related to u.
User ID   Location ID   Check-in Timestamp
u1        l181          2016-08-26 9:26am
u1        l32           2016-08-26 10:26am
u1        l323          2016-08-25 11:26am
u1        l32323        2016-08-25 1:26pm
u2        l345          2016-08-26 9:16am
u2        l13           2016-08-26 10:36am

A sequential way to model POI data. Given a user u, a trajectory is a sequence of check-in records related to u.
User ID   Location ID   Check-in Timestamp
u1        l181          2016-08-26 9:26am
u1        l32           2016-08-26 10:26am
u1        l323          2016-08-25 11:26am
u1        l32323        2016-08-25 1:26pm
u2        l345          2016-08-26 9:16am
u2        l13           2016-08-26 10:36am
Resulting sequences: u1: l181 l32 l323 l32323; u2: l345 l13
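
A small sketch of the preprocessing step implied here: grouping raw check-in records by user into location-ID sequences. The field layout and function name are my own illustration.

```python
from collections import defaultdict

# (user_id, location_id, timestamp) records, as in the table above.
checkins = [
    ("u1", "l181",   "2016-08-26 9:26am"),
    ("u1", "l32",    "2016-08-26 10:26am"),
    ("u1", "l323",   "2016-08-25 11:26am"),
    ("u1", "l32323", "2016-08-25 1:26pm"),
    ("u2", "l345",   "2016-08-26 9:16am"),
    ("u2", "l13",    "2016-08-26 10:36am"),
]

def build_trajectories(records):
    """Group check-ins by user into location-ID sequences.

    Records are assumed to already be in trajectory order, as in the slide;
    on real logs you would first sort each user's records by timestamp.
    """
    sequences = defaultdict(list)
    for user, loc, _ts in records:
        sequences[user].append(loc)
    return dict(sequences)

print(build_trajectories(checkins))
# {'u1': ['l181', 'l32', 'l323', 'l32323'], 'u2': ['l345', 'l13']}
```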

Task. Input: check-in sequences together with user relations. Output: embedding representations for users, locations, and other related information. (Zhao et al., ACM TKDD 2017)

Recall CBOW. CBOW predicts the current word from its surrounding contexts: $\Pr(w_t \mid \mathrm{context}(w_t))$, with window size $2c$ and $\mathrm{context}(w_t) = [w_{t-c}, \dots, w_{t+c}]$.
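
Treating each trajectory as a "sentence" of location tokens, the token2vec idea can be prototyped with an off-the-shelf word2vec implementation. A sketch assuming gensim 4.x (whose Word2Vec constructor takes vector_size/window/min_count/sg/epochs); this is only a CBOW baseline over location sequences, not the joint user/location model of Zhao et al.

```python
from gensim.models import Word2Vec

# One "sentence" of location tokens per user trajectory (from the previous sketch).
sequences = [
    ["l181", "l32", "l323", "l32323"],
    ["l345", "l13"],
]

# CBOW (sg=0) with a window of 2 context tokens on each side, mirroring the
# Pr(w_t | context(w_t)) formulation above. min_count=1 keeps rare locations.
model = Word2Vec(sequences, vector_size=32, window=2, min_count=1, sg=0, epochs=50)

print(model.wv["l32"][:5])            # learned embedding for location l32
print(model.wv.most_similar("l32"))   # nearest locations in the embedding space
```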

Model sequential relatedness A direct application of doc2vec

Modeling social connectedness A skip-gram way to model all the friends

A joint model to characterize trajectories and links Jointly optimizing the two loss functions

Modeling multi-grained sequential contexts. A long trajectory sequence can be split into multiple segments.
User ID   Location ID   Check-in Timestamp
u1        l181          2016-08-26 9:26am
u1        l32           2016-08-26 10:26am
u1        l323          2016-08-25 11:26am
u1        l32323        2016-08-25 1:26pm
u1: s1 s2; s1: l181 l32; s2: l323 l32323

Modeling multi-grained sequential contexts. Modeling segment-level relatedness. u1: s1 s2; s1: l181 l32; s2: l323 l32323

Modeling multi-grained sequential contexts. Modeling location-level relatedness. u1: s1 s2; s1: l181 l32; s2: l323 l32323
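
A sketch of the segment-splitting step: here a user's trajectory is cut whenever consecutive check-ins are far apart in time. The 6-hour gap threshold is my own assumption for illustration; the actual paper's splitting criterion may differ.

```python
from datetime import datetime, timedelta

def split_into_segments(checkins, gap=timedelta(hours=6)):
    """Split one user's (timestamp, location) trajectory into segments
    whenever the time between consecutive check-ins exceeds `gap`."""
    checkins = sorted(checkins)                  # chronological order
    segments, current = [], [checkins[0]]
    for prev, cur in zip(checkins, checkins[1:]):
        if cur[0] - prev[0] > gap:
            segments.append(current)
            current = []
        current.append(cur)
    segments.append(current)
    return [[loc for _, loc in seg] for seg in segments]

traj = [
    (datetime(2016, 8, 25, 11, 26), "l323"),
    (datetime(2016, 8, 25, 13, 26), "l32323"),
    (datetime(2016, 8, 26, 9, 26),  "l181"),
    (datetime(2016, 8, 26, 10, 26), "l32"),
]
print(split_into_segments(traj))
# [['l323', 'l32323'], ['l181', 'l32']]
```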

The joint hierarchical model Jointly optimizing three objective functions

Recommendation by word embedding. Token2vec for product recommendation via doc2vec (Zhao et al., IEEE TKDE 2016): doc → user, word → product; a user-profiling approach.

Recommendation by word embedding Token2vec for next-basket recommendation (Wang et al., SIGIR 2015)

Outline of the approaches: recommendation by network embedding; recommendation by word embedding; embedding as regularization; recommendation by TransE; recommendation by metric learning; recommendation by multi-modality fusion.

Matrix factorization. Motivation: it mainly captures user-item interactions, while item co-occurrence across users is ignored. (Liang et al., RecSys 2016)

Item embedding. Motivation: Levy and Goldberg show an equivalence between skip-gram word2vec trained with k negative samples and implicitly factorizing the pointwise mutual information (PMI) matrix shifted by log k. We can therefore factorize the item co-occurrence matrix to obtain item embeddings.
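
A sketch of building the shifted positive PMI (SPPMI) item co-occurrence matrix that is then factorized. Counting co-occurrence over each user's full item set and the shift value k=1 are illustrative choices on my part.

```python
import numpy as np
from collections import Counter
from itertools import combinations

def sppmi_matrix(user_item_lists, n_items, k=1):
    """Shifted positive PMI over item co-occurrences.

    Two items co-occur when the same user consumed both;
    SPPMI(i, j) = max(PMI(i, j) - log k, 0).
    """
    pair_counts, item_counts = Counter(), Counter()
    for items in user_item_lists:
        items = sorted(set(items))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    total = sum(pair_counts.values())
    M = np.zeros((n_items, n_items))
    for (i, j), c in pair_counts.items():
        pmi = np.log(c * total / (item_counts[i] * item_counts[j]))
        M[i, j] = M[j, i] = max(pmi - np.log(k), 0.0)
    return M

# Toy data: each inner list is the set of items (IDs 0..3) one user interacted with.
print(sppmi_matrix([[0, 1, 2], [0, 1], [2, 3]], n_items=4, k=1))
```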

The joint model MF with embedding regularization

TransE: characterizing triple relations, i.e., head entity + relation ≈ tail entity.

Next recommendation scenario. What's the next movie to watch? (He et al., RecSys 2017)

Next recommendation scenario. What's the next movie to watch? A traditional method: Markov chains and factorized Markov chains.

Next recommendation scenario. What's the next movie to watch? A TransE-based approach.

Next recommendation scenario. What's the next movie to watch? A TransE-based approach.
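
To make the idea concrete, here is a sketch of translation-based next-item scoring in the spirit of this approach: the next item should lie close to (previous item + a user-specific translation vector), and candidates are ranked by negative distance. This is a simplified illustration; the actual model of He et al. adds item bias terms and constrains the embeddings, and the embeddings here are random placeholders.

```python
import numpy as np

def next_item_scores(prev_item, user, item_emb, user_trans):
    """Translation-based next-item scoring (simplified):
    score(j) = -|| gamma_prev + t_user - gamma_j ||_2, higher is better."""
    target = item_emb[prev_item] + user_trans[user]
    return -np.linalg.norm(item_emb - target, axis=1)

rng = np.random.default_rng(0)
n_items, n_users, d = 5, 3, 8
item_emb = rng.standard_normal((n_items, d))     # gamma_j for each item
user_trans = rng.standard_normal((n_users, d))   # translation vector t_u per user

scores = next_item_scores(prev_item=2, user=0, item_emb=item_emb, user_trans=user_trans)
print(np.argsort(-scores))    # candidate next items, best first
```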

Outline of the approaches: recommendation by network embedding; recommendation by word embedding; embedding as regularization; recommendation by TransE; recommendation by metric learning; recommendation by multi-modality fusion.

Metric learning for recommendation. Metric: a metric on a set $X$ is a function $d: X \times X \to [0, \infty)$ satisfying, for all $x, y, z \in X$: $d(x, y) = 0$ iff $x = y$; $d(x, y) = d(y, x)$ (symmetry); and $d(x, z) \le d(x, y) + d(y, z)$ (triangle inequality).

Metric learning for recommendation. Metric learning: the most classic metric learning approaches attempt to learn a Mahalanobis distance metric, $d_M(x, y) = \sqrt{(x - y)^{\top} M (x - y)}$ with $M \succeq 0$, and we can define an objective function over it.
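
For reference, a short sketch of the Mahalanobis distance just mentioned, parameterized as M = L^T L so that M stays positive semi-definite; the learning loop that fits L from similarity constraints is omitted, and L here is a random placeholder.

```python
import numpy as np

def mahalanobis(x, y, L):
    """d_M(x, y) = sqrt((x - y)^T M (x - y)) with M = L^T L (so M is PSD).
    Equivalently, the Euclidean distance after the linear map z -> L z."""
    diff = L @ (x - y)
    return np.sqrt(diff @ diff)

rng = np.random.default_rng(0)
L = rng.standard_normal((2, 4))        # (to-be-learned) linear transform
x, y = rng.standard_normal(4), rng.standard_normal(4)
print(mahalanobis(x, y, L))            # equals np.linalg.norm(L @ x - L @ y)
```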

Metric learning for recommendation. Metric learning for kNN: large-margin nearest neighbor (LMNN), with a pull loss and a push loss.

Metric learning for recommendation. Representation-based metric learning: distance function and loss function. (Hsieh et al., WWW 2017)
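
A sketch of the hinge-style pairwise loss used in this line of work: for a user u, a positively interacted item i should end up closer to u than a sampled negative item k by at least a margin m. This simplified version drops the rank-based weighting and the norm constraints used in the actual paper, and all embeddings are random placeholders.

```python
import numpy as np

def metric_hinge_loss(U, V, triples, margin=1.0):
    """sum over (u, i, k) of [ m + d(u, i)^2 - d(u, k)^2 ]_+,
    where d is the Euclidean distance between user and item embeddings."""
    loss = 0.0
    for u, pos, neg in triples:            # (user, positive item, sampled negative item)
        d_pos = np.sum((U[u] - V[pos]) ** 2)
        d_neg = np.sum((U[u] - V[neg]) ** 2)
        loss += max(margin + d_pos - d_neg, 0.0)
    return loss

rng = np.random.default_rng(0)
U = rng.standard_normal((3, 8))            # user embeddings
V = rng.standard_normal((5, 8))            # item embeddings
print(metric_hinge_loss(U, V, [(0, 1, 4), (2, 3, 0)]))
```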

Metric learning for recommendation. Representation-based metric learning: improving representations by integrating item features; regularization; the joint loss.

Outline of the approaches: recommendation by network embedding; recommendation by word embedding; embedding as regularization; recommendation by TransE; recommendation by metric learning; recommendation by multi-modality fusion.

Multi-modality representation Rich side information

Multi-modality representation Rich side information Zhang et al., KDD 2016

Multi-modality representation Rich side information Modeling KB information

Multi-modality representation Rich side information Modeling text information

Multi-modality representation Rich side information Modeling image information

Multi-modality representation Rich side information Generative process

Multi-modality representation Complementary effect of visual and textual features Chen et al., to appear in AIRS 2017

Multi-modality representation A Multi-task learning method Chen et al., to appear in AIRS 2017

Future work. Baselines: ItemKNN, MF (SVD++), BPR, FM. Why do SVD++, BPR, and FM perform so consistently well on various datasets? How can recommender systems borrow ideas from representation learning and deep learning? What is the future direction for recommender systems?

Thanks

References:
Wayne Xin Zhao, Sui Li, Yulan He, Edward Y. Chang, Ji-Rong Wen, Xiaoming Li. Connecting Social Media to E-Commerce: Cold-Start Product Recommendation Using Microblogging Information. IEEE Trans. Knowl. Data Eng. 28(5): 1147-1159 (2016).
Pengfei Wang, Jiafeng Guo, Yanyan Lan, Jun Xu, Shengxian Wan, Xueqi Cheng. Learning Hierarchical Representation Model for Next Basket Recommendation. SIGIR 2015: 403-412.
Xu Chen, Yongfeng Zhang, Wayne Xin Zhao, Zheng Qin. A Collaborative Neural Model for Rating Prediction by Leveraging User Reviews and Product Images. To appear in AIRS 2017.
Wayne Xin Zhao, Feifan Fan, Ji-Rong Wen, Edward Y. Chang. Joint Representation Learning for Location-based Social Networks with Multi-Grained Sequential Contexts. To appear in ACM TKDD.
D. Liang, J. Altosaar, L. Charlin, D. Blei. Factorization Meets the Item Embedding: Regularizing Matrix Factorization with Item Co-occurrence. RecSys 2016.
Ruining He, Wang-Cheng Kang, Julian McAuley. Translation-based Recommendation. RecSys 2017: 161-169.
Cheng-Kang Hsieh, Longqi Yang, Yin Cui, Tsung-Yi Lin, Serge J. Belongie, Deborah Estrin. Collaborative Metric Learning. WWW 2017: 193-201.
Fuzheng Zhang, Nicholas Jing Yuan, Defu Lian, Xing Xie, Wei-Ying Ma. Collaborative Knowledge Base Embedding for Recommender Systems. KDD 2016: 353-362.