Real-time Collaborative Filtering Recommender Systems

Similar documents
Real-time Collaborative Filtering Recommender Systems

Adaptive Temporal Entity Resolution on Dynamic Databases

Collaborative Filtering based on User Trends

A Hybrid Peer-to-Peer Recommendation System Architecture Based on Locality-Sensitive Hashing

A Two Stage Similarity aware Indexing for Large Scale Real time Entity Resolution

Recommender Systems (RSs)

Developing Focused Crawlers for Genre Specific Search Engines

STREAMING RANKING BASED RECOMMENDER SYSTEMS

Project Report. An Introduction to Collaborative Filtering

Mining Web Data. Lijun Zhang

TriRank: Review-aware Explainable Recommendation by Modeling Aspects

Collaborative Filtering using Euclidean Distance in Recommendation Engine

Locality-Sensitive Hashing

Mining Web Data. Lijun Zhang

Recommender Systems New Approaches with Netflix Dataset

Automatic Record Linkage using Seeded Nearest Neighbour and SVM Classification

Recommender Systems. Techniques of AI

Towards a hybrid approach to Netflix Challenge

Midterm Examination CS540-2: Introduction to Artificial Intelligence

Finding Similar Items:Nearest Neighbor Search

Predictive Indexing for Fast Search

Machine Learning. Nonparametric methods for Classification. Eric Xing , Fall Lecture 2, September 12, 2016

CS435 Introduction to Big Data Spring 2018 Colorado State University. 3/21/2018 Week 10-B Sangmi Lee Pallickara. FAQs. Collaborative filtering

A Scalable, Accurate Hybrid Recommender System

ECS289: Scalable Machine Learning

Automatic training example selection for scalable unsupervised record linkage

Algorithms for Nearest Neighbors

Content-based Dimensionality Reduction for Recommender Systems

A PERSONALIZED RECOMMENDER SYSTEM FOR TELECOM PRODUCTS AND SERVICES

Metric Learning Applied for Automatic Large Image Classification

Collaborative Filtering and Recommender Systems. Definitions. .. Spring 2009 CSC 466: Knowledge Discovery from Data Alexander Dekhtyar..

Improving Results and Performance of Collaborative Filtering-based Recommender Systems using Cuckoo Optimization Algorithm

Full Text Search Engine as Scalable k-nearest Neighbor Recommendation System

Building a Scalable Recommender System with Apache Spark, Apache Kafka and Elasticsearch

Comparative performance of opensource recommender systems

BIG DATA ANALYSIS A CASE STUDY. Du, Haoran Supervisor: Dr. Wang, Qing. COMP 8790: Software Engineering Project AUSTRALIAN NATIONAL UNIVERSITY

Outsourcing Privacy-Preserving Social Networks to a Cloud

TREC 2016 Dynamic Domain Track: Exploiting Passage Representation for Retrieval and Relevance Feedback

Available online at ScienceDirect. Procedia Technology 17 (2014 )

B490 Mining the Big Data. 5. Models for Big Data

Predictive Analysis: Evaluation and Experimentation. Heejun Kim

COMP6237 Data Mining Making Recommendations. Jonathon Hare

Entity Resolution over Graphs

Part 11: Collaborative Filtering. Francesco Ricci

Top-K Entity Resolution with Adaptive Locality-Sensitive Hashing

A mixed hybrid recommender system for given names

Introduction. Chapter Background Recommender systems Collaborative based filtering

Collective Entity Resolution in Relational Data

Autonomic Workload Execution Control Using Throttling

Matrix Co-factorization for Recommendation with Rich Side Information HetRec 2011 and Implicit 1 / Feedb 23

VK Multimedia Information Systems

Feature Extractors. CS 188: Artificial Intelligence Fall Some (Vague) Biology. The Binary Perceptron. Binary Decision Rule.

Recommender Systems 6CCS3WSN-7CCSMWAL

ECS289: Scalable Machine Learning

Personal Web API Recommendation Using Network-based Inference

Recommender Systems. Collaborative Filtering & Content-Based Recommending

Image Analysis & Retrieval. CS/EE 5590 Special Topics (Class Ids: 44873, 44874) Fall 2016, M/W Lec 18.

Sampling Large Graphs: Algorithms and Applications

5/13/2009. Introduction. Introduction. Introduction. Introduction. Introduction

Variational Bayesian PCA versus k-nn on a Very Sparse Reddit Voting Dataset

Part 11: Collaborative Filtering. Francesco Ricci

Data Mining Lecture 2: Recommender Systems

Recommendation Algorithms: Collaborative Filtering. CSE 6111 Presentation Advanced Algorithms Fall Presented by: Farzana Yasmeen

Singular Value Decomposition, and Application to Recommender Systems

FACE DETECTION AND LOCALIZATION USING DATASET OF TINY IMAGES

A P2P REcommender system based on Gossip Overlays (PREGO)

Recommender Systems - Introduction. Data Mining Lecture 2: Recommender Systems

GraphCEP Real-Time Data Analytics Using Parallel Complex Event and Graph Processing

Instances on a Budget

Understanding and Recommending Podcast Content

Know your neighbours: Machine Learning on Graphs

VECTOR SPACE CLASSIFICATION

Chapter 5: Outlier Detection

Dynamic Embeddings for User Profiling in Twitter

Geometric data structures:

A Brief Review of Representation Learning in Recommender 赵鑫 RUC

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Recommender System. What is it? How to build it? Challenges. R package: recommenderlab

Sampling Large Graphs: Algorithms and Applications

International Journal of Scientific Research & Engineering Trends Volume 4, Issue 6, Nov-Dec-2018, ISSN (Online): X

Collaborative recommender systems: Combining effectiveness and efficiency

CS224W: Social and Information Network Analysis Project Report: Edge Detection in Review Networks

An Efficient Recommender System Using Locality Sensitive Hashing

Improving the Accuracy of Top-N Recommendation using a Preference Model

Homework Assignment #3

A Collaborative Method to Reduce the Running Time and Accelerate the k-nearest Neighbors Search

Introduction to Data Mining

Image Retrieval with a Visual Thesaurus

Going nonparametric: Nearest neighbor methods for regression and classification

Link Analysis in Weibo

Approximate Nearest Neighbor Search. Deng Cai Zhejiang University

Enhanced Perceptual Distance Functions and Indexing. for Near-Replica Image Recognition

Link Prediction for Social Network

learning stage (Stage 1), CNNH learns approximate hash codes for training images by optimizing the following loss function:

Movie Recommender System - Hybrid Filtering Approach

SUGGEST. Top-N Recommendation Engine. Version 1.0. George Karypis

Making Recommendations by Integrating Information from Multiple Social Networks

Recommendation of APIs for Mashup creation

Learning independent, diverse binary hash functions: pruning and locality

1 Introduction. 3 Data Preprocessing. 2 Literature Review

Transcription:

Real-time Collaborative Filtering Recommender Systems Huizhi Liang, Haoran Du, Qing Wang Presenter: Qing Wang Research School of Computer Science The Australian National University Australia Partially funded by the Australian Research Council (ARC), Veda Advantage, and Funnelback Pty. Ltd., under Linkage Project. 1

Introduction Recommender Systems Applications Predict topics that would trend on Twitter Predict fluctuations in the prices of Bitcoin... 2

Introduction Recommender Systems Applications Predict topics that would trend on Twitter Predict fluctuations in the prices of Bitcoin... Common techniques Collaborative filtering i.e., use the ratings of users and items Content-based filtering: i.e., use the features of users and items Hybrid techniques i.e., combine the above two to overcome their limitations 3

Collaborative Filtering Coined by Goldberg et al. in Tapestry (1992): people collaborate to help one another perform filtering by... 4

Collaborative Filtering Coined by Goldberg et al. in Tapestry (1992): people collaborate to help one another perform filtering by... Assumption If two users act on n items similarly (e.g., watching and buying), they will act on other items similarly. 5

Collaborative Filtering Coined by Goldberg et al. in Tapestry (1992): people collaborate to help one another perform filtering by... Assumption If two users act on n items similarly (e.g., watching and buying), they will act on other items similarly. Two main phases (1) Offline model-building (2) On-demand recommendation 6

Collaborative Filtering Coined by Goldberg et al. in Tapestry (1992): people collaborate to help one another perform filtering by... Assumption If two users act on n items similarly (e.g., watching and buying), they will act on other items similarly. Two main phases (1) Offline model-building (2) On-demand recommendation Challenges Deal with highly sparse data Scale with the increasing numbers of users and items Make recommendations in real time 7

Real-Time Collaborative Filtering Top N item recommendation Given a target user u, to recommend a list of items c 1,..., c m such that A(u, c 1 )... A(u, c m ) where A(u, c i ) (i = 1,..., m) are the highest prediction scores of how much u would be interested in c i. 8

Real-Time Collaborative Filtering Top N item recommendation Given a target user u, to recommend a list of items c 1,..., c m such that A(u, c 1 )... A(u, c m ) where A(u, c i ) (i = 1,..., m) are the highest prediction scores of how much u would be interested in c i. Some questions How to conduct pair-wise comparisons efficiently? e.g., user-user/item-item How to capture new updates quickly? e.g. latest updates in social media 9

Overview of the Proposed Approach Key components LSH blocking Neighbourhood formation Recommendation generation 10

Overview of the Proposed Approach Key components LSH blocking Neighbourhood formation Recommendation generation User Blocks User Profile A target user LSH Blocking Block 1... Block n Item Blocks... Block 1 Block m Recommendation Generation Neighborhood Formation 11

LSH Blocking Construct blocks based on Cosine similarities User blocks Item blocks 12

LSH Blocking Construct blocks based on Cosine similarities User blocks Item blocks Use two LSH families to approximate Cosine similarities (1) Random hyperplane projection (2) Random bit sampling 13

LSH Blocking Random Hyperplane Projection. = Input vector Random vectors (d=4) Binary signature (k=2,l=2) Block signature 14

LSH Blocking Random Hyperplane Projection. = Input vector Random vectors (d=4) Binary signature (k=2,l=2) Block signature A n-dimensional input vector is mapped to a d-bit binary signature using random vectors, usually d n. 15

LSH Blocking Random Hyperplane Projection. = Input vector Random vectors (d=4) Binary signature (k=2,l=2) Block signature A n-dimensional input vector is mapped to a d-bit binary signature using random vectors, usually d n. The more random vectors we use, the more accurate the Cosine similarity between two input vectors is. 16

LSH Blocking Random Bit Sampling. = Input vector Random vectors (d=4) Binary signature (k=2,l=2) Block signature 17

LSH Blocking Random Bit Sampling. = Input vector Random vectors (d=4) Binary signature (k=2,l=2) Block signature Use the Hamming distance to measure the similarity of two binary signatures 18

LSH Blocking Random Bit Sampling. = Input vector Random vectors (d=4) Binary signature (k=2,l=2) Block signature Use the Hamming distance to measure the similarity of two binary signatures Use random bit sampling to approximate the Hamming distance over {0, 1} d - Select random bits from the binary signatures - Amplify the collision probability using AND/OR constructions 19

Neighborhood Formation Use user and item blocks to identify the neighbor users/items Neighbor users: in the same user blocks as a user Neighbor items: in the same item blocks as an item 20

Neighborhood Formation Use user and item blocks to identify the neighbor users/items Neighbor users: in the same user blocks as a user Neighbor items: in the same item blocks as an item But, user/item blocks could still be large... 21

Neighborhood Formation Use user and item blocks to identify the neighbor users/items Neighbor users: in the same user blocks as a user Neighbor items: in the same item blocks as an item But, user/item blocks could still be large... how to efficiently make the top N recommendations for a target user based on neighbor users/items? 22

Real-time Recommendation Generation Two approaches User-based recommendation Item-based recommendation 23

Real-time Recommendation Generation User-based Recommendation Rank/select neighbor users Count collision numbers of neighbour users in user blocks with the target user Set a threshold on the collision numbers to select neighbor users 24

Real-time Recommendation Generation User-based Recommendation Rank/select neighbor users Count collision numbers of neighbour users in user blocks with the target user Set a threshold on the collision numbers to select neighbor users Calculate prediction scores Find candidate items from the items of selected neighbor users Calculate the similarities between the target user and neighbor users who have a candidate item: A u (u i, c x ) = u j N ui Uc x 1 cosine(u Nui Uc i, u j ) x 25

Real-time Recommendation Generation User-based Recommendation Rank/select neighbor users Count collision numbers of neighbour users in user blocks with the target user Set a threshold on the collision numbers to select neighbor users Calculate prediction scores Find candidate items from the items of selected neighbor users Calculate the similarities between the target user and neighbor users who have a candidate item: A u (u i, c x ) = u j N ui Uc x 1 cosine(u Nui Uc i, u j ) x Generate recommendations The top N items with high prediction scores 26

Real-time Recommendation Generation Item-based Recommendation Rank/select neighbor items Count collision numbers of neighbour items in item blocks with each item of the target user Set a threshold on the collision numbers to select neighbor items 27

Real-time Recommendation Generation Item-based Recommendation Rank/select neighbor items Count collision numbers of neighbour items in item blocks with each item of the target user Set a threshold on the collision numbers to select neighbor items Calculate prediction scores Find candidate items, i.e., all selected neighbour items Calculate the similarities between each item of the target user and a candidate item: A c (u i, c x ) = 1 c j C ui cosine(c Cui j, c x ) 28

Real-time Recommendation Generation Item-based Recommendation Rank/select neighbor items Count collision numbers of neighbour items in item blocks with each item of the target user Set a threshold on the collision numbers to select neighbor items Calculate prediction scores Find candidate items, i.e., all selected neighbour items Calculate the similarities between each item of the target user and a candidate item: A c (u i, c x ) = 1 c j C ui cosine(c Cui j, c x ) Generate recommendations The top N items with high prediction scores 29

Experimental Setup Experiment Topic recommendation (i.e., recommend topics to users in a social media community) Data set Crawled from Twitter.com Selects the keywords that are at least used by 5 users as topics, and the users who have used at least 5 topics Contains 2320 users, 3319 topics, and 1,214,604 tweets Split into 90% training (2088 users) and 10% test (232 users) Evaluation metrics Top N=10 Precision & Recall Average Recommendation Time 30

Experimental Results Compared approaches CF-U & CF-C: Traditional user & item based CF RCF-U & RCF-C: Real-time user & item based CF Precision 0.05 Recall 0.5 Average Query Time 0.25 0.04 0.4 Percentage 0.20 0.15 0.10 Percentage 0.03 0.02 Seconds 0.3 0.2 0.05 0.01 0.1 0.00 CF-U CF-C RCF-U RCF-C 0.00 CF-U CF-C RCF-U RCF-C 0.0 CF-U CF-C RCF-U RCF-C 31

Conclusions We have studied a real-time recommender system LSH Blocking Neighborhood formation Recommendation generation We have used two LSH families to approximate the similarities between items/users Random hyperplane projection Random bit sampling We have conducted experiments on a Twitter dataset As future work, the temporal aspects of items and users can be future considered 32