José Miguel Hernández Lobato, Zoubin Ghahramani. Computational and Biological Learning Laboratory, Cambridge University.

20/09/2011

Evaluation of data mining and machine learning methods on the task of modeling binary transaction data. Experimental protocol, based on the problem of product recommendation (cross-selling):
1. We single out a set of test transactions.
2. A few items in these transactions are eliminated.
3. Different methods are then used to identify the missing items using a set of training transactions:
   a) Each method generates a specific score for each item.
   b) The items are sorted according to their score.
   c) Ideally, the missing items are ranked at the top of the resulting list.
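A minimal sketch of step 2 of this protocol in Python. The function and parameter names (make_eval_split, holdout_frac) are illustrative, not from the talk: each test transaction is split into the items a method gets to see and the held-out "missing" items it must recover.

```python
import random

def make_eval_split(transactions, holdout_frac=0.15, seed=0):
    """Split each test transaction into observed and held-out items.

    The held-out items play the role of the 'missing' items that each
    method must later recover.  holdout_frac = 0.15 mirrors the 15%
    elimination used in the experiments.
    """
    rng = random.Random(seed)
    observed, held_out = [], []
    for t in transactions:
        t = list(t)
        k = max(1, int(round(holdout_frac * len(t))))
        missing = set(rng.sample(t, k))
        observed.append([i for i in t if i not in missing])
        held_out.append(missing)
    return observed, held_out
```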

Methods compared:
- Association rules
- Bayesian sets
- Graph-based approaches
- Nearest neighbors:
  - User-based nearest neighbors
  - Item-based nearest neighbors
- Matrix factorization techniques:
  - Partial SVD
  - Bayesian probabilistic matrix factorization (BPMF)
  - Variational Bayesian matrix factorization (VBMF)

Association rules: generate a score for each item using a dependency model for the data given by a set R of association rules X -> Y. Efficient algorithms exist for finding R (Apriori). Prediction: given a new transaction T with the items bought by a particular user, we find all the matching rules X -> Y in R such that X is a subset of T. For any item i not in T, we compute a score by aggregating the confidence of the matching rules X -> Y such that i is in Y. Performance depends on the number of rules in R (often the larger, the better).
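The rule-based prediction step can be sketched as follows. Summation is used here as the aggregation of confidences, which the slide leaves unspecified, and all names are illustrative.

```python
def rule_scores(transaction, rules):
    """Score candidate items from association rules.

    `rules` is a list of (antecedent, consequent, confidence) triples.
    A rule matches when its antecedent is contained in the transaction;
    each consequent item not already in the basket then accumulates the
    rule's confidence.
    """
    t = set(transaction)
    scores = {}
    for antecedent, consequent, conf in rules:
        if set(antecedent) <= t:          # rule matches the transaction
            for item in set(consequent) - t:
                scores[item] = scores.get(item, 0.0) + conf
    return scores
```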

Bayesian sets: given a new transaction T, item i is ranked according to the score p(x_i | {x_j : j in T}) / p(x_i). These probabilities are obtained by 1. assuming a simple probabilistic model for the data, and 2. using Bayes' rule. Item i is represented by x_i, a binary vector given by the i-th column of the transaction matrix X.
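A hedged sketch of the Bayesian sets score for binary data, assuming an independent Beta-Bernoulli model per transaction row. Fixed Beta(alpha, beta) priors are a simplification of my own here; the original paper sets the prior from empirical means.

```python
import math

def bayesian_sets_scores(X, query_items, alpha=1.0, beta=1.0):
    """Rank items by (a log version of) p(x_i | query) / p(x_i).

    X maps each item to a binary tuple over transactions, i.e. each item
    is a binary column vector as on the slide.  The additive constant
    shared by all candidates is dropped, so only the ranking matters.
    """
    n_rows = len(next(iter(X.values())))
    n_query = len(query_items)
    # Posterior Beta counts per row after conditioning on the query columns.
    row_sums = [sum(X[i][j] for i in query_items) for j in range(n_rows)]
    scores = {}
    for item, col in X.items():
        if item in query_items:
            continue
        s = 0.0
        for j in range(n_rows):
            a_post = alpha + row_sums[j]
            b_post = beta + n_query - row_sums[j]
            s += math.log(a_post / alpha) if col[j] else math.log(b_post / beta)
        scores[item] = s
    return scores
```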

Graph-based approach: transaction data can be encoded in the form of a graph. The graph has two types of nodes, transactions and items, and edges connect items with the transactions they are contained in. Prediction: given a new transaction, the score for any specific item is the number of different 2-step paths between the items in the transaction and that particular item.
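Counting 2-step paths reduces to summing co-occurrence counts, since every training transaction containing both a basket item j and a candidate item i contributes one path j -> t -> i:

```python
def two_step_path_scores(new_transaction, transactions):
    """Score items by counting 2-step paths in the bipartite graph."""
    query = set(new_transaction)
    scores = {}
    for t in transactions:
        t = set(t)
        overlap = len(t & query)   # number of path starting points through t
        if overlap:
            for i in t - query:
                scores[i] = scores.get(i, 0) + overlap
    return scores
```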

User-based nearest neighbors. Assumption: customers with similar tastes often generate similar baskets. Prediction: given a new transaction, we find a neighborhood of similar transactions and aggregate their item counts weighted by similarity. Jaccard similarity: J(T1, T2) = |T1 intersect T2| / |T1 union T2|. Drawbacks: computing the similarities can be very expensive, and all the transactions must be kept in memory.
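A sketch of user-based scoring with Jaccard similarity; restricting to the k nearest transactions is an illustrative choice, since the transcript does not say how the neighborhood was formed.

```python
def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| between two item sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def user_based_scores(new_transaction, transactions, k=10):
    """Aggregate item counts of the k most similar training transactions,
    weighting each neighbor's contribution by its Jaccard similarity."""
    query = set(new_transaction)
    neighbours = sorted(transactions, key=lambda t: jaccard(query, t),
                        reverse=True)[:k]
    scores = {}
    for t in neighbours:
        w = jaccard(query, t)
        for i in set(t) - query:
            scores[i] = scores.get(i, 0.0) + w
    return scores
```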

Item-based nearest neighbors: customers will often buy items similar to the ones they already bought. A similarity matrix is pre-computed for all items; for each item, only the k most similar items are considered. Prediction: given a new transaction, we aggregate the similarity measures of the items most similar to those already included in the transaction. Computationally very efficient and applicable to large-scale problems; used by Amazon.com.
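A sketch of item-based scoring. Jaccard similarity between the sets of transactions containing each item stands in for whatever similarity measure the talk's implementation used.

```python
def item_based_scores(new_transaction, transactions, k=10):
    """Item-based nearest neighbors with an on-the-fly similarity matrix.

    For each basket item, only its k most similar items contribute, and a
    candidate's score aggregates its similarity to the basket items.
    """
    # Inverted index: item -> set of transaction ids containing it.
    occ = {}
    for tid, t in enumerate(transactions):
        for i in t:
            occ.setdefault(i, set()).add(tid)
    query = set(new_transaction)
    scores = {}
    for i in query:
        if i not in occ:
            continue
        sims = []
        for j in occ:
            if j in query:
                continue
            inter = len(occ[i] & occ[j])
            if inter:
                sims.append((inter / len(occ[i] | occ[j]), j))
        for sim, j in sorted(sims, reverse=True)[:k]:   # k most similar items
            scores[j] = scores.get(j, 0.0) + sim
    return scores
```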

Matrix factorization: the n x d transaction matrix X is represented using a low-rank approximation X ~ U V^T, where U and V are n x k and d x k matrices, respectively, and k is small.

Partial SVD: X ~ U S V^T, where U and V are orthonormal matrices and S is a diagonal matrix containing the first k singular values. Since U and V are orthonormal, X V = U S, and hence X ~ X V V^T. Given a new transaction x, the score given to the i-th item is x V v_i^T, where v_i is the i-th row of V. Highly efficient software is available for computing partial SVD decompositions on sparse matrices (e.g., the irlba package in R).
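A sketch of partial-SVD scoring: with X ~ U S V^T truncated to k singular values, a new transaction x is scored as x V V^T, so item i receives x V v_i^T with v_i the i-th row of V. numpy's dense SVD stands in here for a sparse partial solver such as the irlba package mentioned on the slide.

```python
import numpy as np

def svd_scores(X, x_new, k=2):
    """Score all items for a new binary transaction row x_new.

    Keeps the top-k right singular vectors of X and projects x_new onto
    their span: scores = x_new V V^T, one score per item.
    """
    _, _, vt = np.linalg.svd(X, full_matrices=False)   # singular values descending
    V = vt[:k].T                                       # items x k
    return x_new @ V @ V.T
```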

Bayesian probabilistic matrix factorization (BPMF): multivariate Gaussian priors are fixed for each row of U and V, with Gaussian-Wishart priors on the hyperparameters and a Gaussian likelihood. Bayesian inference is carried out using Gibbs sampling. Given a new transaction, the score of each item is given by the average prediction of several Bayesian linear regression problems, one problem per Gibbs sample of V.
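A simplified sketch of that prediction step: given samples of the item-factor matrix V, the new transaction's latent vector is inferred by regularized (MAP) linear regression on the observed items, and predictions are averaged over samples. The precision parameters alpha and lam are illustrative assumptions, and the full method would also sample the Gaussian-Wishart hyperparameters rather than fix them.

```python
import numpy as np

def bpmf_style_scores(x_obs, obs_idx, V_samples, alpha=2.0, lam=1.0):
    """Average item scores over sampled factor matrices V.

    For each sample, solves one ridge-style regression for the new
    user's latent row (posterior mean under a N(0, lam^-1 I) prior and
    Gaussian likelihood with precision alpha), then predicts all items.
    """
    preds = []
    for V in V_samples:                    # one regression per sample of V
        V_obs = V[obs_idx]                 # factor rows of the observed items
        k = V.shape[1]
        A = alpha * V_obs.T @ V_obs + lam * np.eye(k)
        u = np.linalg.solve(A, alpha * V_obs.T @ x_obs)
        preds.append(V @ u)                # scores for every item
    return np.mean(preds, axis=0)
```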

Variational Bayesian matrix factorization (VBMF): Gaussian priors for U and V. The posterior distribution is approximated by a factorized distribution Q(U, V) = Q(U) Q(V). The parameters of Q, the prior hyperparameters and the level of noise in the Gaussian likelihood are selected by minimizing the KL divergence between Q and the exact posterior. Given a new transaction, the prior for its latent row is fixed to the one learned during training, and we approximate the posterior for the new row of U as in the training phase. Unlike BPMF, VBMF does not take into account correlations in the posterior.


Four datasets from the FIMI repository (http://fimi.ua.ac.be/):
- Retail: market basket data from a Belgian retail store.
- BMS-POS: point-of-sale data from an electronics retailer.
- BMS-WebView-2: click data from an e-commerce site (3340).
- Kosarak: click data from an on-line news portal.
Data pre-processing: only the 1000 most frequent items are considered, and only transactions with at least 10 items are kept. Training, validation and test sets: 2000 transactions each. For each test transaction, 15% of the items are eliminated as test items. Objective: identify the items eliminated in the test transactions.
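The two pre-processing filters can be sketched directly; the talk uses 1000 items and a minimum length of 10, while the test below uses small illustrative values.

```python
def preprocess(transactions, n_top_items=1000, min_length=10):
    """Keep only the n_top_items most frequent items, then keep only
    transactions that still contain at least min_length items."""
    counts = {}
    for t in transactions:
        for i in set(t):
            counts[i] = counts.get(i, 0) + 1
    keep = set(sorted(counts, key=counts.get, reverse=True)[:n_top_items])
    filtered = [[i for i in t if i in keep] for t in transactions]
    return [t for t in filtered if len(t) >= min_length]
```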

Top-N recommendation approach:
1. For each test transaction, we form a ranked list of items.
2. The items already in the transaction obtain the lowest rank.
3. We single out the top N elements in this list.
4. We have a hit each time one of the missing items is singled out.
5. Recall is used as the measure of performance: Recall = (number of hits) / (total number of missing items).
We also include in the analysis a baseline method that recommends the most popular products (top-pop).
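The steps above can be sketched for a single test transaction as follows; the default n = 20 is an illustrative value, since the transcript does not preserve the N values actually used.

```python
def recall_at_n(ranked_items, observed, missing, n=20):
    """Top-N recall for one test transaction.

    Items already in the transaction are pushed to the bottom of the
    ranking; every held-out item found in the top n counts as a hit.
    """
    observed = set(observed)
    candidates = [i for i in ranked_items if i not in observed]
    top_n = set(candidates[:n])
    hits = len(top_n & set(missing))
    return hits / len(missing)
```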


Time: average prediction time per test transaction (in seconds).


References:
- Agrawal, R., Imielinski, T. and Swami, A. (1993). Mining association rules between sets of items in large databases. ACM SIGMOD International Conference on Management of Data, 207-216.
- Ghahramani, Z. and Heller, K. A. (2005). Bayesian sets. Advances in Neural Information Processing Systems.
- Craswell, N. and Szummer, M. (2007). Random walks on the click graph. Proceedings of SIGIR 2007, 239-246.
- Desrosiers, C. and Karypis, G. (2011). A comprehensive survey of neighborhood-based recommendation methods. Recommender Systems Handbook, Springer, 107-144.
- Baglama, J. and Reichel, L. (2005). Augmented implicitly restarted Lanczos bidiagonalization methods. SIAM J. Sci. Comput., 27, 19-42.
- Salakhutdinov, R. and Mnih, A. (2008). Bayesian probabilistic matrix factorization using MCMC. International Conference on Machine Learning, 872-879.
- Raiko, T., Ilin, A. and Karhunen, J. (2007). Principal component analysis for large scale problems with lots of missing values. Machine Learning: ECML 2007, Springer, LNCS 4701, 691-698.

- Hahsler, M., Buchta, C., Grün, B. and Hornik, K. (2010). arules: Mining Association Rules and Frequent Itemsets. R package version 1.0-3. http://cran.r-project.org/package=arules
- Hahsler, M. (2010). recommenderlab. R package version 0.1-2. http://cran.r-project.org/package=recommenderlab
- Brijs, T., Swinnen, G., Vanhoof, K. and Wets, G. (1999). Using association rules for product assortment decisions: a case study. Proceedings of SIGKDD 1999, 254-260.
- Kohavi, R., Brodley, C. E., Frasca, B., Mason, L. and Zheng, Z. (2000). KDD-Cup 2000 organizers' report: peeling the onion. SIGKDD Explorations Newsletter, 2, 86-93.
- Zheng, Z., Kohavi, R. and Mason, L. (2001). Real world performance of association rule algorithms. Proceedings of SIGKDD 2001, 401-406.
Thanks to Blue Martini Software for contributing the KDD Cup 2000 data.
