Introduction: Background, Recommender Systems and Collaborative Filtering
- Johnathan Parsons
Abstract

Recommender systems are used extensively today in many areas to help users and consumers make decisions. Amazon recommends books based on what you have previously viewed and purchased, Netflix presents shows and movies you might enjoy based on your interactions with the platform, and Facebook serves personalized ads to every user based on gathered browsing information. These systems are based on shared similarities, and there are several ways to develop and model them. This study compares two methods, user-based and item-based filtering, in k-nearest-neighbours systems. The methods are compared on how much they deviate from the true answer when predicting user ratings of movies based on sparse data. The study showed that neither of the methods could be considered objectively better than the other and that the choice of system should be based on the data set.
Chapter 1 Introduction

1.1 Background

In everyday life, it is often necessary to make choices without sufficient personal experience of the alternatives. We then rely on recommendations from other people to make as smart choices as possible. For example, when shopping at a shoe store, a customer could describe features of previously owned shoes to a clerk, and the clerk would then recommend new shoes based on the customer's past experiences. A dedicated clerk could, besides providing recommendations, also remember the past choices and experiences of customers. This would allow the clerk to make personalised recommendations to returning customers. The way we transform this experience to the digital era is by using recommender systems [1].

Recommender systems

Recommender systems can be viewed as a digital representation of the clerk in the previous example. The goal of a recommender system is to predict which items users might be interested in by analysing gathered data. Data can be gathered with an implicit and/or an explicit approach. An implicit approach records users' behaviour when reacting to incoming data (e.g. by recording how long a user actually watched a movie before switching to something else). This can be done without the user's knowledge. The explicit approach depends on the user explicitly specifying their preferences regarding items, e.g. by rating a movie. The input to a recommender system is the gathered data, and the output is a prediction or recommendation for the user [2]. A recommender system's predictions will generally be more accurate the more data it can base them on. Having only a small amount of data to base predictions on is known as the sparse data problem and is expanded upon in the Sparse data problem section below.

Collaborative filtering

Collaborative Filtering (CF) is a common algorithm used in recommender systems. CF provides predictions and recommendations based on other users and/or items in the system.
We assume that similar users or items in the system can be used to predict each other's ratings. If we know that Haris likes the same things as Alex, and Alex also likes candy, then we can predict that Haris will most likely also enjoy candy [3, 4].
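As a minimal illustration of this idea, the sketch below predicts a missing rating as a similarity-weighted average of other users' ratings. All names, items and ratings are made up for the example; they are not part of the study's data.

```python
# Toy user-based prediction: estimate Haris's rating of "candy"
# from users who rate things similarly. All values are made up.
ratings = {
    "Alex":  {"chips": 5, "soda": 4, "candy": 5},
    "Haris": {"chips": 5, "soda": 4},          # no rating for "candy" yet
    "Maya":  {"chips": 1, "soda": 2, "candy": 1},
}

def similarity(a, b):
    """Inverse Manhattan distance over co-rated items (1 = identical)."""
    common = set(a) & set(b)
    dist = sum(abs(a[i] - b[i]) for i in common)
    return 1 / (1 + dist)

def predict(user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    num = den = 0.0
    for other, r in ratings.items():
        if other != user and item in r:
            s = similarity(ratings[user], r)
            num += s * r[item]
            den += s
    return num / den

print(round(predict("Haris", "candy"), 2))   # → 4.5
```

Because Alex agrees with Haris on every co-rated item, Alex's high rating of candy dominates the prediction, which is exactly the intuition behind collaborative filtering.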
Two common methods for implementing collaborative filtering are user-based and item-based filtering. Both of these methods create a similarity matrix, where the similarities between users (or items) are calculated and stored. The distance (similarity) between users can be calculated in several ways; two common methods are the Pearson correlation coefficient and the cosine similarity.

Calculating the similarity between users

To calculate how similar users are, a matrix is used where the users are rows and the different items are columns. One can then look at how similar users are by comparing their ratings for every item. Below is an example matrix and table with 3 users (Amy, Bill and Jim) and only 2 items (Snow Crash and Girl with the Dragon Tattoo).

Figure 1.1: Comparison matrix [guidetodatamining.com]

Figure 1.2: Comparison table [guidetodatamining.com]

Figures 1.1 and 1.2 show Bill and Jim having more in common than any other pair. There are several ways to give a value to this similarity. Some common approaches are:
Manhattan distance

The Manhattan distance is a simple form of similarity calculation. It is the sum of the differences between ratings along every axis. In the above case, where the matrix is in 2D, the Manhattan distance between Bill, at index 1, and Jim, at index 2, would be:

|x_1 - x_2| + |y_1 - y_2| = 2

Euclidean distance

The Euclidean distance uses the difference along every axis and applies the Pythagorean theorem (a^2 + b^2 = c^2) to calculate the "straight line distance" between two objects in the matrix. The Euclidean distance between Jim, at index 1, and Amy, at index 3, is calculated with the equation:

sqrt((x_1 - x_3)^2 + (y_1 - y_3)^2)

Correlation

An issue that isn't visualized by this example is what happens when there is incomplete data, i.e. some users haven't rated some items of the matrix. If users A and B have rated the same 100 items but A and C only have 10 rated items in common, the similarity calculation between A and B should obviously be stronger, as it is based on more data. Using the Manhattan or Euclidean distance, however, this will not be accounted for, making these methods poor when data is missing [5]. To account for this, two other methods, the Pearson correlation coefficient and the cosine similarity, can be used.

Pearson correlation coefficient (PCC)

The PCC fits a line to two users' ratings to get a correlation value, where a straight, increasing line represents a high correlation while a decreasing line shows that the compared units do not correlate much.

Figure 1.3: Example of a correlation table [guidetodatamining.com]

Figures 1.3 and 1.4 show an example of positive correlation. The Pearson correlation coefficient takes what is known as "grade inflation" into account [5]. This is the phenomenon of users rating things differently even though they feel the same way about them. In the above example, Weird Al is the band Clara dislikes the most, yet it is still rated at 4.
Robert also dislikes Weird Al but gives them a rating of 1. In the Manhattan or Euclidean calculations, this would represent a big difference between the users, but

Figure 1.4: Graphing the table shows a positive correlation [guidetodatamining.com]

the graph shows that they are very much alike. When placing these 5 bands in order of preference, they agree completely. The formula for calculating the PCC is:

r = sum_{i=1}^{n} (x_i - x̄)(y_i - ȳ) / ( sqrt(sum_{i=1}^{n} (x_i - x̄)^2) * sqrt(sum_{i=1}^{n} (y_i - ȳ)^2) )   (1.1)

Cosine similarity

Cosine similarity is another way of calculating the similarity between users' preferences. Here the users and their ratings of items are represented as two vectors, and their similarity is based on the cosine of the angle between them. Cosine similarity is often used for recommender systems since it ignores items which neither user has rated, so-called 0-0 matches, which are in abundance when dealing with sparse data. The cosine similarity is calculated as:

cos(x, y) = (x · y) / (|x| |y|)   (1.2)

where the dot in the numerator represents the dot product and |x| in the denominator indicates the length of vector x.

k Nearest Neighbours (kNN)

k nearest neighbours is the method of looking at some number (k) of users or items that are similar when making predictions, meaning that not all users, or items, are accounted for when making a prediction. The difference between user- and item-based filtering lies in whether a matrix of similar users or of similar items is created. Similar users are users who often share sentiment/ratings of items. When recommender systems were first developed, user-based filtering was used, but it has issues with scalability: as the amount of data increases, the cost of calculating the similarity matrix grows rapidly. To combat this, Amazon developed item-based filtering, which labels similar items into groups so that once a user rates some
item highly, the algorithm recommends other similar items from the same group. Item-based filtering scales better than the user-based approach [3, 5, 6].

Evaluation

Two common methods for evaluating recommender systems are used in this study. The Root Mean Squared Error (RMSE) is calculated by:

RMSE = sqrt( (1/n) * sum_{i=1}^{n} d_i^2 )   (1.3)

and the Mean Absolute Error (MAE) is calculated by:

MAE = (1/n) * sum_{i=1}^{n} |d_i|   (1.4)

where n is the number of predictions made and d_i is the distance between the recommender system's prediction and the correct answer. The closer the RMSE and MAE values are to 0, the better accuracy the recommender system has. RMSE disproportionately penalizes large errors, while MAE does not mirror many small errors properly, so both measurements should be used when evaluating the accuracy [7, 8, 9]. To provide test data for evaluation, a dataset is divided into two parts: one part is used for building the similarity matrix and the other part is used for evaluation.

Sparse data problem

Sparse data is a common problem in recommender systems where the dataset consists of few ratings compared to the number of users. This issue was simulated by splitting the dataset into two asymmetric parts. The smaller part is then used to make predictions for all objects in the larger part [10].

1.2 Datasets

Three datasets were used in this study. These are all datasets involving user ratings of movies, and they have all been previously used in studies about recommender systems [10]. The datasets are:

FilmTrust

FilmTrust was an old film rating website that has now been shut down. The data was crawled from the FilmTrust website in June 2011 as part of a research paper on recommender systems [11]. The FilmTrust database has users and items. There is a total of ratings where the scale goes from 1 to 5.

CiaoDVD

CiaoDVD was a DVD rating website where users could share their reviews of movies and give recommendations for stores with the best prices.
The data was crawled from dvd.ciao.co.uk in December 2013 as part of a research paper on trust prediction [12]. The
CiaoDVD database has 920 users and items. There is a total of ratings and the scale goes from 1 to 5.

MovieLens

MovieLens is a well-known dataset used in many scientific papers. It consists of a collection of movie ratings from the MovieLens web site. The dataset was collected over various periods of time [13]. The MovieLens database has users and items. There is a total of ratings and the scale goes from 1 to 5. In this dataset, all users have rated at least 20 items.

1.3 Surprise

There are multiple free implementations of recommender systems available. The algorithms in this study were implemented using the Python library Surprise [14]. Surprise is licensed under the BSD 3-Clause license [15].

1.4 Purpose

The study compares how well the two collaborative filtering methods, user-based and item-based, perform when predictions are based on sparse data, known as the sparse data problem. The sparse data problem is a common one in the field of machine learning [16], and understanding how effective these different methods are is of great value for future implementations.

1.5 Research question

How do the two filtering methods, user-based and item-based, compare when making predictions based on sparse data?

1.6 Scope and constraints

The datasets that were used are from MovieLens, FilmTrust and CiaoDVD. The Python library Surprise was used to conduct all tests. This study only compares the correctness of predictions when these are based on sparse data. Other factors such as speed and memory efficiency are not taken into consideration. The correctness is measured using the RMSE and MAE.
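As a recap of the similarity measures introduced in this chapter, the sketch below computes the Manhattan distance, Euclidean distance, cosine similarity and Pearson correlation for two toy rating vectors using NumPy. The rating values are illustrative, not taken from the figures or datasets above.

```python
import numpy as np

# Two users' ratings of the same two items (toy values)
x = np.array([2.0, 5.0])
y = np.array([1.0, 4.0])

manhattan = np.sum(np.abs(x - y))            # sum of per-axis differences
euclidean = np.sqrt(np.sum((x - y) ** 2))    # "straight line" distance
cosine = x @ y / (np.linalg.norm(x) * np.linalg.norm(y))

# Pearson correlation is the cosine similarity of the mean-centred vectors
xc, yc = x - x.mean(), y - y.mean()
pearson = xc @ yc / (np.linalg.norm(xc) * np.linalg.norm(yc))

print(manhattan, euclidean, round(cosine, 3), round(pearson, 3))
# manhattan → 2.0; pearson → 1.0 (identical ordering after mean-centring)
```

Note how the Pearson correlation is 1.0 even though the raw ratings differ: this is the "grade inflation" correction described earlier, since one user simply rates everything one point lower than the other.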
Chapter 2 Method

Running the two filtering methods, user-based and item-based filtering, on a dataset is henceforth referred to as a "test". Every test was conducted 10 times with randomized sets of training and test data. The mean value of these 10 runs represents the result of a test.

2.1 Data handling

Before use, the data needed processing. The following methods were used to prepare the data for testing.

Simulating sparse data

In the study, sparse data is simulated by using 20% of the dataset for training and 80% for verification. This ratio has been used in similar studies [17].

Formatting data

The datasets provided by MovieLens and FilmTrust use a format that Surprise can handle natively. The dataset from CiaoDVD was formatted before use: the Python script in appendix B.3 was used to retrieve only the columns with user id, movie id and rating.

Creating test data

The data was split using a Python script, see appendix B.2, that first read all the data from file into an array. The array was then shuffled by providing a seed value, ranging from 1 to 10, to the shuffle function in Python's random library. After that, every fifth rating (20%) was written to one file and the rest was written to another. The smaller file was then used as training data for the recommender system and the bigger file was used as test data. This was repeated 10 times with different seeds for each dataset.

2.2 Conducting the tests

The created test and training datasets were used to build models, run the prediction algorithm and evaluate the result. See appendix B.1 for code.
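The seeded 20/80 split described in the data handling section can be sketched as follows. The ratings list here is a placeholder of (user, item, rating) triples; the real script reads the dataset files and writes two output files instead.

```python
import random

# Placeholder ratings: (user, item, rating) triples, not the study's data
ratings = [(u, i, r) for u in range(10) for i, r in [(1, 3.0), (2, 4.0)]]

def split(data, seed):
    """Shuffle with a fixed seed, then take every fifth rating (20%) as
    training data and the remaining 80% as test data."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    train = shuffled[::5]                                   # 20% -> training
    test = [x for i, x in enumerate(shuffled) if i % 5]     # 80% -> testing
    return train, test

train, test = split(ratings, seed=1)
print(len(train), len(test))   # → 4 16
```

Using a fixed seed per run makes each of the 10 partitions reproducible, which is what allows the study's tests to be repeated with seeds 1 through 10.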
Building similarity model

A PCC and a cosine similarity model were built for each dataset. Note that the models had to be created for each dataset and only one model could be evaluated in each run. This was configured with built-in functions in the Surprise library.

Building the prediction algorithm

Built-in methods in Surprise were used to create the prediction algorithm. Table 2.1 shows the configurations for the different prediction algorithms. All setups used a minimum of 1 neighbour for predictions.

Test | Filtering method | Similarity model | Max neighbours used
  1  | Item-based       | cosine           | 40
  2  | User-based       | cosine           | 40
  3  | Item-based       | Pearson          | 40
  4  | User-based       | Pearson          | 40

Table 2.1: Configurations for prediction algorithms

Evaluating the algorithms

Evaluation of the algorithms was done with the built-in function evaluate() in the Surprise library. Each test was run with all 10 test and training data combinations for each dataset. For both similarity models (PCC and cosine similarity) and each dataset, a mean value for the RMSE and MAE score was calculated based on the evaluation of the 10 differently seeded partitions of the data. An average was used to prevent strong influence from deviating scores in the case of bad data in the results.
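A sketch of the evaluation step: RMSE (equation 1.3) and MAE (equation 1.4) are computed per seeded run and then averaged over the runs. The per-run error lists below are toy values, not results from the study.

```python
import math

def rmse(errors):
    """Root Mean Squared Error, equation (1.3)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def mae(errors):
    """Mean Absolute Error, equation (1.4)."""
    return sum(abs(e) for e in errors) / len(errors)

runs = [                 # one list of (prediction - truth) errors per run
    [0.5, -0.5, 1.0],
    [0.2, -0.3, 0.4],
]
mean_rmse = sum(rmse(r) for r in runs) / len(runs)
mean_mae = sum(mae(r) for r in runs) / len(runs)
print(round(mean_rmse, 3), round(mean_mae, 3))
```

As noted in chapter 1, MAE can never exceed RMSE for the same errors, so the mean MAE here is necessarily the smaller of the two values.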
Chapter 3 Results

The following structure is used to present the results of the study. Two sections show the results for each of the similarity matrix structures, the Pearson correlation coefficient (Pearson) and cosine similarity (Cosine). For all datasets, user- and item-based filtering are compared side by side in a plot for each metric, MAE or RMSE. Each plot shows the average value of the 10 test runs; the lower the value, the better the predictions. Following the plot of average scores there is another plot which shows the max deviation of the scores. This is the difference between the highest and lowest score of the 10 test runs for each dataset and filtering method. The lower the difference, the smaller the spread observed between different test runs. This plot is included to give an idea of how much the tests varied, which is relevant as we use an average value. The full metrics of the tests are presented in appendix A.

3.1 Pearson

The following results were obtained using the Pearson method for the similarity matrix.

Figure 3.1: MAE, Pearson
The plot in figure 3.1 shows the results for the MAE scores. The plot shows a small advantage for item-based filtering for the FilmTrust dataset, while there is an opposite advantage for the MovieLens dataset. For the CiaoDVD dataset, user- and item-based filtering score about the same.

Figure 3.2: Max MAE score deviation for Pearson

The difference plot in figure 3.2 shows that the difference between the max and min values is small for all the datasets. FilmTrust has the highest value for user-based filtering, where the scores have a deviation of around 3%. The plot also shows that there is a big difference between the user- and item-based deviation for FilmTrust.

Figure 3.3: RMSE, Pearson

The RMSE scores, plotted in figure 3.3, hint at the same trends as the MAE scores. The FilmTrust dataset had better accuracy when item-based filtering was used
and MovieLens had better accuracy when user-based filtering was used. CiaoDVD had about the same accuracy for both filtering methods.

Figure 3.4: Max RMSE score deviation for Pearson

The difference plot in figure 3.4 shows the same max deviation for the FilmTrust dataset, with a small difference between the max and min values. The difference between the user- and item-based approaches for the FilmTrust dataset, which was observed in figure 3.2, is present here as well.
3.2 Cosine

The following results were obtained using the cosine similarity method for the similarity matrix.

Figure 3.5: MAE, Cosine

In figure 3.5, the same trend which was observed for the Pearson matrices in figure 3.1 is still visible. However, user- and item-based filtering scored slightly closer to each other.

Figure 3.6: Max MAE score deviation for cosine

For the cosine similarity matrix, the differences between the max and min scores are much smaller than for the Pearson similarity matrices. From figure 3.6 we see that the max score deviation is less than 0.01 points. However, there is a slightly smaller deviation for
item-based filtering for all datasets. Notice that the big deviation for user-based filtering for the FilmTrust dataset, which was observed when using the Pearson method, is not present here.

Figure 3.7: RMSE, Cosine

The RMSE score using the cosine similarity matrix, plotted in figure 3.7, shows the same trends as the RMSE score for the Pearson similarity matrix in figure 3.3.

Figure 3.8: Max RMSE score deviation for cosine

As opposed to the MAE score, we see a slightly smaller deviation of the scores for user-based filtering. The deviation is less than 0.01 points, which is very low.
Chapter 4 Discussion

The discussion is divided into three parts: one part discussing our results and how the study was conducted, one part on external dependencies, and a last part analysing the current state of the art and the relevancy of the study.

The figures show a clear pattern where neither user- nor item-based filtering has a clear advantage over the other, independent of error and correlation measurements (MAE, RMSE and Pearson, cosine). The results suggest that the choice of filtering method should be based on the data set. Exactly what properties of the data set one should look for when determining the filtering method is hard to say based on this study, as it only contains 3 datasets with several differences between them (making it hard to pinpoint determining factors).

Our experiments show a clear correlation between the two error measurements, where both give the same result for every dataset on which filtering method performed best. The MAE scores being lower than the respective RMSE ones across the board is expected, as MAE can never produce a higher value than RMSE, only an equal one (if all errors have the same magnitude).

The maximum k value for the k-nearest-neighbours algorithm, which denotes how many items or users the recommendations are based on, was chosen to be 40 in all tests. Choosing the optimal k value is not a simple task and there are many suggestions for how one should go about doing it, but no agreed-upon best method [18]. Using cross validation with different k values and comparing results is one recommended method, but this approach depends on the data set. Since different data sets are used in this study, different k values might be needed for the datasets to enable the system to perform at optimal capacity. Other ways of calculating an optimal k value are discussed in [19].
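The cross-validation idea mentioned above can be sketched as a simple grid search over candidate k values: score each k on held-out data and keep the one with the lowest error. The data and the scoring model here are toy stand-ins, not the study's actual Surprise setup.

```python
import random

# Toy data: (similarity-to-target-user, rating) pairs, randomly generated
random.seed(1)
data = [(random.uniform(0, 1), random.uniform(1, 5)) for _ in range(200)]

def mae_for_k(k, train, held_out):
    """Toy stand-in for the recommender: predict the mean rating of the k
    most similar training points, then report MAE over held-out ratings."""
    top = sorted(train, reverse=True)[:k]          # k highest similarities
    pred = sum(r for _, r in top) / len(top)
    return sum(abs(pred - r) for _, r in held_out) / len(held_out)

train, held_out = data[:160], data[160:]           # simple single fold
candidates = [5, 10, 20, 40]
best_k = min(candidates, key=lambda k: mae_for_k(k, train, held_out))
print("best k:", best_k)
```

A real run would repeat this over several folds and use the full kNN recommender as the scored model, which is exactly why the procedure is dataset-dependent and was left out of the study's scope.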
Calculating an optimal k value for every data set was considered outside of this study's scope, and the default value of the Surprise library (40) was used instead. This value is, as stated, the maximum number of neighbours which the algorithm will consider. If there are not 40 users (or items) which are similar enough to be considered neighbours, Surprise will use a lower amount (to a minimum of 1). Using a different maximum k value may have an impact on the results if this study's experiments are remade.

Every test result is a mean of 10 runs where the training and test data sets were randomized. This method was used because it was a fair compromise between correctness and the scope of the study. One can naturally get a more statistically sound value by averaging 1000 test runs instead of 10, but running the tests is
time consuming (computationally), and it is hard to set a limit for how many data points are needed for a fair assessment. One more thing which our method doesn't account for is outliers, which can skew the mean considerably. However, running each test only 10 times allowed us to see that no big statistical outliers were present in the mean calculations. This is shown in figures 3.2, 3.4, 3.6 and 3.8.

4.1 External dependencies

Two of the datasets, FilmTrust and CiaoDVD, were acquired from a scientific paper and not taken directly from their respective sources. They were both collected by crawling the websites while these were online (they have been shut down at the time of writing). This makes it hard to control the correctness of the data. The dataset from CiaoDVD came in a format incompatible with the Python program, so the data had to be processed and formatted, which leaves room for human error. An important attribute of the MovieLens dataset is that all users have made at least 20 ratings. There are no known similar minimum thresholds for the other datasets. To raise the confidence of the drawn conclusions, more datasets should be used, of varying sizes and from areas other than movie ratings. Initially the paper included a dataset from Yelp of restaurant reviews, but because of its different data format and time restrictions, this dataset could not be used in this study.

We have no reason to doubt the Surprise software. All our tests have returned reasonable results, and Surprise looks like a professionally built product for all intents and purposes. It is open source, actively maintained (the latest commit was within 24 hours of writing), well documented and written by a Ph.D. student at IRIT (Toulouse Institute of Computer Science Research). To confirm the accuracy of the software, one can use the same data sets and algorithms of this study, input these into another working recommender system and check if the results are identical.
4.2 State of the art and relevancy

Many companies use recommender systems today; some bigger ones are Amazon, Facebook, LinkedIn and YouTube. Finding out exactly what algorithms these companies use and how they are implemented has proven very difficult. There are two major reasons for this. One is that such information is part of their (often) closed source code. The other is that there is no simple answer to the question, as most modern recommender systems are based on a plethora of algorithms. One famous case where this was displayed was the Netflix Prize, a contest for developing a better recommender system for Netflix with a prize pool of a million dollars [20]. The best (winning) algorithms were in fact never implemented by Netflix, as their huge complexity and the engineering effort required overshadowed the slightly better predictions they would bring [21].

The relevancy of the study can be questioned since its scope is quite narrow. Limiting itself to only comparing the accuracy of the two methods and dismissing other factors, such as memory efficiency and computational demand/speed, may make the results irrelevant if one of the methods can't ever be feasibly applied because of such limitations. However, even if such limitations do exist, this and similar studies could provide valuable insight into whether pursuing a solution to such limitations is worth the effort.
More informationData can be in the form of numbers, words, measurements, observations or even just descriptions of things.
+ What is Data? Data is a collection of facts. Data can be in the form of numbers, words, measurements, observations or even just descriptions of things. In most cases, data needs to be interpreted and
More informationStatistics can best be defined as a collection and analysis of numerical information.
Statistical Graphs There are many ways to organize data pictorially using statistical graphs. There are line graphs, stem and leaf plots, frequency tables, histograms, bar graphs, pictographs, circle graphs
More informationCollaborative Filtering using Euclidean Distance in Recommendation Engine
Indian Journal of Science and Technology, Vol 9(37), DOI: 10.17485/ijst/2016/v9i37/102074, October 2016 ISSN (Print) : 0974-6846 ISSN (Online) : 0974-5645 Collaborative Filtering using Euclidean Distance
More informationStatistical Analysis of Metabolomics Data. Xiuxia Du Department of Bioinformatics & Genomics University of North Carolina at Charlotte
Statistical Analysis of Metabolomics Data Xiuxia Du Department of Bioinformatics & Genomics University of North Carolina at Charlotte Outline Introduction Data pre-treatment 1. Normalization 2. Centering,
More informationCollaborative Filtering using a Spreading Activation Approach
Collaborative Filtering using a Spreading Activation Approach Josephine Griffith *, Colm O Riordan *, Humphrey Sorensen ** * Department of Information Technology, NUI, Galway ** Computer Science Department,
More informationSTA 570 Spring Lecture 5 Tuesday, Feb 1
STA 570 Spring 2011 Lecture 5 Tuesday, Feb 1 Descriptive Statistics Summarizing Univariate Data o Standard Deviation, Empirical Rule, IQR o Boxplots Summarizing Bivariate Data o Contingency Tables o Row
More informationGLOSSARY OF TERMS. Commutative property. Numbers can be added or multiplied in either order. For example, = ; 3 x 8 = 8 x 3.
GLOSSARY OF TERMS Algorithm. An established step-by-step procedure used 1 to achieve a desired result. For example, the 55 addition algorithm for the sum of two two-digit + 27 numbers where carrying is
More informationUsing Excel for Graphical Analysis of Data
Using Excel for Graphical Analysis of Data Introduction In several upcoming labs, a primary goal will be to determine the mathematical relationship between two variable physical parameters. Graphs are
More informationKnowledge Discovery and Data Mining 1 (VO) ( )
Knowledge Discovery and Data Mining 1 (VO) (707.003) Data Matrices and Vector Space Model Denis Helic KTI, TU Graz Nov 6, 2014 Denis Helic (KTI, TU Graz) KDDM1 Nov 6, 2014 1 / 55 Big picture: KDDM Probability
More informationIntroduction to Data Mining
Introduction to Data Mining Lecture #7: Recommendation Content based & Collaborative Filtering Seoul National University In This Lecture Understand the motivation and the problem of recommendation Compare
More informationCS 124/LINGUIST 180 From Languages to Information
CS /LINGUIST 80 From Languages to Information Dan Jurafsky Stanford University Recommender Systems & Collaborative Filtering Slides adapted from Jure Leskovec Recommender Systems Customer X Buys CD of
More informationCPSC 340: Machine Learning and Data Mining. Kernel Trick Fall 2017
CPSC 340: Machine Learning and Data Mining Kernel Trick Fall 2017 Admin Assignment 3: Due Friday. Midterm: Can view your exam during instructor office hours or after class this week. Digression: the other
More informationUsing a percent or a letter grade allows us a very easy way to analyze our performance. Not a big deal, just something we do regularly.
GRAPHING We have used statistics all our lives, what we intend to do now is formalize that knowledge. Statistics can best be defined as a collection and analysis of numerical information. Often times we
More informationCS435 Introduction to Big Data Spring 2018 Colorado State University. 3/21/2018 Week 10-B Sangmi Lee Pallickara. FAQs. Collaborative filtering
W10.B.0.0 CS435 Introduction to Big Data W10.B.1 FAQs Term project 5:00PM March 29, 2018 PA2 Recitation: Friday PART 1. LARGE SCALE DATA AALYTICS 4. RECOMMEDATIO SYSTEMS 5. EVALUATIO AD VALIDATIO TECHIQUES
More informationES-2 Lecture: Fitting models to data
ES-2 Lecture: Fitting models to data Outline Motivation: why fit models to data? Special case (exact solution): # unknowns in model =# datapoints Typical case (approximate solution): # unknowns in model
More informationAverages and Variation
Averages and Variation 3 Copyright Cengage Learning. All rights reserved. 3.1-1 Section 3.1 Measures of Central Tendency: Mode, Median, and Mean Copyright Cengage Learning. All rights reserved. 3.1-2 Focus
More informationCISC 4631 Data Mining
CISC 4631 Data Mining Lecture 03: Nearest Neighbor Learning Theses slides are based on the slides by Tan, Steinbach and Kumar (textbook authors) Prof. R. Mooney (UT Austin) Prof E. Keogh (UCR), Prof. F.
More informationGlossary Common Core Curriculum Maps Math/Grade 6 Grade 8
Glossary Common Core Curriculum Maps Math/Grade 6 Grade 8 Grade 6 Grade 8 absolute value Distance of a number (x) from zero on a number line. Because absolute value represents distance, the absolute value
More informationSimilarity and recommender systems
Similarity and recommender systems Andreas C. Kapourani January 8 Introduction In this lab session we will work with some toy data and implement a simple collaborative filtering recommender system (RS),
More informationVCEasy VISUAL FURTHER MATHS. Overview
VCEasy VISUAL FURTHER MATHS Overview This booklet is a visual overview of the knowledge required for the VCE Year 12 Further Maths examination.! This booklet does not replace any existing resources that
More informationBuilding Better Parametric Cost Models
Building Better Parametric Cost Models Based on the PMI PMBOK Guide Fourth Edition 37 IPDI has been reviewed and approved as a provider of project management training by the Project Management Institute
More informationTowards a hybrid approach to Netflix Challenge
Towards a hybrid approach to Netflix Challenge Abhishek Gupta, Abhijeet Mohapatra, Tejaswi Tenneti March 12, 2009 1 Introduction Today Recommendation systems [3] have become indispensible because of the
More informationBar Graphs and Dot Plots
CONDENSED LESSON 1.1 Bar Graphs and Dot Plots In this lesson you will interpret and create a variety of graphs find some summary values for a data set draw conclusions about a data set based on graphs
More informationChallenges on Combining Open Web and Dataset Evaluation Results: The Case of the Contextual Suggestion Track
Challenges on Combining Open Web and Dataset Evaluation Results: The Case of the Contextual Suggestion Track Alejandro Bellogín 1,2, Thaer Samar 1, Arjen P. de Vries 1, and Alan Said 1 1 Centrum Wiskunde
More informationBy Atul S. Kulkarni Graduate Student, University of Minnesota Duluth. Under The Guidance of Dr. Richard Maclin
By Atul S. Kulkarni Graduate Student, University of Minnesota Duluth Under The Guidance of Dr. Richard Maclin Outline Problem Statement Background Proposed Solution Experiments & Results Related Work Future
More informationGraphical Analysis of Data using Microsoft Excel [2016 Version]
Graphical Analysis of Data using Microsoft Excel [2016 Version] Introduction In several upcoming labs, a primary goal will be to determine the mathematical relationship between two variable physical parameters.
More informationCCSSM Curriculum Analysis Project Tool 1 Interpreting Functions in Grades 9-12
Tool 1: Standards for Mathematical ent: Interpreting Functions CCSSM Curriculum Analysis Project Tool 1 Interpreting Functions in Grades 9-12 Name of Reviewer School/District Date Name of Curriculum Materials:
More informationImage Compression With Haar Discrete Wavelet Transform
Image Compression With Haar Discrete Wavelet Transform Cory Cox ME 535: Computational Techniques in Mech. Eng. Figure 1 : An example of the 2D discrete wavelet transform that is used in JPEG2000. Source:
More informationA Recommender System Based on Improvised K- Means Clustering Algorithm
A Recommender System Based on Improvised K- Means Clustering Algorithm Shivani Sharma Department of Computer Science and Applications, Kurukshetra University, Kurukshetra Shivanigaur83@yahoo.com Abstract:
More informationTechnical Arts 101 Prof. Anupam Saxena Department of Mechanical engineering Indian Institute of Technology, Kanpur. Lecture - 7 Think and Analyze
Technical Arts 101 Prof. Anupam Saxena Department of Mechanical engineering Indian Institute of Technology, Kanpur Lecture - 7 Think and Analyze Last time I asked you to come up with a single funniest
More informationCS246: Mining Massive Datasets Jure Leskovec, Stanford University
CS46: Mining Massive Datasets Jure Leskovec, Stanford University http://cs46.stanford.edu /7/ Jure Leskovec, Stanford C46: Mining Massive Datasets Many real-world problems Web Search and Text Mining Billions
More informationData Mining. Lecture 03: Nearest Neighbor Learning
Data Mining Lecture 03: Nearest Neighbor Learning Theses slides are based on the slides by Tan, Steinbach and Kumar (textbook authors) Prof. R. Mooney (UT Austin) Prof E. Keogh (UCR), Prof. F. Provost
More informationSample some Pi Monte. Introduction. Creating the Simulation. Answers & Teacher Notes
Sample some Pi Monte Answers & Teacher Notes 7 8 9 10 11 12 TI-Nspire Investigation Student 45 min Introduction The Monte-Carlo technique uses probability to model or forecast scenarios. In this activity
More informationPreparing for AS Level Further Mathematics
Preparing for AS Level Further Mathematics Algebraic skills are incredibly important in the study of further mathematics at AS and A level. You should therefore make sure you are confident with all of
More informationCourse Outline for Grade 12 College Foundations MAP4C
Course Outline for Grade 12 College Foundations MAP4C UNIT 1 TRIGONOMETRY Pearson Pg. 8-12 #2-5 1.1, Introduction to Trigonometry, Primary Trig Ratios C3.1 solve problems in two dimensions using metric
More informationInternational Journal of Advance Engineering and Research Development. A Facebook Profile Based TV Shows and Movies Recommendation System
Scientific Journal of Impact Factor (SJIF): 4.72 International Journal of Advance Engineering and Research Development Volume 4, Issue 3, March -2017 A Facebook Profile Based TV Shows and Movies Recommendation
More informationClustering. Robert M. Haralick. Computer Science, Graduate Center City University of New York
Clustering Robert M. Haralick Computer Science, Graduate Center City University of New York Outline K-means 1 K-means 2 3 4 5 Clustering K-means The purpose of clustering is to determine the similarity
More informationImproving Results and Performance of Collaborative Filtering-based Recommender Systems using Cuckoo Optimization Algorithm
Improving Results and Performance of Collaborative Filtering-based Recommender Systems using Cuckoo Optimization Algorithm Majid Hatami Faculty of Electrical and Computer Engineering University of Tabriz,
More informationStatistics 1 - Basic Commands. Basic Commands. Consider the data set: {15, 22, 32, 31, 52, 41, 11}
Statistics 1 - Basic Commands http://mathbits.com/mathbits/tisection/statistics1/basiccommands.htm Page 1 of 3 Entering Data: Basic Commands Consider the data set: {15, 22, 32, 31, 52, 41, 11} Data is
More informationCS 124/LINGUIST 180 From Languages to Information
CS /LINGUIST 80 From Languages to Information Dan Jurafsky Stanford University Recommender Systems & Collaborative Filtering Slides adapted from Jure Leskovec Recommender Systems Customer X Buys CD of
More information6 TOOLS FOR A COMPLETE MARKETING WORKFLOW
6 S FOR A COMPLETE MARKETING WORKFLOW 01 6 S FOR A COMPLETE MARKETING WORKFLOW FROM ALEXA DIFFICULTY DIFFICULTY MATRIX OVERLAP 6 S FOR A COMPLETE MARKETING WORKFLOW 02 INTRODUCTION Marketers use countless
More informationSurvey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9
Survey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9 Contents 1 Introduction to Using Excel Spreadsheets 2 1.1 A Serious Note About Data Security.................................... 2 1.2
More informationHierarchical Clustering
What is clustering Partitioning of a data set into subsets. A cluster is a group of relatively homogeneous cases or observations Hierarchical Clustering Mikhail Dozmorov Fall 2016 2/61 What is clustering
More informationPersonalized Web Search
Personalized Web Search Dhanraj Mavilodan (dhanrajm@stanford.edu), Kapil Jaisinghani (kjaising@stanford.edu), Radhika Bansal (radhika3@stanford.edu) Abstract: With the increase in the diversity of contents
More informationCOMP 465: Data Mining Recommender Systems
//0 movies COMP 6: Data Mining Recommender Systems Slides Adapted From: www.mmds.org (Mining Massive Datasets) movies Compare predictions with known ratings (test set T)????? Test Data Set Root-mean-square
More informationLagrange Multipliers and Problem Formulation
Lagrange Multipliers and Problem Formulation Steven J. Miller Department of Mathematics and Statistics Williams College Williamstown, MA 01267 Abstract The method of Lagrange Multipliers (and its generalizations)
More informationCS 124/LINGUIST 180 From Languages to Information
CS /LINGUIST 80 From Languages to Information Dan Jurafsky Stanford University Recommender Systems & Collaborative Filtering Slides adapted from Jure Leskovec Recommender Systems Customer X Buys Metallica
More informationCity, University of London Institutional Repository. This version of the publication may differ from the final published version.
City Research Online City, University of London Institutional Repository Citation: Überall, Christian (2012). A dynamic multi-algorithm collaborative-filtering system. (Unpublished Doctoral thesis, City
More informationSmarter Balanced Vocabulary (from the SBAC test/item specifications)
Example: Smarter Balanced Vocabulary (from the SBAC test/item specifications) Notes: Most terms area used in multiple grade levels. You should look at your grade level and all of the previous grade levels.
More informationAlgebra 2 Chapter Relations and Functions
Algebra 2 Chapter 2 2.1 Relations and Functions 2.1 Relations and Functions / 2.2 Direct Variation A: Relations What is a relation? A of items from two sets: A set of values and a set of values. What does
More informationCPSC 340: Machine Learning and Data Mining
CPSC 340: Machine Learning and Data Mining Fundamentals of learning (continued) and the k-nearest neighbours classifier Original version of these slides by Mark Schmidt, with modifications by Mike Gelbart.
More informationI can solve simultaneous equations algebraically, where one is quadratic and one is linear.
A* I can manipulate algebraic fractions. I can use the equation of a circle. simultaneous equations algebraically, where one is quadratic and one is linear. I can transform graphs, including trig graphs.
More informationMiddle School Math Course 3
Middle School Math Course 3 Correlation of the ALEKS course Middle School Math Course 3 to the Texas Essential Knowledge and Skills (TEKS) for Mathematics Grade 8 (2012) (1) Mathematical process standards.
More informationRecommender Systems 6CCS3WSN-7CCSMWAL
Recommender Systems 6CCS3WSN-7CCSMWAL http://insidebigdata.com/wp-content/uploads/2014/06/humorrecommender.jpg Some basic methods of recommendation Recommend popular items Collaborative Filtering Item-to-Item:
More informationClustering and Visualisation of Data
Clustering and Visualisation of Data Hiroshi Shimodaira January-March 28 Cluster analysis aims to partition a data set into meaningful or useful groups, based on distances between data points. In some
More informationMastery. PRECALCULUS Student Learning Targets
PRECALCULUS Student Learning Targets Big Idea: Sequences and Series 1. I can describe a sequence as a function where the domain is the set of natural numbers. Connections (Pictures, Vocabulary, Definitions,
More informationCorrelation. January 12, 2019
Correlation January 12, 2019 Contents Correlations The Scattterplot The Pearson correlation The computational raw-score formula Survey data Fun facts about r Sensitivity to outliers Spearman rank-order
More informationData Mining. ❷Chapter 2 Basic Statistics. Asso.Prof.Dr. Xiao-dong Zhu. Business School, University of Shanghai for Science & Technology
❷Chapter 2 Basic Statistics Business School, University of Shanghai for Science & Technology 2016-2017 2nd Semester, Spring2017 Contents of chapter 1 1 recording data using computers 2 3 4 5 6 some famous
More informationCPSC 340: Machine Learning and Data Mining. Recommender Systems Fall 2017
CPSC 340: Machine Learning and Data Mining Recommender Systems Fall 2017 Assignment 4: Admin Due tonight, 1 late day for Monday, 2 late days for Wednesday. Assignment 5: Posted, due Monday of last week
More informationFractions. 7th Grade Math. Review of 6th Grade. Slide 1 / 306 Slide 2 / 306. Slide 4 / 306. Slide 3 / 306. Slide 5 / 306.
Slide 1 / 06 Slide 2 / 06 7th Grade Math Review of 6th Grade 2015-01-14 www.njctl.org Slide / 06 Table of Contents Click on the topic to go to that section Slide 4 / 06 Fractions Decimal Computation Statistics
More informationCS246: Mining Massive Datasets Jure Leskovec, Stanford University
CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu /2/8 Jure Leskovec, Stanford CS246: Mining Massive Datasets 2 Task: Given a large number (N in the millions or
More informationRecommender Systems using Collaborative Filtering D Yogendra Rao
Recommender Systems using Collaborative Filtering D Yogendra Rao Department of Computer Science and Engineering National Institute of Technology Rourkela Rourkela 769 008, India Recommender Systems using
More informationPart 12: Advanced Topics in Collaborative Filtering. Francesco Ricci
Part 12: Advanced Topics in Collaborative Filtering Francesco Ricci Content Generating recommendations in CF using frequency of ratings Role of neighborhood size Comparison of CF with association rules
More informationTips and Guidance for Analyzing Data. Executive Summary
Tips and Guidance for Analyzing Data Executive Summary This document has information and suggestions about three things: 1) how to quickly do a preliminary analysis of time-series data; 2) key things to
More informationData Mining Techniques
Data Mining Techniques CS 60 - Section - Fall 06 Lecture Jan-Willem van de Meent (credit: Andrew Ng, Alex Smola, Yehuda Koren, Stanford CS6) Recommender Systems The Long Tail (from: https://www.wired.com/00/0/tail/)
More informationDemystifying movie ratings 224W Project Report. Amritha Raghunath Vignesh Ganapathi Subramanian
Demystifying movie ratings 224W Project Report Amritha Raghunath (amrithar@stanford.edu) Vignesh Ganapathi Subramanian (vigansub@stanford.edu) 9 December, 2014 Introduction The past decade or so has seen
More informationLines of Symmetry. Grade 3. Amy Hahn. Education 334: MW 8 9:20 a.m.
Lines of Symmetry Grade 3 Amy Hahn Education 334: MW 8 9:20 a.m. GRADE 3 V. SPATIAL SENSE, GEOMETRY AND MEASUREMENT A. Spatial Sense Understand the concept of reflection symmetry as applied to geometric
More information