Study on Recommendation Systems and their Evaluation Metrics PRESENTATION BY : KALHAN DHAR
Agenda Recommendation Systems Motivation Research Problem Approach Results References
Business Motivation What is the best recommendation system for my business?
Research Motivation Lack of a more comprehensive approach for evaluating recommendation systems: To understand some of the unexplored metrics pertaining to recommendation algorithms. To combine some of these unexplored metrics with explored ones and evaluate variations. How to interpret the evaluation results? How to choose a recommendation system for your application or business?
What are Recommendation Systems? Recommender systems, or recommendation systems, are a subclass of information filtering systems that seek to predict the "rating" or "preference" that a user would give to an item. How do they help? Recommending good items of user interest. Rating predictions. What's good? Better accuracy. Precision/Recall.
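As a rough illustration of the precision/recall idea mentioned above, here is a minimal Python sketch. The slides do not say how precision and recall were computed in the study (the work itself used R and MySQL), so the top-k formulation, the function name `precision_recall_at_k` and the sample item ids are all assumptions made for this example.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision and recall of the top-k recommended items.

    recommended: ranked list of item ids (best first)
    relevant:    set of item ids the user actually liked
    """
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: five recommended movies, three movies the user liked.
recommended = ["m1", "m2", "m3", "m4", "m5"]
relevant = {"m2", "m5", "m9"}
precision, recall = precision_recall_at_k(recommended, relevant, k=5)
print(f"precision@5={precision:.2f} recall@5={recall:.2f}")
```

Precision answers "how many of the recommended items were good?", while recall answers "how many of the good items were recommended?".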
Approaches Content-Based Recommendation: Creation of Item Profiles (Product Recommendations: Type of Product, Usage, Cost, Family, Popularity etc.); Creation of User Profiles (Purchase history, User Info, User Involvement etc.). Collaborative Filtering Recommendation Systems: Item-Item CF, User-User CF.
Tasks Recommendation Algorithms Study: Implementation of Algorithms and Variations; Comparing Accuracy on Known Metrics. Exploring Unknown Metrics: Going through literature on unknown metrics; Implementing some unknown metric and comparing results. Results and Observations.
Data Set and Technologies Used MovieLens dataset: ~700 users, ~100,000 ratings, ~9,000 movies. Software used: R, MySQL.
Variations in Recommendation Systems
Variations are simple intuitions being captured Pearson Correlation: more commonly known as centered cosine. Useful in recommendations based on discrete values like ratings or reviews. Jaccard Similarity: size of intersection / size of union. Useful in product recommendation engines where user click activity is recorded. Weighted Average: generally used with CF recommendations, where ratings or reviews are available. Range Conversion: NewValue = (((OldValue - OldMin) * (NewMax - NewMin)) / (OldMax - OldMin)) + NewMin. A simple conversion from a weight to a rating prediction. Can be used in the initial phases of recommendation implementation. The notion of percentiles is an extension of this idea.
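The four variations above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation used in the study (which was done in R and MySQL): the function names, the dictionary-based rating format, the choice to compute Pearson only over co-rated items, and the normalisation by the sum of absolute similarities in the weighted average are all assumptions.

```python
from math import sqrt


def pearson(ratings_a, ratings_b):
    """Pearson correlation over the items both users rated.

    ratings_a, ratings_b: dicts mapping item id -> rating.
    Assumption: means are taken over the co-rated items only.
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0
    a = [ratings_a[i] for i in common]
    b = [ratings_b[i] for i in common]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = sqrt(sum((x - mean_a) ** 2 for x in a)) * sqrt(sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0


def jaccard(set_a, set_b):
    """Size of intersection divided by size of union."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def weighted_average(neighbours):
    """Predict a rating from (similarity, rating) pairs of neighbours."""
    total = sum(abs(s) for s, _ in neighbours)
    return sum(s * r for s, r in neighbours) / total if total else None


def convert_range(old_value, old_min, old_max, new_min, new_max):
    """Range conversion formula from the slide."""
    return ((old_value - old_min) * (new_max - new_min)) / (old_max - old_min) + new_min


# Hypothetical usage
print(jaccard({1, 2, 3}, {2, 3, 4}))            # 0.5
print(convert_range(0.5, 0.0, 1.0, 1.0, 5.0))   # 3.0
print(weighted_average([(0.9, 5), (0.1, 3)]))
print(pearson({"a": 1, "b": 2, "c": 3}, {"a": 2, "b": 4, "c": 6}))
```

In the last call the two users rate on different scales but in the same order, so their correlation is close to 1; this is the intuition behind centering the ratings before comparing them.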
Serendipity/Novelty Recommendations which the user did not know about. Example: Movie Recommendations. A recommendation was presented to the user which was different from the usual recommendations. Deviation from the natural prediction order. Idea of a threshold: above it, the concept of similarity is flawed and similarity becomes redundancy.
Redundancy Thresholds Redundancy begins after the threshold
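One way to read the threshold idea is as a filter on the ranked list: a candidate is dropped when it is too similar to something already selected. The slides do not specify how the threshold was applied, so the greedy strategy below, the function name `filter_redundant`, the genre-based Jaccard similarity and the sample data are all assumptions made for illustration.

```python
def filter_redundant(ranked_items, similarity, threshold):
    """Keep items in rank order, skipping any item whose similarity to an
    already kept item exceeds the threshold (treated as redundant)."""
    selected = []
    for item in ranked_items:
        if all(similarity(item, kept) <= threshold for kept in selected):
            selected.append(item)
    return selected


# Hypothetical movie genres used to measure similarity.
genres = {
    "A": {"action", "scifi"},
    "B": {"action", "scifi"},
    "C": {"drama"},
    "D": {"action", "thriller"},
}


def genre_similarity(x, y):
    """Jaccard similarity of the two movies' genre sets."""
    union = genres[x] | genres[y]
    return len(genres[x] & genres[y]) / len(union) if union else 0.0


kept = filter_redundant(["A", "B", "C", "D"], genre_similarity, threshold=0.8)
print(kept)  # ['A', 'C', 'D']
```

Movie B is dropped because its genres are identical to those of A, which is already in the list; lowering the threshold removes more items and trades similarity for novelty.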
Diversity Diversity metrics refer to the level of diversity of the recommendations. Example: Vacation hotel bookings. Generally defined as the opposite of similarity. Diversity may come at the expense of similarity. Creating an Accuracy-Diversity Curve can help in getting the threshold value.
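Taking "the opposite of similarity" literally, the diversity of a recommendation list can be sketched as the average pairwise dissimilarity (1 - similarity) of its items. This is a common formulation in the diversity literature, but the slides do not confirm that it is the one used in the study; the function name `intra_list_diversity`, the genre-based similarity and the sample data are assumptions.

```python
from itertools import combinations


def intra_list_diversity(items, similarity):
    """Average of (1 - similarity) over all unordered pairs of items."""
    pairs = list(combinations(items, 2))
    if not pairs:
        return 0.0
    return sum(1.0 - similarity(a, b) for a, b in pairs) / len(pairs)


# Hypothetical genre sets for three recommended movies.
genres = {
    "A": {"action", "scifi"},
    "B": {"action", "scifi"},
    "C": {"drama"},
}


def genre_similarity(x, y):
    """Jaccard similarity of the two movies' genre sets."""
    union = genres[x] | genres[y]
    return len(genres[x] & genres[y]) / len(union) if union else 0.0


diversity = intra_list_diversity(["A", "B", "C"], genre_similarity)
print(round(diversity, 3))
```

Computing this value for lists produced at different similarity thresholds gives the points needed to plot diversity against accuracy or similarity.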
Similarity-Diversity Curve The plot here shows the curve between similarity and diversity. A, B and C represent certain variations of the recommendation algorithms used. The point of intersection of the graphs represents the recommendation system with optimum similarity with respect to an optimum diversity.
Results Table
Interpretation of Results Interpretation may depend upon the kind of service and business where the recommendation system is used. Health: a focus on Accuracy and Precision is important in health care. "Health Recommender Systems: Concepts, Requirements, Technical Basics and Challenges" gives examples of applications of recommendation engines in health care. Online Products: the Similarity-Diversity Curve and Novelty should be preferred. "Recommender Systems in E-Commerce" gives details about the usage of recommendation engines in the field of e-commerce. Tourism: Novelty and Diversity are important metrics. "Travel Recommendations in a Mobile Tourist Information System" discusses the utility of recommendation systems in the tourism and travel industry.
Issues faced during the implementation Small data can become big data in no time: several manipulations on the data set required a cross product of users and items. One intermediate dataset reached 5.6 billion rows. This increase impacted the data size hugely. Time is money: certain queries had to be left running overnight to produce results. Need to experiment with different R libraries to solve this issue. Many skewed joins in certain places. Hardware constraints: only 4 GB RAM, 256 GB disk.
Weaknesses Results and interpretations are limited to the MovieLens dataset being used and have not been evaluated on any other dataset from a different domain. Implementations are constrained by the characteristics of the dataset. The number of features is limited. The dataset is static in nature. Several assumptions had to be made for the metric evaluations. Implementation is influenced by the hardware configuration. It is only the creator who really knows the flaws of his design.
Future work Creation of an end-to-end mechanism for evaluating recommendation systems. Comparison of datasets from different data sources. To be able to compare several unexplored metrics such as Robustness, Adaptability and Risk. To be able to have a standardized definition of these unexplored metrics. To bring these unexplored metrics into the mainstream of recommendation system evaluation.
References Ziegler, C.N., McNee, S.M., Konstan, J.A., Lausen, G.: Improving recommendation lists through topic diversification. In: WWW '05: Proceedings of the 14th international conference on World Wide Web, pp. 22-32. ACM, New York, NY, USA (2005) Zhang, Y., Callan, J., Minka, T.: Novelty and redundancy detection in adaptive filtering. In: SIGIR '02: Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval, pp. 81-88. ACM, New York, NY, USA (2002) Zhang, M., Hurley, N.: Avoiding monotony: improving the diversity of recommendation lists. In: RecSys '08: Proceedings of the 2008 ACM conference on Recommender systems, pp. 123-130. ACM, New York, NY, USA (2008) Bradley, K., Smyth, B.: Improving recommendation diversity. In: Twelfth Irish Conference on
References Continued Shani, G., Gunawardana, A.: A Survey of Accuracy Evaluation Metrics of Recommendation Tasks. Schafer, J.B., Konstan, J., Riedl, J.: Recommender Systems in E-Commerce. Ekstrand, M.D., Riedl, J.T., Konstan, J.A.: Collaborative Filtering Recommender Systems. Health Recommender Systems: Concepts, Requirements, Technical Basics and Challenges. Travel Recommendations in a Mobile Tourist Information System.
Image References Dexter: http://jtwitch88.deviantart.com/art/dexter-578146210 Book: http://www.freestockphotos.biz/stockphoto/14324 User: https://commons.wikimedia.org/wiki/file:system-users.svg MovieLens: https://fi.wikipedia.org/wiki/tiedosto:movielens-helping.gif R: https://en.wikipedia.org/wiki/r_(programming_language) MySQL: https://en.wikipedia.org/wiki/file:mysql_logo.png Novelty: https://en.wikipedia.org/wiki/file:novelty.jpg.jpg Diversity: http://calicospanish.com/a-rainbow-of-diversity-diverse-methods-and-styles-in-worldlanguage-teaching/
Image References Continued Agenda: https://pixabay.com/en/photos/agenda/ Amazon: https://commons.wikimedia.org/wiki/file:amazon.com-logo.svg Google: https://commons.wikimedia.org/wiki/file:googlechangedlogo.jpg Youtube: https://commons.wikimedia.org/wiki/file:logo_youtube.svg Netflix:https://commons.wikimedia.org/wiki/File:Netflix_logo.svg Analyst: http://www.thebluediamondgallery.com/scrabble/a/analyst.html Question: https://commons.wikimedia.org/wiki/file:question_mark.svg
Q and A