Indoor Location-based Recommender System. Zhongduo Lin


Indoor Location-based Recommender System
by
Zhongduo Lin
A thesis submitted in conformity with the requirements for the degree of Master of Applied Science
Graduate Department of Electrical and Computer Engineering
University of Toronto
Copyright © 2013 by Zhongduo Lin

Abstract

Indoor Location-based Recommender System
Zhongduo Lin
Master of Applied Science
Graduate Department of Electrical and Computer Engineering
University of Toronto
2013

WiFi-based indoor localization is emerging as a new positioning technology. In this work, we present our efforts to find the best recommender system based on the indoor location tracks collected from the Bow Valley shopping mall for one week. The time a user spends in a shop is considered as an implicit preference and different mapping algorithms are proposed to map the time to a more realistic rating value. A new distribution error metric is proposed to examine the mapping algorithms. Eleven different recommender systems are built and evaluated in terms of accuracy and execution time. The Slope-One recommender system with a logarithmic mapping algorithm is finally selected with a score of 1.292, distribution error of and execution time of 0.39 seconds for ten runs.

Acknowledgements

I would like to thank my supervisor, Prof. Paul Chow, for his great guidance throughout the past two years. Thank you for allowing me to explore what I like and giving me full support when I didn't do well in my work. I also want to acknowledge all the support from my group members, who are always willing to help me whenever needed. I thank all my colleagues at Cisco Systems, especially my mentor Vince Mammoliti. Thank you for being such a great friend and philosopher. I greatly appreciate the encouragement from all my friends in the past two years. Thank you for cheering me up when I am down.

Contents

1 Introduction
    Contribution
    Thesis Organization
2 Recommender System Overview
    Recommender System Model
    Recommender System Classifications
    Categories based on solutions
    Categories based on information collecting methods
    Categories based on evaluation methods
    Collaborative Filtering Algorithms
    User-based CF algorithm
    Item-based CF algorithm
    Slope-One algorithm
    Similarity metrics
3 Related Work
4 Collecting Data
    Indoor Localization with WiFi
    Data Collection Architecture
    4.3 The Bow Valley Square Data Set
    Definition of terms
    Building test beds
    Accuracy analysis
    Preprocessing algorithms
    Statistical analysis
5 Building the Recommender Systems
    System Overview
    Interaction between MSE and device
    Interaction between MSE and server
    Interaction between server and device
    Mahout
    Implicit Preference Mapping Functions
    Mapping functions
    Evaluation metrics
6 Methodology and Evaluation
    Methodology
    Experimental platform
    Test cases
    Evaluation methods
    Evaluation
    Evaluation of non-scaling mapping functions
    Evaluation of mapping functions with scaling
    Execution time evaluation
7 Conclusion
8 Future Work
Bibliography

List of Tables

2.1 Term definitions for different RSs
A simple example of a utility matrix
Classification of recommender systems research [1]
A simple example of average difference and root-mean-square difference [2]
GroupLens 1M breakdown
Comparison of RSs with non-scaling mapping functions
Comparison of RSs with scaling mapping functions
Distribution error for scaling mapping functions
Comparison in terms of execution time

List of Figures

4.1 Typical wireless controller to AP deployment for location [3]
MSE High Level Architecture [3]
Interaction with the MSE API [3]
Device occurrence distribution
Number of points per path
Path duration distribution
Service System Overview
Mahout Framework
Implicit recommender system overview
User-based RS with non-scaling mapping functions
User-based RS with scaling mapping functions

Chapter 1 Introduction

In the 2002 movie Minority Report [4], Tom Cruise walks through a shopping mall and is targeted with personalized advertising. This is a highly sophisticated form of a Location-Based Service, that is, a service provided based on the location of a user. The advertising Cruise receives also recommends products, such as cars and beer. This is a Recommender System (RS) that utilizes the location-based service to make a recommendation based on the user's location. This thesis develops the first WiFi-based Recommender System as a platform for further exploration of such systems.

With the development of sensing and automated means of perceiving the physical environment, it is possible to collect much more implicit context with our everyday electronic devices such as smart phones or personal digital assistants (PDAs). Among these contexts, Ljungstrand [5] predicted that location-based services (LBSs) [6] will be the most common form of context-aware computing [7]. Traditionally, LBSs were designed to support outdoor applications such as navigation and fleet management. While the Global Positioning System (GPS) [8] has achieved great success and popularity worldwide over the past decade, indoor LBSs are a marketing tool with the potential to increase business profit. The growing interest in this technology can be demonstrated by the recent actions of large technology companies.

1. March 24th, 2013, Apple acquired WiFiSlam [9], an indoor GPS startup that enables a smart phone to pinpoint its location.
2. September 26th, 2012, Cisco announced the acquisition of ThinkSmart Technologies [10], a startup that analyses indoor location information based on Cisco's wireless networking infrastructure.
3. April 27th, 2010, Google [11] submitted a white paper to several national data protection authorities on vehicle-based collection of WiFi data for use in Google location-based services.

Often used in conjunction with an LBS is a Recommender System (RS). Recommender systems [12, 13, 14] are a subclass of information filtering systems that seek to predict the rating or preference that a user would give to an item (such as music, books, or movies) or social element (e.g. people or groups) they had not yet considered, using a model built from the characteristics of an item (content-based approaches) or the user's social environment (collaborative filtering approaches). Recommender systems have become an extremely common context-aware service in recent years.

One main area of research has focused on building recommender systems based on user trajectories. Application domains include mobile social networking, tourism guides, urban computing and information retrieval. While a few research papers have recently been published on location-based recommender systems using GPS tracks, there are even fewer for indoor LBSs due to the lack of a standard for indoor positioning, which will be discussed in Chapter 4. Compared to GPS, WiFi-based indoor positioning systems introduce different noise due to the instability of WiFi signals, and present unique characteristics that will be discussed in Section 4.1.

1.1 Contribution

This work details our efforts to find the best recommender system implementation based on real-world mobile tracks in a shopping mall, aiming to recommend the shops in the mall that will interest customers according to their track history. Over ten thousand customers were tracked for a week in the Bow Valley Square shopping mall [15]. This data is used to develop a recommender system that predicts the preference a user would give to each shop and recommends the top shops to the user. The main contributions of this work are:

1. The first location-based recommender system built using WiFi positioning technology;
2. A comparison between different recommender systems using WiFi positioning;
3. The proposal of different mapping functions for implicit information and a new evaluation metric for the mapping functions.

Though this work focuses on a specific data set, it is expected that the methods developed in this thesis will work for other applications using the same positioning infrastructure. For example, Cisco's acquisition, ThinkSmart Technologies, has demonstrated success on other applications such as airport planning and museum tour guide systems using the same positioning system.

1.2 Thesis Organization

The remainder of this thesis is organized as follows. Chapter 2 provides an overview of the state-of-the-art technologies used in recommender systems and the different categories of recommender systems. This overview will help with understanding the basic data collection methods in Chapter 4 and the evaluation methodology in Chapter 6. A list of

related work, including outdoor location-based recommender systems and generic services based on indoor location, is presented in Chapter 3. Chapter 4 details our data collecting method and a simple analysis of the data set. Chapter 5 presents the whole process of building the recommender systems. Our methodology and evaluation results are detailed in Chapter 6. Chapter 7 provides the conclusion and Chapter 8 describes future work.

Chapter 2 Recommender System Overview

Interest in recommender systems has been high in industry and academia since the mid-1990s, when the first papers on collaborative filtering were published and the technique proved to be useful [16, 17, 18]. Research into recommender systems received a significant boost when Netflix offered a prize of $1,000,000 to the first person or team to beat its own recommender algorithm, CineMatch, by 10% [2]. This chapter briefly introduces the basic ideas, concepts and techniques for general recommender systems. A more detailed and comprehensive overview of recommender systems can be found in [19].

2.1 Recommender System Model

There are three subjects in the data sets used to build a recommender system: user, item and preference. Because the Amazon book recommender system [20] is likely to be familiar to most readers, it is used as an analogy, alongside the shop recommender system (Shop RS) developed in this thesis, to illustrate these three terms. Table 2.1 compares the Amazon Book RS and the Shop RS in terms of the definitions of the three subjects. Generally, the users are the customers who make the decisions and are the ones to whom the recommender system aims to recommend items. The items are the different objects available for the users to choose. They can

be books for the Amazon Book RS, places for a tourism recommender system, etc. The preference refers to the degree to which a user likes a certain item. The most popular way to represent the preference is by explicit rating values ranging from 1 to 5, as used by the Amazon online system [20] and BestBuy [21]. Preferences can also be implied by user behaviours such as clicks or time spent on a certain web page, which will be discussed later in Section 2.2.

Table 2.1: Term definitions for different RSs

              Amazon Book RS                     Shop RS
  user        customers on the online system     customers in the shopping mall
  item        books on the system                shops in the shopping mall
  preference  ratings given by users to items    time users spend in certain shops

The data set used in a recommender system is typically represented by a matrix that is referred to as a utility matrix. A simple example for the Shop RS is illustrated in Table 2.2. The users are represented by their user IDs, such as 56, in the leftmost column, while the items are represented by the names of the shops. The preference values in the example are the actual number of seconds the user spent in the shops. Notice that most values in the preference field are left blank, which indicates that the users have never entered the corresponding shops. In reality, since a row in the utility matrix includes the preferences for every shop in the shopping mall, the matrix will be much more sparse than in the example, with the typical user visiting only a tiny fraction of the available shops.

Table 2.2: A simple example of a utility matrix

  User ID   CIBC   Copy Centre   Rise Bakery   Cafe X-Press   Second Cup
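The utility matrix described above can be sketched as a dictionary of dictionaries, where a missing key plays the role of a blank cell. This is only an illustration: the visit records, the function names `build_utility_matrix` and `sparsity`, and all of the numbers are hypothetical and are not taken from the Bow Valley Square data set.

```python
from collections import defaultdict

# Hypothetical visit records: (user ID, shop name, seconds spent in the shop).
visits = [
    (56, "Second Cup", 300),
    (56, "CIBC", 120),
    (78, "Rise Bakery", 45),
    (78, "Second Cup", 600),
    (91, "Copy Centre", 200),
]


def build_utility_matrix(records):
    """Sum the time per (user, shop); a missing key means 'never visited'."""
    matrix = defaultdict(dict)
    for user, shop, seconds in records:
        matrix[user][shop] = matrix[user].get(shop, 0) + seconds
    return dict(matrix)


def sparsity(matrix):
    """Fraction of (user, shop) cells that hold no preference value."""
    shops = {shop for row in matrix.values() for shop in row}
    filled = sum(len(row) for row in matrix.values())
    return 1.0 - filled / (len(matrix) * len(shops))


utility = build_utility_matrix(visits)
print(utility[56])                  # preferences of user 56, in seconds
print(round(sparsity(utility), 3))  # share of blank cells
```

Storing only the observed cells keeps memory proportional to the number of visits rather than to the number of users times the number of shops, which matters when most cells are blank.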

2.2 Recommender System Classifications

There are many different ways to classify and identify a recommender system based on the attributes or techniques used. Similar to [13], in this section we focus on the three major attributes that apply to most recommender systems and can be used to identify the recommender system developed in this thesis: the recommendation solutions adopted, the information collection methods and the evaluation methods.

2.2.1 Categories based on solutions

A blank preference can be estimated in different ways using methods from machine learning, approximation theory and various heuristics. A recommender system solution refers to the way the recommender system is built. Following the classifications in [22], recommender systems can be classified into three categories based on their approach:

1. Content-based recommender: The RS recommends items similar to the ones the user preferred in the past,
2. Collaborative recommender: The RS recommends items that similar users preferred in the past,
3. Hybrid recommender: A combined approach using both content-based and collaborative recommenders.

In a content-based recommender, a record or collection of records representing the important characteristics of an item, often referred to as an item profile, must be constructed to classify different items. For example, in a music recommender system, a profile can include features such as the composer, the singer, the genre and the year. There are various techniques to automatically extract characteristics from items, among which the most widely used is Term Frequency/Inverse Document Frequency (TF-IDF), which is used to specify keyword weights in text-based items. The idea of

content-based recommender systems is based on the observation that people tend to like items that are similar to those they already prefer. Assuming that some users like a song from Avril Lavigne, it is likely that they will like other songs from her. While this approach works quite well in practice, it suffers from several drawbacks:

1. Limited content analysis: Not only is it time consuming when some features, such as music genre, have to be entered manually, but the features are not sufficient to reflect differences in quality or popularity among items with the same features.
2. Domain-specialization: It is almost impossible to decide on a set of general features for all items. For example, a book can be well described by its page count, author and publisher, while none of these features can be applied to a bookmark. However, one can easily see the connection between a book and a bookmark.

These drawbacks prevent researchers from developing a general framework for content-based recommender systems [2].

An alternative is to use collaborative methods, which are attracting strong interest among researchers and are also the approach of this thesis. Instead of depending on the features of items to determine their similarity, the collaborative approach leverages the similarity of user ratings for co-rated items. More details about the collaborative approach will be discussed in Section 2.3. The main limitation of this approach is the sparsity of the user ratings. In any recommender system, the number of ratings already obtained is usually very small compared to the number of ratings that have to be predicted. For example, there may be millions of books in the Amazon book recommender system, while the average number of books that a user rates may be below five.

To overcome the limitations of content-based and collaborative recommender systems, a hybrid approach is proposed that combines collaborative and content-based recommender techniques. Almost all modern recommender systems can be classified into this category since they utilize, to some degree, techniques from both content-based and collaborative methods. Depending on how these two methods are combined, hybrid recommender systems can be further classified as: (1) combining totally separate recommender systems, (2) adding the characteristics of one to the other, (3) constructing a general unifying model that incorporates the characteristics of both recommender systems. Since this is not the focus of this thesis, the discussion will end here. More details can be found in [1].

Table 2.3, reproduced from [1], shows a comprehensive summary of the techniques used in each category and example research efforts. A brief introduction to heuristic-based and model-based techniques is given later in this chapter.

2.2.2 Categories based on information collecting methods

Information collecting methods refer to the ways that user feedback is collected. Depending on the level of user involvement, recommender systems can be classified as follows:

1. Intrusive recommender system: A significant level of user involvement is required to get the feedback.
2. Non-intrusive recommender system: Little or no explicit user involvement is required to get the feedback.

Most practical recommender systems today are intrusive. The most widely used way to collect user feedback is to explicitly ask users to rate the items they have reviewed or purchased. For example, in almost all online shopping systems, such as BestBuy and eBay, customers are encouraged to rate the items they bought. While predominant, intrusive recommender systems suffer from the fact that users often do not bother to come up with an appropriate rating [17], resulting in an extremely small number of ratings per user.

An alternative way to get feedback is to leverage certain proxies to estimate the real rating a user would give to an item. Minimizing intrusiveness while keeping the accuracy

Table 2.3: Classification of recommender systems research [1]

Content-based, heuristic-based
  Commonly used techniques: TF-IDF (information retrieval); Clustering
  Representative research examples: Lang 1995; Balabanovic & Shoham 1997; Pazzani & Billsus 1997

Content-based, model-based
  Commonly used techniques: Bayesian classifiers; Clustering; Decision trees; Artificial neural networks
  Representative research examples: Pazzani & Billsus 1997; Mooney et al; Mooney & Roy 1999; Billsus & Pazzani 1999, 2000

Collaborative, heuristic-based
  Commonly used techniques: Nearest neighbor (cosine, correlation); Clustering; Graph theory
  Representative research examples: Resnick et al; Hill et al; Shardanand & Maes 1995; Breese et al; Nakamura & Abe 1998; Aggarwal et al; Delgado & Ishii 1999; Pennock & Horwitz 1999; Sarwar et al

Collaborative, model-based
  Commonly used techniques: Bayesian networks; Clustering; Artificial neural networks; Linear regression; Probabilistic models
  Representative research examples: Billsus & Pazzani 1998; Breese et al; Ungar & Foster 1998; Chien & George 1999; Getoor & Sahami 1999; Pennock & Horwitz 1999; Goldberg et al; Kumar et al; Pavlov & Pennock 2002; Shani et al; Yu et al. 2002, 2004; Hofmann 2003, 2004; Marlin 2003; Si & Jin 2003

Hybrid, heuristic-based
  Combining content-based and collaborative components using: Linear combination of predicted ratings; Various voting schemes; Incorporating one component as a part of the heuristic for the other
  Representative research examples: Balabanovic & Shoham 1997; Claypool et al; Good et al; Pazzani 1999; Billsus & Pazzani 2000; Tran & Cohen 2000; Melville et al; Zhang et al

Hybrid, model-based
  Combining content-based and collaborative components by: Incorporating one component as a part of the model for the other; Building one unifying model
  Representative research examples: Basu et al; Condliff et al; Soboroff & Nicholas 1999; Ansari et al; Popescul et al; Schein et al

of recommendations is still an important research topic because of its difficulty and promising potential. The non-intrusive approach is mostly used in online systems such as newsgroup article recommender systems, and the most common proxies are click behaviour and the time spent on an article. Morita and Shinoda [23] found that user preferences for NetNews articles are well reflected by the time spent reading these articles, regardless of the length of the article. The recommender system described in this thesis is a non-intrusive recommender system, since the time users spend in a certain shop is used as a proxy to estimate their ratings. After determining the proxy, another important factor that influences the effectiveness of the recommender system is how to derive an appropriate estimate of preference from the implicit information. Different mapping functions are discussed in Chapter 5.

2.2.3 Categories based on evaluation methods

A recommender system is a tool to generate the best recommendations. Therefore, before making a recommendation to the user, the recommender system should determine which recommendation is the best one. The ideal recommender would be a psychic that knows exactly what the user preferences towards different items are. However, unlike cases such as solving a mathematical problem, where an answer key exists and can be used to check an algorithm, in a recommender system no one can know exactly how much a user will like a new item, including the users themselves. Different recommender systems try to achieve a certain goal that they believe will result in the best recommendation. This goal, though it will never be perfect, is referred to as the evaluation method. The two most prevalent evaluation methods are:

1. Scoring: Set aside a small part of the real data set as a test data set, then estimate the test set using the remaining data, and finally compare the difference between the estimated values and the real ones.

2. Precision and recall: Similar to the scoring method, but instead of comparing values, the top n items are returned for each user. The items are then used to calculate the precision and recall. Precision is the proportion of top recommendations that are good recommendations (existing highly rated items) and recall is the proportion of good recommendations that appear in the top recommendations.

The precision and recall method is not used in this thesis because it generally requires a relatively large number of ratings from the users, which are not available in this work. A more detailed description can be found in [2].

For the scoring evaluation method, after getting the estimated preference values from a collaborative filtering algorithm, the recommender system needs to determine the overall error metric it wants to minimize. The average difference and the root-mean-square of the differences are the two most common metrics. Table 2.4 gives a simple example to illustrate how these two error metrics work. The two metrics are similar in terms of the quality of recommendations. The average difference is adopted in this thesis. A smaller value indicates better performance of the recommender system.

Table 2.4: A simple example of average difference and root-mean-square difference [2]

                   Item 1   Item 2   Item 3
  actual value
  estimated value
  difference

  average difference = ( )/3 = 1.5
  root-mean-square = ( )/3 =
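The two error metrics named above can be sketched in a few lines. The three actual and estimated values below are hypothetical, chosen only so that the average difference comes out at 1.5 like the figure quoted for Table 2.4. They are not claimed to be the values of that table, and the function names are illustrative.

```python
import math


def average_difference(actual, estimated):
    """Mean of the absolute differences between actual and estimated values."""
    diffs = [abs(a - e) for a, e in zip(actual, estimated)]
    return sum(diffs) / len(diffs)


def root_mean_square(actual, estimated):
    """Square root of the mean of the squared differences."""
    squares = [(a - e) ** 2 for a, e in zip(actual, estimated)]
    return math.sqrt(sum(squares) / len(squares))


# Hypothetical ratings for three items (assumed values, for illustration only).
actual = [3.0, 5.0, 4.0]
estimated = [3.5, 2.0, 5.0]

print(average_difference(actual, estimated))
print(root_mean_square(actual, estimated))
```

Because the differences are squared before averaging, the root-mean-square penalizes a single large miss more heavily than the average difference does.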

2.3 Collaborative Filtering Algorithms

A collaborative recommender system tries to estimate the unknown ratings based on the items previously rated by other users. Therefore, unlike a content-based approach, a collaborative recommender system depends only on the ratings rather than on any domain-specific information about the items. According to Breese [24], collaborative filtering algorithms can be classified into two groups:

1. Heuristic-based: Make rating estimations based on the entire collection of items previously rated by the users.
2. Model-based: Use the collection of ratings to learn a model that will be used to make rating predictions.

Model-based collaborative filtering algorithms have been receiving more attention recently as interest in machine learning grows in academia. Many techniques, such as artificial neural networks, singular value decomposition and clustering, are leveraged to learn different models for recommender systems and have proven effective in many applications [1]. Despite its great potential, the model-based approach is not implemented in this thesis. Instead, the simplest-first approach is taken because the goal is to first implement a complete system that can then be studied and measured to determine where improvements are required.

The heuristic-based collaborative filtering (CF) algorithms are the focus of this thesis. They are widely used because of their simplicity and efficiency. They can be further divided into two categories:

1. User-based CF algorithm: Estimate the unknown ratings based on other similar users.
2. Item-based CF algorithm: Estimate the unknown ratings based on the similarities between the target item and other items that are co-rated by the same user.

2.3.1 User-based CF algorithm

The shopping mall case in this thesis is taken as an example to illustrate how the user-based CF algorithm works. Around Christmas, many people go to the shopping mall to prepare for a Christmas party. Most of them will go to the candy shop, the gift shop and the decoration shop during their visits. So if a customer has already been to the candy shop and the gift shop, it can be inferred that he or she is similar to those who are preparing for a Christmas party, and the decoration shop can be a good recommendation to the customer. The reasoning is intuitive because people tend to like things that similar customers like. The first problem is how to define the similarity metrics, which will be discussed in Section 2.3.4. Several terms used in this section need to be defined before the algorithm is introduced.

1. Similarity: How similar a user is to another one in terms of their ratings of the items that both rated.
2. Nearest neighbours: The users that are most similar to a certain user according to the similarity metric.
3. Neighbourhood size: The number of neighbours used in a user-based recommender system.

Algorithm 1 lists a generic user-based algorithm. The first for loop aims to get the n nearest neighbours of each user. Note that this step can be skipped so that all the other users are included in the second for loop, which is equivalent to setting n to infinity. However, setting an appropriate value for n accelerates the computation and typically results in better recommendations, because more similar users tend to provide more reliable predictions. The second for loop estimates the unknown ratings by calculating a weighted average of the ratings that the n nearest neighbours give to the items.
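The two steps of the generic user-based algorithm described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: the ratings are hypothetical, the names `euclidean_similarity` and `recommend_user_based` are made up for the example, and a Euclidean distance-based similarity is swapped in for simplicity (the thesis itself uses the Pearson correlation).

```python
import math


def euclidean_similarity(a, b):
    """1 / (1 + distance) over co-rated items; 0.0 when nothing is co-rated."""
    common = [item for item in a if item in b]
    if not common:
        return 0.0
    distance = math.sqrt(sum((a[i] - b[i]) ** 2 for i in common))
    return 1.0 / (1.0 + distance)


def recommend_user_based(ratings, user, n=2, top=3, similarity=euclidean_similarity):
    # Step 1: rank every other user by similarity and keep the n nearest.
    scored = [(similarity(ratings[user], ratings[w]), w) for w in ratings if w != user]
    neighbours = sorted(scored, key=lambda pair: pair[0], reverse=True)[:n]

    # Step 2: similarity-weighted average of the neighbours' preferences
    # for every item the target user has no preference for yet.
    totals, weights = {}, {}
    for s, v in neighbours:
        if s <= 0:
            continue  # a neighbour with no overlap contributes nothing
        for item, pref in ratings[v].items():
            if item in ratings[user]:
                continue
            totals[item] = totals.get(item, 0.0) + s * pref
            weights[item] = weights.get(item, 0.0) + s
    estimates = {item: totals[item] / weights[item] for item in totals}
    return sorted(estimates.items(), key=lambda pair: pair[1], reverse=True)[:top]


# Hypothetical ratings on a 1-5 scale (assumed values, for illustration only).
ratings = {
    "alice": {"candy": 5, "gift": 4},
    "bob": {"candy": 5, "gift": 4, "decoration": 5},
    "carol": {"candy": 1, "gift": 2, "decoration": 1, "bank": 4},
    "dave": {"bank": 5},
}
print(recommend_user_based(ratings, "alice", n=2))
```

In this toy data the decoration shop ranks first for "alice" because her closest neighbour rated it highly, which mirrors the Christmas party example.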

for every other user w do
    compute a similarity s between u and w;
end
retain the top users, ranked by similarity, as a neighbourhood n;
for every item i that some user in n has a preference for, but u has no preference for yet do
    for every other user v in n that has a preference for i do
        compute a similarity s between u and v;
        incorporate v's preference for i, weighted by s, into a running average;
    end
end

Algorithm 1: User-based collaborative filtering algorithm [2]

2.3.2 Item-based CF algorithm

The same example as in Section 2.3.1 is taken to illustrate how the item-based CF algorithm works. Around Christmas, many people go to the candy shop, the gift shop and the decoration shop. So if a customer has already been to the candy shop and the gift shop, it can be concluded that, since the candy shop, the gift shop and the decoration shop tend to be visited together, the decoration shop is a good candidate to recommend to the customer. Though the recommendation result of this item-based approach is the same as that of the user-based one, the basis for the conclusion is different. The recommendation is based on the similarity of shops regardless of the users. Therefore, the term nearest neighbours does not exist in the item-based CF algorithms. The similarity, however, still needs to be calculated, though from a different perspective. Algorithm 2 lists a generic item-based algorithm. The outer for loop iterates over all the items that a user has not rated yet. The inner for loop calculates the weighted average

of the ratings of items that the user has already rated. Note that the similarity s here refers to the similarity between items instead of users.

for every item i that u has no preference for yet do
    for every item j that u has a preference for do
        compute a similarity s between i and j;
        add u's preference for j, weighted by s, to a running average;
    end
end
return the top items, ranked by weighted average;

Algorithm 2: Item-based collaborative filtering algorithm [2]

There are several advantages of the item-based approach compared to the user-based one:

1. Scalability: The run time of an item-based recommender system scales up as the number of items increases; thus, if the number of items is relatively low compared to the number of users, an item-based recommender system is preferable.
2. Less subject to change: Over time the similarities between items tend to converge, while user tastes can vary vastly. Therefore, an item-based recommender system typically can start making reasonable recommendations after a user's first rating, while user-based ones need enough ratings to find nearest neighbours.
3. Able to be preprocessed: Since the item-item similarities are more stable, it is reasonable to precompute them to reduce the execution time.

2.3.3 Slope-One algorithm

The Slope-One item-based collaborative algorithm was proposed by Lemire in 2005 to reduce over-fitting, improve performance and ease implementation [25]. It has quickly

become popular. The assumption of the Slope-One filtering algorithm is that a certain linear correlation exists between the preference values for one item and another, and that it can be used to estimate the preferences for some item Y based on the preferences for item X, via a linear function like Y = mX + b.

for every item i do
    for every other item j do
        for every user u expressing preference for both i and j do
            add the difference in u's preference for i and j to an average;
        end
    end
end
for every item i the user u expresses no preference for do
    for every item j that user u expresses a preference for do
        find the average preference difference between j and i;
        add this difference to u's preference value for j;
        add this to a running average;
    end
end
return the top items, ranked by these averages;

Algorithm 3: Slope-One algorithm [2]

The Slope-One algorithm is shown in Algorithm 3. The first for loop of the algorithm calculates item-item differences in preference values based on all the user ratings. The second part of the algorithm estimates each unknown rating that a user gives to a certain item, by averaging each estimated rating based on another item that the user rated and the differences between these two items. Finally the top items are returned for recommendations.
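The two phases of the Slope-One algorithm described above can be sketched as follows. This is an illustrative sketch with hypothetical ratings and made-up function names (`slope_one_differences`, `recommend_slope_one`). It uses the plain, unweighted running average described in the text. Weighting each difference by the number of users behind it is a common variant that is not shown here.

```python
from collections import defaultdict


def slope_one_differences(ratings):
    """diff[i][j] = average of (preference for i - preference for j)
    over all users who expressed a preference for both items."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for prefs in ratings.values():
        for i, pref_i in prefs.items():
            for j, pref_j in prefs.items():
                if i != j:
                    sums[i][j] += pref_i - pref_j
                    counts[i][j] += 1
    return {i: {j: sums[i][j] / counts[i][j] for j in sums[i]} for i in sums}


def recommend_slope_one(ratings, user, top=3):
    diffs = slope_one_differences(ratings)
    prefs = ratings[user]
    estimates = {}
    for i in diffs:
        if i in prefs:
            continue  # only estimate items the user has no preference for
        # One estimate per item j the user rated: preference for j + diff(i, j).
        guesses = [prefs[j] + diffs[i][j] for j in prefs if j in diffs[i]]
        if guesses:
            estimates[i] = sum(guesses) / len(guesses)
    return sorted(estimates.items(), key=lambda pair: pair[1], reverse=True)[:top]


# Hypothetical ratings (assumed values, for illustration only).
ratings = {
    "u1": {"A": 5, "B": 3, "C": 2},
    "u2": {"A": 3, "B": 4},
    "u3": {"B": 2, "C": 5},
}
print(recommend_slope_one(ratings, "u3"))
```

The nested dictionary of differences holds one entry per ordered item pair, so its size grows with the square of the number of items, while updating it after a new preference only touches the pairs involving that item.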

The Slope-One algorithm is attractive since the online portion is fast and scalable. Moreover, it is easily updated when a preference changes, because only the relevant difference values need to be updated. However, it suffers from large memory requirements since $O(n^2)$ space is required to store the item-item differences, where $n$ refers to the number of items.

2.3.4 Similarity metrics

Both user-based and item-based collaborative filtering algorithms require the calculation of similarity. Note that the utility matrix in Table 2.2 can be described as a set of user vectors, each of which corresponds to a row in the matrix, as well as a set of item vectors, each of which corresponds to a column in the matrix. Therefore, both the item-based and user-based approaches can share the same general vector similarity metrics. Various similarity metrics can be found in different implementations to calculate the similarity between two vectors, $\mathrm{sim}(x, y)$. The common metrics include the Pearson correlation-based similarity, Euclidean distance-based similarity, cosine measure similarity and log-likelihood test [2]. Alternatively, a custom metric can be implemented to leverage domain-specific information. For example, if the similarity metric is determined by the attributes of items instead of ratings, then a content-based recommender system can be built using the collaborative algorithm.

The similarity used in this thesis is the Pearson correlation-based similarity due to its popularity and efficiency. Equation (2.1) shows the naive equation of the Pearson correlation-based similarity for the user-based algorithms. The equation works for the item-based ones with a slight modification of the notation. Let $S_{xy}$ be the set of all items rated by both users $x$ and $y$, $r_{x,s}$ be the rating user $x$ gives to item $s$, and $\bar{r}_x$ be the

27 Chapter 2. Recommender System Overview 19 average rating of all the ratings from user x. Then sim(x, y) can be calculated as: (r x,s r x )(r y,s r y ) s S xy sim(x, y) = (r x,s r x ) (2.1) 2 (r y,s r y ) 2 s S xy s S xy The Pearson correlation is a number between -1 and 1 that measures the tendency of two vectors. A larger value implies a higher positive linear correlation between the two vectors. Intuitively, the more items that are co-rated by both users, the more reliable one user s rating can be used to predict the other s. However, the naive Pearson correlation fails to consider the number of items over which it is computed. Therefore, a weight that can reflect the number of common rated items is sometimes added to the equation of the naive Pearson correlation.
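As a rough illustration of Equation (2.1), the following Python sketch computes the naive Pearson similarity between two users. The function name, the dictionary-based rating format and the choice to return 0.0 when the similarity is undefined are assumptions made for this example rather than details taken from the thesis. Following the definition above, each user's mean is taken over all of that user's ratings, while the sums run only over the co-rated items.

```python
from math import sqrt


def pearson_sim(ratings_x, ratings_y):
    """Naive Pearson similarity between two users.

    ratings_x and ratings_y map item -> rating. Returns 0.0 when there
    are no co-rated items or the denominator is zero (an assumption of
    this sketch, not something the text specifies).
    """
    common = ratings_x.keys() & ratings_y.keys()  # the set S_xy
    if not common:
        return 0.0
    mean_x = sum(ratings_x.values()) / len(ratings_x)
    mean_y = sum(ratings_y.values()) / len(ratings_y)
    num = sum((ratings_x[s] - mean_x) * (ratings_y[s] - mean_y) for s in common)
    den = sqrt(
        sum((ratings_x[s] - mean_x) ** 2 for s in common)
        * sum((ratings_y[s] - mean_y) ** 2 for s in common)
    )
    return num / den if den else 0.0


# Hypothetical users: y rates in proportion to x, z rates in reverse order.
x = {"A": 1.0, "B": 2.0, "C": 3.0}
y = {"A": 2.0, "B": 4.0, "C": 6.0}
z = {"A": 3.0, "B": 2.0, "C": 1.0}
print(pearson_sim(x, y), pearson_sim(x, z))
```

A weighted variant would multiply this value by a factor that grows with the number of co-rated items, but the exact form of such a weight is not given in the text and is therefore left out of the sketch.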

Chapter 3 Related Work

To the best of our knowledge, there has been no prior work on an indoor location-based collaborative recommender system. A significant reason is the lack of a standard indoor positioning technology, which will be discussed in Chapter 4. However, there have been a handful of efforts to enhance the user experience in a physical indoor shopping mall with recommendations.

The consumer-friendly shopping assistance system built by Sae-Ueng [26] is the attempt most similar to this work. The system automatically logs personal behaviour to a file using RFID and camera sensors in a ubiquitous environment. By implicitly inferring customer preferences from behaviours such as touching and purchasing a product, the system can make customized recommendations to the customers. However, using RFID to identify customers requires every customer to wear a customized RFID tag, which is generally not practical in a shopping mall. Using cameras to detect customer behaviour not only increases the system cost, but is also unlikely to guarantee a correct detection rate in reality, where people may be close to each other. The data collection method in this thesis is based on standard WiFi-enabled devices carried by the customers, which generally does not require any specific hardware from the shopping mall.

Though not a location-based recommender system, the personalized shopping recommender built by Hsu [27] tries to predict user preferences based on their transactional histories. Instead of building a collaborative filtering method based on ratings (e.g., GroupLens [17]) to perform personalized shopping recommendations, they derive a model-based recommender system from a customized probabilistic graphical model called Hybrid Poisson Aspect Modelling (HyPAM) to address data skewness and sparsity. HyPAM applies a cluster model to cluster customers and an aspect model to model the relationships between customer clusters and products. Experimental results show that HyPAM outperforms representative collaborative filtering and data mining methods by a large margin. In this thesis, only collaborative filtering algorithms are implemented to test the performance of different recommender systems.

Asthana [28] designs a shopping assistant service that personalizes the attention provided to a customer based on individual needs. The system consists of a wireless communication device called a Personal Shopping Assistant, and a centralized server storing customer profiles. The server provides personalized service by pushing retail information to the user devices. However, the user cannot update the database, which prevents the server from making recommendations based on users' purchase histories. Also, the devices are provided by the retailer rather than integrated into a standard hand-held device.

Reischach and Michahelles [29] present a concept called Apriori that enables consumers to access and share product recommendations using their mobile phones. This work encourages users to submit ratings by providing a better user interface on their mobile phones. However, the system only enables users to submit their ratings and receive others' ratings, and does not make customized recommendations to users.
In this thesis, the customers are not required to explicitly rate any items and the server can make recommendations based on their behaviour.

Anacleto [30] describes a customized one-to-one recommendation system inside a virtual shopping center, considering server, product, and facility at the same time. The system extracts a purchase pattern for each customer from the shopping path history. It then provides customized recommendations of certain brands based on the customer's current location in the virtual shopping mall, with the aim of improving the sales and profit of a retail company. In this thesis, shops rather than specific items are recommended to customers.

An extensive study on indoor wireless network traces is presented in Hsu [31]. The data is collected from the networks at the University of Southern California and the University of Florida over a period of several months. Instead of an exact position, the location is roughly represented by the APs that the mobile devices are associated with. Though no recommender system is developed in that work, an efficient way to summarize the mobility preferences of mobile users is constructed based on singular value decomposition (SVD). The mobility summaries are then used to calculate the distance between users and to identify user groups in the population based on their mutual similarities. The location data used in this thesis has a much finer granularity, with a precision of 1-2 meters instead of the tens of meters typical of AP-based localization. Instead of focusing on grouping users, we move a step forward by building a recommender system with the traces.

The B-MAD system (Bluetooth Mobile Advertising) [32] delivers location-aware mobile advertisements to mobile phones using Bluetooth positioning and Wireless Application Protocol (WAP) push, with an accuracy of 50 to 100 meters. Each device is identified by its phone number (MSISDN). The Ad Server delivers any advertisements associated with the location that have not yet been delivered to the end user. The basic system architecture of the B-MAD system is similar to the one in this work.
However, the recommendation algorithm running on the Ad Server is much simpler, since the advertisements associated with each location are pre-defined. Therefore, little computation, if any, is required to make recommendations to users.

Another related research area is outdoor location-based recommender systems, which have been receiving great attention recently. While the characteristics of outdoor location data are quite different from those of indoor data (discussed in Chapter 4), the techniques used in both cases are somewhat similar. However, even for outdoor LBSs, few pure location-based collaborative recommender systems, if any, are found. The main reason is that the pattern of outdoor activities tends to be more fixed. For example, people are at their workplace during the daytime and go back home at night. With WiFi indoor localization, the physical mall becomes more like an online shopping system, as people move around quickly and arbitrarily, which will be discussed in Section 4.1.

Cyberguide, a mobile context-aware tour guide [33], is designed to be a location-based tour guide for indoor and outdoor users on a number of different hand-held platforms. The indoor positioning system uses TV remote control units as active beacons together with a special infrared receiver, while standard GPS is used in the outdoor environment. The goal of this system is to predict tourist destinations based on the current location and a history of past locations. Cyberguide is designed mainly for navigation and easier communication between mobile devices. No recommender system is implemented in [33].

Li and Zheng [34] propose a framework called hierarchical graph-based similarity measurement (HGSM) to consistently model each individual's location history and effectively measure the similarity among users. Both the sequence property of people's movement behaviours and the hierarchy property of geographic spaces are taken into account. The framework is evaluated using GPS data collected by 65 volunteers over a period of six months in the real world. Though no recommender system is built in [34], the similarity metric developed there is a good candidate for a location-based recommender system and could be integrated into the work in this thesis.
Ahn [35] presents a novel advertisement recommendation model for mobile users called Mobile Advertisement Recommender using Collaborative Filtering (MAR-CF). The model is a multi-dimensional personalization model based on the traditional CF algorithm, taking into account user location, interest and time. The utility matrix is modified to incorporate additional contexts for the items, and a multi-dimensional similarity metric is developed to replace the conventional Pearson correlation metric. Although the recommender system developed in this thesis is also a collaborative filtering recommender system, it is a pure location-based one, which means the location itself is the item. Therefore, only the time and location information is used in this thesis, since the transaction history is almost impossible to acquire in a physical shopping mall.
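To make the idea of "the location itself is the item" concrete, the following Python sketch builds a user-by-shop utility matrix from time and location information alone. It is a minimal illustration under stated assumptions: the record layout (device identifier, shop identifier, seconds spent), the identifiers and the function name are hypothetical, and the raw dwell time is simply accumulated without the mapping from time to rating values that the thesis applies.

```python
from collections import defaultdict

# Hypothetical visit records: (device id, shop id, seconds spent in the shop).
visits = [
    ("dev1", "shopA", 300),
    ("dev1", "shopB", 60),
    ("dev1", "shopA", 120),
    ("dev2", "shopB", 600),
]


def build_utility_matrix(records):
    """Accumulate the total time each user spends in each shop.

    The result maps user -> {shop: total seconds}. Shops a user never
    visited are simply absent, which keeps the matrix sparse.
    """
    matrix = defaultdict(lambda: defaultdict(float))
    for user, shop, seconds in records:
        matrix[user][shop] += seconds
    return {user: dict(shops) for user, shops in matrix.items()}


utility = build_utility_matrix(visits)
print(utility)
```

Each row of this matrix plays the role of a user vector and each shop the role of an item, so the collaborative filtering algorithms of Chapter 2 can be applied without any explicit ratings or transaction records.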

Chapter 4 Collecting Data

A recommender system is a "garbage in, garbage out" filtering algorithm: its performance depends significantly on the quality of the input data. This chapter presents the advantages of WiFi-based positioning over the other indoor positioning methods and the positioning architecture using the CISCO Mobility Service Engine (MSE) [36] in our experiment. An analysis of the Bow Valley shopping mall data set used in this thesis is also presented.

4.1 Indoor Localization with WiFi

Since their first appearance at the beginning of the 1990s, LBSs have been explored in conjunction with research on ubiquitous computing. While traditional LBSs were designed to support outdoor applications, there has been a growing demand for indoor LBSs for asset management and better shopping experiences. While GPS has become a standard and efficient way of outdoor positioning, there are no known large-scale indoor positioning systems, due to the absence of a standard way of indoor positioning. Conventional GPS receivers do not work inside buildings because the buildings absorb the signal, while cellular positioning methods generally fail to deliver a satisfactory degree of accuracy [6]. An increasing amount of effort is being spent on extracting value from GPS tracks, for example by mining interesting places. However, compared to outdoor tracks, analysing indoor location tracks can provide more direct benefits to service providers, such as the shopping mall in this thesis, for the following reasons:

1. More total time spent: People spend more than 90% of their time indoors. Therefore, the collection period for a meaningful data set of GPS tracks is generally above half a year for each person, while the lifetime of a user track in this application is generally less than one hour.

2. Higher mobile frequency: People tend to change their locations more frequently when they are indoors compared to outdoors. After a move outdoors, people are likely to stay for a long time within one area, inside which GPS cannot distinguish their movements. In contrast, people tend to move frequently when they are in a museum or a shopping mall.

3. Finer granularity: The physical distance between locations outdoors is generally greater than indoors, which imposes additional constraints, such as travel time, on a location-based recommender system. For example, people are not likely to travel two hours for lunch even though the recommended restaurant may be a perfect match. On the other hand, the locations recommended by an indoor location-based recommender system are generally more reachable.

Despite the lack of a standard positioning method, a variety of techniques have been explored by researchers on a limited scale. Among them, the most popular methods include RFID, Bluetooth, Ultrasound, WiFi and Infrared. WiFi-based positioning systems are now attracting the most interest due to their ability to leverage the existing network architecture and the popularity of WiFi-enabled devices. Moreover, CISCO has embedded its positioning hardware engine, the MSE, into the CISCO network infrastructure, increasing the potential of WiFi as a standard way of indoor positioning.

[Figure 4.1: Typical wireless controller to AP deployment for location [3]]

4.2 Data Collection Architecture

The CISCO MSE is a platform that runs one or more mobility services, including the Context-Aware Mobility Service, which can capture information on network equipment, network sensors, environmental sensors, mobile network devices, and mobile assets. It can easily integrate device information, such as location, with other systems to improve their business functionality. With appropriate configuration, the MSE can provide an accuracy within one meter.

Figure 4.1 shows the infrastructure when the MSE is used in a positioning system. The laptops or other WiFi-enabled devices being located periodically emit beacons in several directions at each reference position. The access points (APs) in the surrounding area receive these beacons and record the associated Received Signal Strength (RSS). These data are then transferred to the MSE through the network controllers. The Fingerprinting positioning algorithm [6] is then run on the MSE to calculate the locations of the devices on the map, which are stored in the database inside the MSE.

[Figure 4.2: MSE High Level Architecture [3]]

The advantage of such a network-based mode is that, unlike a GPS analysis tool, no software needs to be installed on the users' devices, which increases the accessibility of the positioning system. In addition, since the MSE is integrated into the basic network infrastructure, the total cost of adding this function in a network-enabled place is much lower than that of other indoor positioning systems such as RFID. Figure 4.2 describes the high-level architecture of the MSE. After storing the locations


More information

Towards a hybrid approach to Netflix Challenge

Towards a hybrid approach to Netflix Challenge Towards a hybrid approach to Netflix Challenge Abhishek Gupta, Abhijeet Mohapatra, Tejaswi Tenneti March 12, 2009 1 Introduction Today Recommendation systems [3] have become indispensible because of the

More information

A probabilistic model to resolve diversity-accuracy challenge of recommendation systems

A probabilistic model to resolve diversity-accuracy challenge of recommendation systems A probabilistic model to resolve diversity-accuracy challenge of recommendation systems AMIN JAVARI MAHDI JALILI 1 Received: 17 Mar 2013 / Revised: 19 May 2014 / Accepted: 30 Jun 2014 Recommendation systems

More information

DS595/CS525: Urban Network Analysis --Urban Mobility Prof. Yanhua Li

DS595/CS525: Urban Network Analysis --Urban Mobility Prof. Yanhua Li Welcome to DS595/CS525: Urban Network Analysis --Urban Mobility Prof. Yanhua Li Time: 6:00pm 8:50pm Wednesday Location: Fuller 320 Spring 2017 2 Team assignment Finalized. (Great!) Guest Speaker 2/22 A

More information

Combining Review Text Content and Reviewer-Item Rating Matrix to Predict Review Rating

Combining Review Text Content and Reviewer-Item Rating Matrix to Predict Review Rating Combining Review Text Content and Reviewer-Item Rating Matrix to Predict Review Rating Dipak J Kakade, Nilesh P Sable Department of Computer Engineering, JSPM S Imperial College of Engg. And Research,

More information

ITS (Intelligent Transportation Systems) Solutions

ITS (Intelligent Transportation Systems) Solutions Special Issue Advanced Technologies and Solutions toward Ubiquitous Network Society ITS (Intelligent Transportation Systems) Solutions By Makoto MAEKAWA* Worldwide ITS goals for safety and environment

More information

In the recent past, the World Wide Web has been witnessing an. explosive growth. All the leading web search engines, namely, Google,

In the recent past, the World Wide Web has been witnessing an. explosive growth. All the leading web search engines, namely, Google, 1 1.1 Introduction In the recent past, the World Wide Web has been witnessing an explosive growth. All the leading web search engines, namely, Google, Yahoo, Askjeeves, etc. are vying with each other to

More information

COLLABORATIVE LOCATION AND ACTIVITY RECOMMENDATIONS WITH GPS HISTORY DATA

COLLABORATIVE LOCATION AND ACTIVITY RECOMMENDATIONS WITH GPS HISTORY DATA COLLABORATIVE LOCATION AND ACTIVITY RECOMMENDATIONS WITH GPS HISTORY DATA Vincent W. Zheng, Yu Zheng, Xing Xie, Qiang Yang Hong Kong University of Science and Technology Microsoft Research Asia WWW 2010

More information

Impact of Term Weighting Schemes on Document Clustering A Review

Impact of Term Weighting Schemes on Document Clustering A Review Volume 118 No. 23 2018, 467-475 ISSN: 1314-3395 (on-line version) url: http://acadpubl.eu/hub ijpam.eu Impact of Term Weighting Schemes on Document Clustering A Review G. Hannah Grace and Kalyani Desikan

More information

Keyword Extraction by KNN considering Similarity among Features

Keyword Extraction by KNN considering Similarity among Features 64 Int'l Conf. on Advances in Big Data Analytics ABDA'15 Keyword Extraction by KNN considering Similarity among Features Taeho Jo Department of Computer and Information Engineering, Inha University, Incheon,

More information

CS 124/LINGUIST 180 From Languages to Information

CS 124/LINGUIST 180 From Languages to Information CS /LINGUIST 80 From Languages to Information Dan Jurafsky Stanford University Recommender Systems & Collaborative Filtering Slides adapted from Jure Leskovec Recommender Systems Customer X Buys Metallica

More information

Next Stop Recommender

Next Stop Recommender Next Stop Recommender Ben Ripley, Dirksen Liu, Maiga Chang, and Kinshuk School of Computing and Information Systems Athabasca University Athabasca, Canada maiga@ms2.hinet.net, kinshuk@athabascau.ca Abstract

More information

A Constrained Spreading Activation Approach to Collaborative Filtering

A Constrained Spreading Activation Approach to Collaborative Filtering A Constrained Spreading Activation Approach to Collaborative Filtering Josephine Griffith 1, Colm O Riordan 1, and Humphrey Sorensen 2 1 Dept. of Information Technology, National University of Ireland,

More information

Recommendation System for Location-based Social Network CS224W Project Report

Recommendation System for Location-based Social Network CS224W Project Report Recommendation System for Location-based Social Network CS224W Project Report Group 42, Yiying Cheng, Yangru Fang, Yongqing Yuan 1 Introduction With the rapid development of mobile devices and wireless

More information

Data Mining with Oracle 10g using Clustering and Classification Algorithms Nhamo Mdzingwa September 25, 2005

Data Mining with Oracle 10g using Clustering and Classification Algorithms Nhamo Mdzingwa September 25, 2005 Data Mining with Oracle 10g using Clustering and Classification Algorithms Nhamo Mdzingwa September 25, 2005 Abstract Deciding on which algorithm to use, in terms of which is the most effective and accurate

More information

Recommender Systems - Content, Collaborative, Hybrid

Recommender Systems - Content, Collaborative, Hybrid BOBBY B. LYLE SCHOOL OF ENGINEERING Department of Engineering Management, Information and Systems EMIS 8331 Advanced Data Mining Recommender Systems - Content, Collaborative, Hybrid Scott F Eisenhart 1

More information

ihits: Extending HITS for Personal Interests Profiling

ihits: Extending HITS for Personal Interests Profiling ihits: Extending HITS for Personal Interests Profiling Ziming Zhuang School of Information Sciences and Technology The Pennsylvania State University zzhuang@ist.psu.edu Abstract Ever since the boom of

More information

Incorporating Contextual Information in Recommender Systems Using a Multidimensional Approach

Incorporating Contextual Information in Recommender Systems Using a Multidimensional Approach Incorporating Contextual Information in Recommender Systems Using a Multidimensional Approach Gediminas Adomavicius Department of Information & Decision Sciences Carlson School of Management University

More information

A Constrained Spreading Activation Approach to Collaborative Filtering

A Constrained Spreading Activation Approach to Collaborative Filtering A Constrained Spreading Activation Approach to Collaborative Filtering Josephine Griffith 1, Colm O Riordan 1, and Humphrey Sorensen 2 1 Dept. of Information Technology, National University of Ireland,

More information

Mobile based Text Image Translation System for Smart Tourism. Saw Zay Maung Maung UCSY, Myanmar. 23 November 2017, Brunei

Mobile based Text Image Translation System for Smart Tourism. Saw Zay Maung Maung UCSY, Myanmar. 23 November 2017, Brunei Mobile based Text Image Translation System for Smart Tourism Saw Zay Maung Maung UCSY, Myanmar. 23 November 2017, Brunei 1 Smart Tourism Tourism is cultural and economic phenomenon which entails the movement

More information

CS 229 Final Project - Using machine learning to enhance a collaborative filtering recommendation system for Yelp

CS 229 Final Project - Using machine learning to enhance a collaborative filtering recommendation system for Yelp CS 229 Final Project - Using machine learning to enhance a collaborative filtering recommendation system for Yelp Chris Guthrie Abstract In this paper I present my investigation of machine learning as

More information

SOCIAL MEDIA MINING. Data Mining Essentials

SOCIAL MEDIA MINING. Data Mining Essentials SOCIAL MEDIA MINING Data Mining Essentials Dear instructors/users of these slides: Please feel free to include these slides in your own material, or modify them as you see fit. If you decide to incorporate

More information

Michele Gorgoglione Politecnico di Bari Viale Japigia, Bari (Italy)

Michele Gorgoglione Politecnico di Bari Viale Japigia, Bari (Italy) Does the recommendation task affect a CARS performance? Umberto Panniello Politecnico di Bari Viale Japigia, 82 726 Bari (Italy) +3985962765 m.gorgoglione@poliba.it Michele Gorgoglione Politecnico di Bari

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS6: Mining Massive Datasets Jure Leskovec, Stanford University http://cs6.stanford.edu //8 Jure Leskovec, Stanford CS6: Mining Massive Datasets High dim. data Graph data Infinite data Machine learning

More information

International Journal of Scientific Research & Engineering Trends Volume 4, Issue 6, Nov-Dec-2018, ISSN (Online): X

International Journal of Scientific Research & Engineering Trends Volume 4, Issue 6, Nov-Dec-2018, ISSN (Online): X Analysis about Classification Techniques on Categorical Data in Data Mining Assistant Professor P. Meena Department of Computer Science Adhiyaman Arts and Science College for Women Uthangarai, Krishnagiri,

More information

SERVICE RECOMMENDATION ON WIKI-WS PLATFORM

SERVICE RECOMMENDATION ON WIKI-WS PLATFORM TASKQUARTERLYvol.19,No4,2015,pp.445 453 SERVICE RECOMMENDATION ON WIKI-WS PLATFORM ANDRZEJ SOBECKI Academic Computer Centre, Gdansk University of Technology Narutowicza 11/12, 80-233 Gdansk, Poland (received:

More information

Whitepaper US SEO Ranking Factors 2012

Whitepaper US SEO Ranking Factors 2012 Whitepaper US SEO Ranking Factors 2012 Authors: Marcus Tober, Sebastian Weber Searchmetrics Inc. 1115 Broadway 12th Floor, Room 1213 New York, NY 10010 Phone: 1 866-411-9494 E-Mail: sales-us@searchmetrics.com

More information

Flight Recommendation System based on user feedback, weighting technique and context aware recommendation system

Flight Recommendation System based on user feedback, weighting technique and context aware recommendation system www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 5 Issue 09 September 2016 Page No.17973-17978 Flight Recommendation System based on user feedback, weighting

More information

Performance Comparison of Algorithms for Movie Rating Estimation

Performance Comparison of Algorithms for Movie Rating Estimation Performance Comparison of Algorithms for Movie Rating Estimation Alper Köse, Can Kanbak, Noyan Evirgen Research Laboratory of Electronics, Massachusetts Institute of Technology Department of Electrical

More information

Hotel Recommendation Based on Hybrid Model

Hotel Recommendation Based on Hybrid Model Hotel Recommendation Based on Hybrid Model Jing WANG, Jiajun SUN, Zhendong LIN Abstract: This project develops a hybrid model that combines content-based with collaborative filtering (CF) for hotel recommendation.

More information

CS 124/LINGUIST 180 From Languages to Information

CS 124/LINGUIST 180 From Languages to Information CS /LINGUIST 80 From Languages to Information Dan Jurafsky Stanford University Recommender Systems & Collaborative Filtering Slides adapted from Jure Leskovec Recommender Systems Customer X Buys CD of

More information

Context Aware Computing

Context Aware Computing CPET 565/CPET 499 Mobile Computing Systems Context Aware Computing Lecture 7 Paul I-Hai Lin, Professor Electrical and Computer Engineering Technology Purdue University Fort Wayne Campus 1 Context-Aware

More information

Whitepaper Spain SEO Ranking Factors 2012

Whitepaper Spain SEO Ranking Factors 2012 Whitepaper Spain SEO Ranking Factors 2012 Authors: Marcus Tober, Sebastian Weber Searchmetrics GmbH Greifswalder Straße 212 10405 Berlin Phone: +49-30-3229535-0 Fax: +49-30-3229535-99 E-Mail: info@searchmetrics.com

More information

ELEC6910Q Analytics and Systems for Social Media and Big Data Applications Lecture 4. Prof. James She

ELEC6910Q Analytics and Systems for Social Media and Big Data Applications Lecture 4. Prof. James She ELEC6910Q Analytics and Systems for Social Media and Big Data Applications Lecture 4 Prof. James She james.she@ust.hk 1 Selected Works of Activity 4 2 Selected Works of Activity 4 3 Last lecture 4 Mid-term

More information

Improving the Efficiency of Fast Using Semantic Similarity Algorithm

Improving the Efficiency of Fast Using Semantic Similarity Algorithm International Journal of Scientific and Research Publications, Volume 4, Issue 1, January 2014 1 Improving the Efficiency of Fast Using Semantic Similarity Algorithm D.KARTHIKA 1, S. DIVAKAR 2 Final year

More information

Predicting User Ratings Using Status Models on Amazon.com

Predicting User Ratings Using Status Models on Amazon.com Predicting User Ratings Using Status Models on Amazon.com Borui Wang Stanford University borui@stanford.edu Guan (Bell) Wang Stanford University guanw@stanford.edu Group 19 Zhemin Li Stanford University

More information

Recommender Systems New Approaches with Netflix Dataset

Recommender Systems New Approaches with Netflix Dataset Recommender Systems New Approaches with Netflix Dataset Robert Bell Yehuda Koren AT&T Labs ICDM 2007 Presented by Matt Rodriguez Outline Overview of Recommender System Approaches which are Content based

More information

BBS654 Data Mining. Pinar Duygulu

BBS654 Data Mining. Pinar Duygulu BBS6 Data Mining Pinar Duygulu Slides are adapted from J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org Mustafa Ozdal Example: Recommender Systems Customer X Buys Metallica

More information

Distributed Itembased Collaborative Filtering with Apache Mahout. Sebastian Schelter twitter.com/sscdotopen. 7.

Distributed Itembased Collaborative Filtering with Apache Mahout. Sebastian Schelter twitter.com/sscdotopen. 7. Distributed Itembased Collaborative Filtering with Apache Mahout Sebastian Schelter ssc@apache.org twitter.com/sscdotopen 7. October 2010 Overview 1. What is Apache Mahout? 2. Introduction to Collaborative

More information

Advances in Natural and Applied Sciences. Information Retrieval Using Collaborative Filtering and Item Based Recommendation

Advances in Natural and Applied Sciences. Information Retrieval Using Collaborative Filtering and Item Based Recommendation AENSI Journals Advances in Natural and Applied Sciences ISSN:1995-0772 EISSN: 1998-1090 Journal home page: www.aensiweb.com/anas Information Retrieval Using Collaborative Filtering and Item Based Recommendation

More information

code pattern analysis of object-oriented programming languages

code pattern analysis of object-oriented programming languages code pattern analysis of object-oriented programming languages by Xubo Miao A thesis submitted to the School of Computing in conformity with the requirements for the degree of Master of Science Queen s

More information

CS 124/LINGUIST 180 From Languages to Information

CS 124/LINGUIST 180 From Languages to Information CS /LINGUIST 80 From Languages to Information Dan Jurafsky Stanford University Recommender Systems & Collaborative Filtering Slides adapted from Jure Leskovec Recommender Systems Customer X Buys CD of

More information

Mobile and Ubiquitous Computing: Mobile Sensing

Mobile and Ubiquitous Computing: Mobile Sensing Mobile and Ubiquitous Computing: Mobile Sensing Master studies, Winter 2015/2016 Dr Veljko Pejović Veljko.Pejovic@fri.uni-lj.si Based on: Mobile and Ubiquitous Computing Mirco Musolesi, University of Birmingham,

More information

System For Product Recommendation In E-Commerce Applications

System For Product Recommendation In E-Commerce Applications International Journal of Engineering Research and Development e-issn: 2278-067X, p-issn: 2278-800X, www.ijerd.com Volume 11, Issue 05 (May 2015), PP.52-56 System For Product Recommendation In E-Commerce

More information

International Journal of Data Mining & Knowledge Management Process (IJDKP) Vol.7, No.3, May Dr.Zakea Il-Agure and Mr.Hicham Noureddine Itani

International Journal of Data Mining & Knowledge Management Process (IJDKP) Vol.7, No.3, May Dr.Zakea Il-Agure and Mr.Hicham Noureddine Itani LINK MINING PROCESS Dr.Zakea Il-Agure and Mr.Hicham Noureddine Itani Higher Colleges of Technology, United Arab Emirates ABSTRACT Many data mining and knowledge discovery methodologies and process models

More information

A Data Classification Algorithm of Internet of Things Based on Neural Network

A Data Classification Algorithm of Internet of Things Based on Neural Network A Data Classification Algorithm of Internet of Things Based on Neural Network https://doi.org/10.3991/ijoe.v13i09.7587 Zhenjun Li Hunan Radio and TV University, Hunan, China 278060389@qq.com Abstract To

More information

CSE 258. Web Mining and Recommender Systems. Advanced Recommender Systems

CSE 258. Web Mining and Recommender Systems. Advanced Recommender Systems CSE 258 Web Mining and Recommender Systems Advanced Recommender Systems This week Methodological papers Bayesian Personalized Ranking Factorizing Personalized Markov Chains Personalized Ranking Metric

More information

Garmin Forerunner 620 Review

Garmin Forerunner 620 Review Garmin Forerunner 620 Review The Garmin Forerunner 620 was recently released in December and is Garmin's newest and most advanced running GPS watch. This review will explore the new features in the 620

More information

Fall 2017 ECEN Special Topics in Data Mining and Analysis

Fall 2017 ECEN Special Topics in Data Mining and Analysis Fall 2017 ECEN 689-600 Special Topics in Data Mining and Analysis Nick Duffield Department of Electrical & Computer Engineering Teas A&M University Organization Organization Instructor: Nick Duffield,

More information

Implementing a Content-Based Recommender System For News Readers

Implementing a Content-Based Recommender System For News Readers Implementing a Content-Based Recommender System For News Readers by Mahta Moattari Bachelor of Information Technology and Computer Science, Amirkabir University of Tech., 2010 A REPORT SUBMITTED IN PARTIAL

More information

Implicit Personalization of Public Environments using Bluetooth

Implicit Personalization of Public Environments using Bluetooth Implicit Personalization of Public Environments using Bluetooth Hema Mahato RWTH Aachen 52062 Aachen hema.mahato@rwth-aachen.de Dagmar Kern Pervasive Computing University of Duisburg-Essen 45117 Essen,

More information