CHAPTER 7 INTEGRATION OF CLUSTERING AND ASSOCIATION RULE MINING TO MINE CUSTOMER DATA AND PREDICT SALES


7.1 INTRODUCTION

Product bundling and offering products to customers is a major challenge in retail marketing. A predictive mining approach forecasts sales for a new location based on existing data. The central issue lies in analysing the sales forecast in terms of the dependencies among products and the customer segmentation, which together help to improve the market position of retail stores. A new methodology is proposed that also identifies customer behaviour. The methodology is based on the integration of data mining approaches, namely clustering and association rule mining. It focuses on the discovery of rules and concentrates on marketing the products based on the population. Association rules generated for one location at a point of sale may not be effective at another location, since the overall behaviour of customers and their approach to selecting products differ.

7.2 CLUSTERING BASED ASSOCIATION RULE MINING SYSTEM (CARMS)

Since the introduction of the problem of mining association rules, several generate-and-test algorithms have been proposed for the task of discovering frequent sets (Agard and Kusiak 2004). To obtain association rules for a new store from the analysis of customer transactions in the existing knowledge base, the CARMS architecture is used to predict sales. The system involves consecutive stages that communicate with one another in generating rules: data pre-processing, data partitioning, data transformation, and association rule mining. Before proceeding to rule mining, the raw data must be pre-processed to be useful for knowledge discovery. Because of the uncertainty of customer requirements and behaviour, the knowledge base has to be pre-processed. Figure 7.1 illustrates the specification of the problem domain. Based on the raw data stored in the knowledge base, target datasets are identified through data cleaning and filtering tasks such as the integration of multiple databases, removal of noise, and handling of missing data. Figure 7.2 shows the block diagram of CARMS. An illustrative skeleton of these stages is sketched below.
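The following Python skeleton is purely illustrative: the function names, the column names (cust_id, product_id, transaction_id) and the pandas representation are assumptions for exposition, not the CARMS implementation itself.

import pandas as pd

def preprocess(raw: pd.DataFrame) -> pd.DataFrame:
    """Data pre-processing: drop duplicate records and rows whose key
    fields are missing (noise removal / missing-data handling)."""
    return raw.drop_duplicates().dropna(subset=["cust_id", "product_id"])

def partition(data: pd.DataFrame, cluster_labels: pd.Series) -> dict:
    """Data partitioning: split the transaction table by customer cluster."""
    return {label: group for label, group in data.groupby(cluster_labels)}

def transform(cluster_data: pd.DataFrame) -> pd.DataFrame:
    """Data transformation: build a TID x item boolean basket matrix,
    the input format expected by frequent-pattern miners."""
    return (cluster_data
            .pivot_table(index="transaction_id", columns="product_id",
                         aggfunc="size", fill_value=0)
            .astype(bool))

# The final stage, association rule mining, is sketched in Section 7.4 below.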

Fig. 7.1: Specification of Problem Domain (a transaction database built from customer details (CD: Customer Domain) and product details (PD: Product Domain); clustering features segment the database before rules are generated)

All target data should be organized into a usable transaction database. This requires a clear understanding of the variables and the selection of the attributes that are most pertinent to generating rules. In the proposed architecture, the sales records and product details are transformed into transaction data identified by a unique Transaction Identifier (TID). The transaction data consist of customer details and the customers' affinity towards the products. Each customer is given a series of options on the selection of products based on customer attributes such as income, age and gender, which are recorded as the key operational features. The products a customer desires can be stated as related functional requirements, which serve as mandatory information for predicting sales at a new location. A minimal sketch of this transformation follows.
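The sketch below merges toy sales, customer and product tables into a single TID-keyed transaction table; every column name and value is invented for illustration.

import pandas as pd

# Toy stand-ins for the sales, customer and product tables (columns assumed).
sales = pd.DataFrame({"transaction_id": [1, 2, 3],
                      "cust_id": [10, 11, 10],
                      "product_id": [100, 101, 101]})
customers = pd.DataFrame({"cust_id": [10, 11],
                          "gender": ["F", "M"],
                          "age": [38, 52],
                          "income": [42000, 61000]})
products = pd.DataFrame({"product_id": [100, 101],
                         "product_name": ["bread", "milk"],
                         "price": [2.50, 1.20]})

# One row per TID, combining customer attributes and product details.
transactions = (sales
                .merge(customers, on="cust_id")
                .merge(products, on="product_id"))
print(transactions)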

Fig. 7.2: Block Diagram of CARMS (customer details and transaction records for the customer set C = {C1, C2, ..., Cs} are clustered; the product purchase records for each cluster, R = {R1, R2, ..., Rn}, are then used to generate rules)

7.3 CLUSTERING

Clustering is the task of segmenting a heterogeneous population into a number of more homogeneous clusters. A cluster is therefore a collection of objects that are similar to one another and dissimilar to objects belonging to other clusters. The goal of clustering is thus to determine the intrinsic grouping in a set of unlabelled data. It can be shown that there is no absolute "best" clustering criterion independent of the final aim of the clustering. Consequently, it is the user who must supply this criterion, so that the result of the clustering suits their needs. For instance, one may be interested in finding representatives for homogeneous groups (data reduction), in finding natural clusters and describing their unknown properties (natural data types), in finding useful and suitable groupings (useful data classes), or in finding unusual data objects (outlier detection).

7.4 ASSOCIATION RULE MINING

Association rules are adopted to discover interesting relationships and knowledge in a large dataset.

Definition 1: Given a set of items I = {I1, I2, ..., Is} and a database of transaction records D = {t1, t2, ..., tn}, where ti = {Ii1, Ii2, ..., Iik} and Iij ∈ I, an association rule is an implication of the form X ⇒ Y, where X, Y ⊂ I and X ∩ Y = ∅.

Definition 2: The support (s) of an association rule X ⇒ Y is the percentage of transactions in the database that contain X ∪ Y. That is, support(X ⇒ Y) = P(X ∪ Y), where P denotes probability.

Definition 3: The confidence or strength of an association rule X ⇒ Y is the ratio of the number of transactions that contain X ∪ Y to the number of transactions that contain X. That is, confidence(X ⇒ Y) = P(Y | X).

The unions of transaction records in the clusters that maximize the dependency are often more representative than other transactions. Therefore, the target transaction table can be partitioned with them to decrease the scale of the data mining without loss of information content. In general, the focus must be more on the cluster groups than on individual customers, since the groups reflect the characteristics of the individual customers. The two measures above can be computed directly from a transaction list, as in the sketch below.
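The following minimal sketch computes both measures over a toy transaction database; the data is invented for illustration.

# Support and confidence exactly as in Definitions 2 and 3.
D = [{"bread", "milk"},
     {"bread", "diaper", "beer", "eggs"},
     {"milk", "diaper", "beer", "coke"},
     {"bread", "milk", "diaper", "beer"},
     {"bread", "milk", "diaper", "coke"}]   # toy transaction database

def support(itemset, transactions):
    """P(X u Y): fraction of transactions containing every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(X, Y, transactions):
    """P(Y|X) = support(X u Y) / support(X)."""
    return support(X | Y, transactions) / support(X, transactions)

print(support({"bread", "milk"}, D))        # 0.6
print(confidence({"diaper"}, {"beer"}, D))  # 0.75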

7.5 PROPOSED DESIGN

An efficient CARMS architecture is proposed to discover customer-group-based rules. To obtain the rules, the customer and product domains are bridged. Clustering and association rule mining are combined to analyse the similarity between customer groups and their preferences for products. The complete set of rules is stored in a separate knowledge base.

7.5.1 CLUSTERING BY K-MEDOIDS

The k-means clustering algorithm is sensitive to outliers and noisy data, which can drastically alter the structure of the clusters. The k-medoids algorithm eliminates this sensitivity by using a medoid as the reference point for the similarity computation. The following variables have been used for clustering and association rule mining:

Demographic variables: Cust_id, Cust_name, Gender, Age, Education, Marital_status, Address, etc.
Product details: Product_id, Product_name, Price, Brand, Color, etc.
Transactional details: Transaction_id (TID), Cust_id, Product_id, Purchase_Date, Gender, Age, Education, Marital_status, etc.
RFMT variables: Recency, Frequency, Monetary and Term.

The RFMT variables were useful in clustering. The customer lifetime value of each customer has been calculated using the recency, frequency, monetary and term variables. The k-medoids algorithm has been applied to cluster the customers into eight groups according to the weighted RFMT values. Detailed transaction data, demographic variables and RFMT values together give better results. A hedged sketch of this clustering step appears below.
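In the sketch below, RFMT-style features are weighted and the customers are grouped into eight clusters (k = 8, as above) with a plain alternating k-medoids loop. The weights, the random stand-in data, and the update rule are illustrative assumptions only, not the thesis implementation.

import numpy as np

rng = np.random.default_rng(7)
X = rng.random((200, 4))                   # stand-in for normalized R, F, M, T values
w = np.array([0.25, 0.25, 0.35, 0.15])     # assumed RFMT weights (not from the thesis)
Xw = X * w                                 # weighted RFMT feature matrix
clv = Xw.sum(axis=1)                       # simple weighted-sum proxy for lifetime value

def kmedoids(Xw, k=8, iters=100):
    """Alternating k-medoids: assign points to the nearest medoid, then
    move each medoid to the member minimizing intra-cluster distance."""
    n = len(Xw)
    D = np.linalg.norm(Xw[:, None, :] - Xw[None, :, :], axis=2)  # pairwise distances
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(iters):
        labels = D[:, medoids].argmin(axis=1)
        new_medoids = np.array([
            np.flatnonzero(labels == j)[
                D[np.ix_(labels == j, labels == j)].sum(axis=1).argmin()]
            for j in range(k)])
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return medoids, D[:, medoids].argmin(axis=1)

medoids, labels = kmedoids(Xw)             # eight customer groups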

7.5.2 PERFORMANCE OF RFMT-BASED APRIORI ALGORITHM

Firstly, customer segments with similar RFMT values were identified so that different marketing strategies could be adopted for different segments. Secondly, demographic variables (age, gender, education, etc.) and the RFMT values of the segments were used to predict future customer behaviour and to target customer profiles more precisely. Thirdly, association rules were discovered to identify the associations between customer segments, customer profiles and the product items purchased, and thereby to recommend products with associated rankings, resulting in better customer satisfaction. Association rule mining was applied to extract recommendation rules, namely frequent purchase patterns, from each group of customers. The extracted patterns represent the common purchasing behaviour of customers with similar RFMT values and similar demographic variables. For example, not all women aged 35-40 have the same tendency to purchase a given product; their RFMT values, customer segments and the other products frequently purchased together with that product must also be considered. The RFMT-based Apriori algorithm produced the best, redundancy-free rules.

The following datasets were considered for clustering and association rule mining:
1. Supermarket dataset
2. Bookstore dataset
3. Life insurance dataset

Table 7.1 lists the best association rules for cluster 4 of the supermarket dataset.

Table 7.1: Best Association Rules for Cluster 4 of Supermarket Dataset
Minimum support: 0.15; Number of cycles performed: 17
Best rules found:
1. biscuits=t frozen foods=t fruit=t total=high ==> bread and cake=t  <conf:(0.92)> lift:(1.27) lev:(0.03) conv:(3.35)
2. baking needs=t biscuits=t fruit=t total=high ==> bread and cake=t  <conf:(0.92)> lift:(1.27) lev:(0.03) conv:(3.28)
3. baking needs=t frozen foods=t fruit=t total=high ==> bread and cake=t  <conf:(0.92)> lift:(1.27) lev:(0.03) conv:(3.27)
4. biscuits=t fruit=t vegetables=t total=high ==> bread and cake=t  <conf:(0.92)> lift:(1.27) lev:(0.03) conv:(3.26)
5. party snack foods=t fruit=t total=high ==> bread and cake=t  <conf:(0.91)> lift:(1.27) lev:(0.04) conv:(3.15)
6. biscuits=t frozen foods=t vegetables=t total=high ==> bread and cake=t  <conf:(0.91)> lift:(1.26) lev:(0.03) conv:(3.06)
7. baking needs=t biscuits=t vegetables=t total=high ==> bread and cake=t  <conf:(0.91)> lift:(1.26) lev:(0.03) conv:(3.01)
8. biscuits=t fruit=t total=high ==> bread and cake=t  <conf:(0.91)> lift:(1.26) lev:(0.04) conv:(3)
9. frozen foods=t fruit=t vegetables=t total=high ==> bread and cake=t  <conf:(0.91)> lift:(1.26) lev:(0.03) conv:(3)
10. frozen foods=t fruit=t total=high ==> bread and cake=t  <conf:(0.91)> lift:(1.26) lev:(0.04) conv:(2.92)
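The runs summarized in Tables 7.1 and 7.2 used Weka-style Apriori output; a rough Python equivalent, assuming the third-party mlxtend library and a tiny hand-made basket frame (item names borrowed from the tables, data invented), might look like this.

import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# One boolean row per transaction in a cluster, one column per item (toy data).
baskets = pd.DataFrame({
    "biscuits":       [1, 1, 0, 1],
    "fruit":          [1, 1, 1, 0],
    "frozen foods":   [0, 1, 1, 1],
    "bread and cake": [1, 1, 1, 0],
}).astype(bool)

freq = apriori(baskets, min_support=0.17, use_colnames=True)
rules = association_rules(freq, metric="confidence", min_threshold=0.9)
# mlxtend reports confidence, lift, leverage and conviction -- the same
# measures shown in the tables above.
print(rules[["antecedents", "consequents", "support",
             "confidence", "lift", "leverage", "conviction"]])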

Table 7.2 lists the comparison of the best association rules with minimum support 0.17 and 0.18. Table 7.3 shows the best association rules for cluster 3 of the bookstore dataset. The transactions have been stored in binary format for the bookstore and life insurance datasets. Table 7.4 shows the comparison of the best association rules with minimum support 0.55 and 0.6.

Table 7.2: Comparison of Best Association Rules with Minimum Support (0.17 & 0.18)

Minimum support: 0.17 (787 instances); Number of cycles performed: 17
Best rules found:
1. biscuits=t fruit=t total=high ==> bread and cake=t
2. frozen foods=t fruit=t total=high ==> bread and cake=t
3. biscuits=t milk-cream=t total=high ==> bread and cake=t
4. biscuits=t vegetables=t total=high ==> bread and cake=t
5. baking needs=t fruit=t total=high ==> bread and cake=t
6. tissues-paper prd=t fruit=t total=high ==> bread and cake=t

Minimum support: 0.18 (833 instances); Number of cycles performed: 17
Best rules found:
1. biscuits=t fruit=t total=high ==> bread and cake=t
2. frozen foods=t fruit=t total=high ==> bread and cake=t
3. biscuits=t vegetables=t total=high ==> bread and cake=t
4. baking needs=t fruit=t total=high ==> bread and cake=t

Table 7.3: Best Association Rules for Cluster 3 of Bookstore Dataset
Minimum support: 0.45; Number of cycles performed: 11
Best rules found:
1. GeogBooks=1 PoliticsBooks=1 ==> RefBooks=0  <conf:(1)> lift:(1.22) lev:(0.08) conv:(13.05)
2. YouthBooks=0 RefBooks=0 ==> EnglishBooks=0  <conf:(0.99)> lift:(1.22) lev:(0.1) conv:(8.63)
3. FrenchBooks=1 ==> RefBooks=0  <conf:(0.99)> lift:(1.21) lev:(0.08) conv:(7.25)
4. FrenchBooks=0 ==> EnglishBooks=0  <conf:(0.99)> lift:(1.22) lev:(0.09) conv:(7.5)
5. YouthBooks=0 ScienceBooks=0 RefBooks=0 ==> EnglishBooks=0  <conf:(0.99)> lift:(1.22) lev:(0.09) conv:(7.5)
6. CookBooks=1 GeogBooks=1 ==> EnglishBooks=0  <conf:(0.99)> lift:(1.22) lev:(0.09) conv:(7.41)
7. ScienceBooks=0 ArtBooks=0 ==> ItBooks=0  <conf:(0.99)> lift:(1.5) lev:(0.16) conv:(13.23)
8. ItBooks=0 FrenchBooks=0 ==> EnglishBooks=0  <conf:(0.99)> lift:(1.21) lev:(0.08) conv:(7.22)
9. ItBooks=0 EnglishBooks=0 ==> FrenchBooks=0  <conf:(0.99)> lift:(1.97) lev:(0.23) conv:(19.25)
10. ScienceBooks=0 FrenchBooks=1 ==> RefBooks=0  <conf:(0.99)> lift:(1.21) lev:(0.08) conv:(6.89)

Table 7.4: Comparison of Best Association Rules with Minimum Support (0.55 & 0.6)

Minimum support: 0.55; Number of cycles performed: 9
Best rules found:
1. YouthBooks=0 RefBooks=0 ==> EnglishBooks=0
2. YouthBooks=0 ==> EnglishBooks=0
3. CookBooks=1 ==> EnglishBooks=0
4. YouthBooks=0 EnglishBooks=0 ==> RefBooks=0
5. YouthBooks=0 ==> RefBooks=0
6. ChildBooks=1 GeogBooks=1 ==> ScienceBooks=0
7. YouthBooks=0 ==> RefBooks=0 EnglishBooks=0
8. ScienceBooks=0 GeogBooks=1 ==> ChildBooks=1

Minimum support: 0.6; Number of cycles performed: 8
Best rule found:
1. YouthBooks=0 ==> EnglishBooks=0

Table 7.5 shows the best association rules for cluster 3 of the life insurance dataset. Table 7.6 lists the comparison of the best association rules with minimum support 0.5 and 0.55. Higher confidence should yield better prediction (Jo Ting et al. 2006). The discovered rules identify the associations between customer segments, customer profiles and the product items purchased, and are therefore used to recommend products with associated rankings, resulting in better customer satisfaction.

Table 7.5: Best Association Rules for Cluster 3 of Life Insurance Dataset
Minimum support: 0.45; Number of cycles performed: 11
Best rules found:
1. BimaPatchath=0 JeevanVarsha=0 ==> PensionPlan=0  <conf:(0.99)> lift:(1.5) lev:(0.16) conv:(13.23)
2. JeevanVarsha=0 ==> PensionPlan=0  <conf:(0.97)> lift:(1.47) lev:(0.17) conv:(7.56)
3. WealthPlus=0 BimaPatchath=0 ==> JeevanSaral=0  <conf:(0.94)> lift:(1.15) lev:(0.07) conv:(2.57)
4. JeevanAnand=1 KomalJeevan=1 ==> MarketPlus=1  <conf:(0.94)> lift:(1.24) lev:(0.09) conv:(3.21)
5. MarketPlus=1 JeevanSaral=0 KomalJeevan=1 ==> BimaPatchath=0  <conf:(0.94)> lift:(1.08) lev:(0.03) conv:(1.68)
6. WealthPlus=0 ==> JeevanSaral=0  <conf:(0.92)> lift:(1.12) lev:(0.06) conv:(2.01)
7. MarketPlus=1 KomalJeevan=1 ==> BimaPatchath=0  <conf:(0.92)> lift:(1.06) lev:(0.03) conv:(1.43)
8. MarketPlus=1 JeevanAnand=1 ==> BimaPatchath=0  <conf:(0.91)> lift:(1.04) lev:(0.02) conv:(1.24)
9. BimaPatchath=0 KomalJeevan=1 ==> MarketPlus=1  <conf:(0.9)> lift:(1.19) lev:(0.09) conv:(2.22)
10. JeevanSaral=0 KomalJeevan=1 ==> BimaPatchath=0  <conf:(0.9)> lift:(1.04) lev:(0.02) conv:(1.18)

Table 7.6: Comparison of Best Association Rules with Minimum Support (0.5 & 0.55)

Minimum support: 0.5; Number of cycles performed: 10
Best rules found:
1. JeevanVarsha=0 ==> PensionPlan=0
2. WealthPlus=0 BimaPatchath=0 ==> JeevanSaral=0
3. WealthPlus=0 ==> JeevanSaral=0
4. MarketPlus=1 KomalJeevan=1 ==> BimaPatchath=0
5. BimaPatchath=0 KomalJeevan=1 ==> MarketPlus=1
6. JeevanSaral=0 KomalJeevan=1 ==> BimaPatchath=0

Minimum support: 0.55; Number of cycles performed: 9
Best rules found:
1. WealthPlus=0 ==> JeevanSaral=0
2. MarketPlus=1 KomalJeevan=1 ==> BimaPatchath=0
3. BimaPatchath=0 KomalJeevan=1 ==> MarketPlus=1

7.5.3 PERFORMANCE OF RFMT-BASED FP-GROWTH ALGORITHM

The RFMT-based Apriori algorithm, despite its simple logic and inherent pruning advantage, suffers from the limitation of a huge number of repeated input scans. The RFMT-based FP-Growth algorithm is therefore used to extract important and effective rules. It is an efficient and scalable method for mining the complete set of frequent patterns by pattern-fragment growth, using an extended prefix-tree structure, the frequent-pattern (FP) tree, to store compressed and crucial information about frequent patterns. The approach reduces the size of the database representation by keeping the more frequently occurring items near the root, thereby increasing the likelihood of sharing nodes in the tree structure. The constructed FP-tree is mined to generate the frequent patterns subject to the minimum support threshold. Table 7.7 shows the best rules for cluster 4 of the supermarket dataset. The RFMT-based FP-Growth algorithm produced the best rules. Table 7.8 shows the best rules for cluster 3 of the bookstore dataset. Table 7.9 lists the best rules for cluster 3 of the life insurance dataset.
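A rough sketch of this FP-tree mining step is given here, ahead of the result tables; it assumes the third-party mlxtend library and invented one-hot transaction data (item names reused from Table 7.5), and is not the implementation used in this work.

import pandas as pd
from mlxtend.frequent_patterns import fpgrowth, association_rules

baskets = pd.DataFrame({                  # toy one-hot transactions
    "MarketPlus":  [1, 1, 0, 1],
    "KomalJeevan": [1, 0, 1, 1],
    "JeevanAnand": [1, 1, 1, 0],
    "PensionPlan": [1, 0, 1, 1],
}).astype(bool)

# fpgrowth builds the FP-tree once and mines it by pattern-fragment growth,
# avoiding Apriori's repeated candidate-generation scans over the database.
freq = fpgrowth(baskets, min_support=0.3, use_colnames=True)
rules = association_rules(freq, metric="confidence", min_threshold=0.9)
print(rules.sort_values("confidence", ascending=False).head(6))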

Table 7.7: Best Rules for Supermarket Dataset

Minimum support: 0.15; found 16 rules (displaying top 6):
1. [fruit=t, frozen foods=t, biscuits=t, total=high] ==> [bread and cake=t]
2. [fruit=t, baking needs=t, biscuits=t, total=high] ==> [bread and cake=t]
3. [fruit=t, baking needs=t, frozen foods=t, total=high] ==> [bread and cake=t]
4. [fruit=t, vegetables=t, biscuits=t, total=high] ==> [bread and cake=t]
5. [fruit=t, party snack foods=t, total=high] ==> [bread and cake=t]
6. [vegetables=t, frozen foods=t, biscuits=t, total=high] ==> [bread and cake=t]

Minimum support: 0.18; found 4 rules (displaying top 4):
1. [fruit=t, biscuits=t, total=high] ==> [bread and cake=t]
2. [fruit=t, frozen foods=t, total=high] ==> [bread and cake=t]
3. [vegetables=t, biscuits=t, total=high] ==> [bread and cake=t]
4. [fruit=t, baking needs=t, total=high] ==> [bread and cake=t]

Table 7.8: Best Rules for Bookstore Dataset

Minimum support: 0.4; found 22 rules (displaying top 6):
1. [PoliticsBooks=1, FrenchBooks=1] ==> [GeogBooks=1]
2. [ChildBooks=1, GeogBooks=1, PoliticsBooks=1] ==> [FrenchBooks=1]
3. [ChildBooks=1, GeogBooks=1, FrenchBooks=1] ==> [PoliticsBooks=1]
4. [ChildBooks=1, PoliticsBooks=1, FrenchBooks=1] ==> [GeogBooks=1]
5. [GeogBooks=1, PoliticsBooks=1] ==> [FrenchBooks=1]
6. [GeogBooks=1, FrenchBooks=1] ==> [PoliticsBooks=1]

Minimum support: 0.45; found 2 rules (displaying top 2):
1. [GeogBooks=1, CookBooks=1] ==> [ChildBooks=1]
2. [FrenchBooks=1] ==> [ChildBooks=1]

Table 7.9: Best Rules for Life Insurance Dataset

Minimum support: 0.3; found 20 rules (displaying top 6):
1. [KomalJeevan=1, PensionPlan=1] ==> [JeevanAnand=1]
2. [MarketPlus=1, KomalJeevan=1, PensionPlan=1] ==> [JeevanAnand=1]
3. [KomalJeevan=1, JeevanVarsha=1, PensionPlan=1] ==> [JeevanAnand=1]
4. [MarketPlus=1, KomalJeevan=1, JeevanVarsha=1, PensionPlan=1] ==> [JeevanAnand=1]
5. [KomalJeevan=1, JeevanVarsha=1] ==> [MarketPlus=1]
6. [PensionPlan=1] ==> [JeevanAnand=1]

Minimum support: 0.4; found 2 rules (displaying top 2):
1. [KomalJeevan=1, JeevanAnand=1] ==> [MarketPlus=1]
2. [JeevanVarsha=1] ==> [MarketPlus=1]

7.6 SUMMARY

CARMS is proposed to predict customer behaviour. The system involves consecutive stages that communicate with one another in generating rules: data pre-processing, data partitioning, data transformation and association rule mining. Customers with similar purchasing behaviour are first grouped by means of clustering techniques. Finally, for each cluster, association rules are used to identify the products that are frequently bought together by the customers of each segment.