MINING CTMSPS IN LBS


Pooja Mauskar*, Manisha Naoghare**
* Department of Computer Engineering, SVIT, Chincholi
** Department of Computer Engineering, SVIT, Chincholi

ABSTRACT
The increasing popularity of wireless technology has led to research on mining and predicting mobile movements and their associated transactions. Mobile patterns discovered from logs may not be precise enough for prediction, because the varying mobile behaviors among users and across temporal periods are not considered. In this work, a novel algorithm named CTMSP-Mine is proposed to discover Cluster-based Temporal Mobile Sequential Patterns (CTMSPs), and a prediction strategy is proposed to predict mobile behaviors. In CTMSP-Mine, user clusters are constructed by a novel algorithm named Smart CAST, and similarities between users are evaluated by the proposed measure, LBS-Alignment. A time segmentation approach is also proposed to find the time intervals in which similar mobile characteristics exist.

Keywords - LBS, mining techniques, clustering methods, mobile environment

I. INTRODUCTION
Emerging trends in wireless communication techniques and the popularity of mobile phones, PDAs, and GPS-enabled cellular phones have contributed to a new business model. Mobile users can request services through their mobile devices via an Information Service and Application Provider (ISAP) from anywhere at any time. This business model is known as Mobile Commerce (MC), which provides location-based services (LBS) through mobile phones. The popularity of MC is increasing along with e-commerce. MC is based on a cellular network composed of several base stations. Communication within a hexagonal area called a cell is controlled by a base station.

While users move, information about their locations and service requests is stored in a centralized mobile transaction database. The MC scenario in Fig. 1 shows a user who moves within the mobile network and requests services in the corresponding cell via a mobile device. The user moves in the sequence shown in Fig. 1a, where a cell is underlined only if a service is requested there. When the user moves to location A at time 5, the requested service is s1, which can be represented by a service transaction record as shown in Fig. 1b. Because a large number of mobile transaction records are produced by users' mobile behavior, the mobile transaction database is complicated. Each cluster has different mobile behaviors at various time intervals. Prediction can be more precise if we can find the corresponding mobile patterns in each user cluster and time interval. Effective mobile behavior mining systems are required to provide precise location-based services for users.

Fig. 1: An example of a mobile transaction sequence. a) Moving sequences. b) Service sequences [1].

In this project, a novel data mining algorithm named CTMSP-Mine [1] is proposed to mine the CTMSPs of users effectively. Prediction strategies are proposed to predict a user's subsequent behavior from the discovered CTMSPs. To mine CTMSPs, a transaction clustering algorithm named Smart CAST [18] is first proposed; it builds a cluster model for mobile transactions based on the proposed LBS-Alignment similarity measure. Then a genetic algorithm (GA) is used to produce a more suitable time interval table. The proposed method discovers all CTMSPs based on the produced user clusters and time interval table.

II. LITERATURE SURVEY
In recent years, a number of studies have discussed the use of data mining techniques to discover useful rules and patterns from the World Wide Web (WWW), transaction databases, and mobility data.

These studies can be broadly classified into mobile pattern mining techniques, clustering methods, temporal pattern mining techniques, and mobile behavior prediction. For transaction databases, association rule mining [2] was proposed to find important items in a transaction database. Agrawal and Srikant [2] proposed the Apriori algorithm to mine association rules. Sequential pattern mining was first introduced in [4] to search for time-ordered patterns, known as sequential patterns, within transaction databases.

Clustering analysis can be roughly divided into two categories. The first category concerns similarity measures, which may affect the final clustering results directly. LCSS [5], DTW [6], ERP [7], and Euclidean distance [9] are the most popular similarity measures for string sequences or time series data. Since a mobile transaction sequence is not only a time series of movements but also carries service sequences, it is crucial to define the similarity between different sequences properly. The second category concerns clustering methods. The best-known clustering method is the k-means algorithm, which is partition based. Other partition-based methods include k-medoids and Partitioning Around Medoids (PAM). These methods partition the data set into k clusters based on similarities between data items, where k is a parameter specified by the user. Among density-based clustering methods, Ben-Dor and Yakhini [8] proposed the Cluster Affinity Search Technique (CAST), which requires an affinity threshold t, where 0 < t < 1. However, CAST is a basic clustering algorithm and hence cannot evaluate. Moreover, the similarity between mobile transactions cannot be measured by the Euclidean distance. Besides, most clustering methods require the user to set some parameters before the clustering task. In real applications, however, it is difficult to determine the right parameters manually.

Previous studies and applications consider time to be an important factor. The segmenting points of the time intervals influence the precision of mobile behavior prediction. Because it is not easy to find the best segmentation points of time intervals, a genetic algorithm is generally used to solve such complicated problems.

Mobile behavior prediction can be roughly divided into two categories. The first category is time-series-based prediction, which can be divided into two types: 1) linear models and 2) nonlinear models. The second category is pattern-based prediction. However, these methods can only predict the next spatial locations of objects. SMAP-Mine [20] was the first to discover sequential mobile access rules and predict the user's next locations and services. Yun and Chen [21] proposed the MSP to predict the next mobile behaviors. However, no existing work considers the temporal factor, i.e., that users may have different mobile behaviors at different times.

III. IMPLEMENTATION DETAILS

1. Block Diagram
Figure 2 shows the conceptual block diagram with all the major tasks of the system. The system is divided into four main parts: 1) clustering of mobile users; 2) time segmentation of mobile transaction sequences; 3) mining of mobile behaviors; and 4) mobile behavior prediction for mobile users using a combined approach.

Fig 2: Conceptual Block Diagram

The methodology is organized as follows. Section 3 covers the clustering of mobile transactions: subsection 3.1 explains the LBS-Alignment algorithm and subsection 3.2 briefly describes the CAST algorithm. Section 3.3 explains the segmentation of mobile transactions in detail and describes the GA, with subsections on selection, crossover, and the fitness function. Section 3.4 describes the mining of mobile transactions, with subsections on frequent-transaction mining, mobile transaction database transformation, and the CTMSP mining technique. Section 3.5 presents the prediction strategies.

2. System Framework and Methodology
Fig. 3 shows the proposed system framework. The system has an offline mechanism for CTMSP mining and an online engine for mobile behavior prediction. When mobile users move within the mobile network, the information about time, locations, and service requests is stored in the mobile transaction database. Table 1 shows an example of a mobile transaction database that contains seven records. Each user's record consists of tuples holding the time at which the user requests a service, the location of the user, and the service number.

TABLE 1: MOBILE TRANSACTION DATABASE

In the offline data mining mechanism, two design techniques and the CTMSP-Mine algorithm are used to discover the knowledge. First, the CAST algorithm is proposed to cluster the mobile transaction sequences; within it, LBS-Alignment is proposed to evaluate the similarity of mobile transaction sequences. Second, a GA-based time segmentation algorithm is proposed to find the most suitable time intervals. After clustering and segmentation, a user cluster table and a time interval table are generated, respectively. Third, the CTMSP-Mine algorithm is proposed to mine the CTMSPs from the mobile transaction database according to the user cluster table and the time interval table. In the online prediction engine, a behavior prediction strategy is proposed to predict subsequent behaviors according to the mobile user's previous mobile transaction sequences and the current time.

Fig 3: System Framework

The main purpose of this framework is to provide mobile users with a precise and efficient mobile behavior prediction system.

3. Clustering mobile transaction database
In a mobile transaction database, users in different user groups may have different mobile transaction behaviors. The first task is therefore to cluster the mobile transaction sequences. A parameter-less clustering algorithm, CAST, is proposed. Before CAST is performed, a similarity matrix S is generated from the mobile transaction database. The entry S[i,j] represents the similarity of mobile transaction sequences i and j in the database, with values in the range [0, 1].

3.1 Location Based Service Alignment algorithm
A mobile transaction sequence can be viewed as a sequence string, where each element is a mobile transaction. LBS-Alignment is based on the consideration that two mobile transaction sequences are more similar when the orders and timestamps of their mobile transactions are more similar. Based on this concept, the time penalty (TP) and the service reward (SR) of LBS-Alignment are defined. The base similarity score is set to 0.5. Two mobile transactions can be aligned if their locations are the same. Otherwise, a location penalty is applied to decrease the similarity score. The location penalty is defined as 0.5 / (|s1| + |s2|), where |s1| and |s2| are the lengths of sequences s1 and s2, respectively. When two sequences are totally different, their similarity score is 0.

When two mobile transactions are aligned, their time penalty and service reward are measured. TP reflects their time distance: the farther apart the two transactions are in time, the larger the time penalty. TP, which decreases the similarity score, is defined as |s1.time - s2.time| / len, where len is the time length. SR reflects the similarity of the service requests: the more similar the service requests, the larger the service reward. SR, which increases the similarity score, is defined as |s1.services ∩ s2.services| / |s1.services ∪ s2.services|.

Fig. 4 shows the procedure of the LBS-Alignment measure. The input data are two mobile transaction sequences (line 1). The output is the similarity between the two sequences, in the range from 0 to 1 (line 2). Some parameters are initialized (line 4 to line 7), and the base similarity score is set to 0.5 (line 5). Dynamic programming is used to calculate M[i,j] (line 8 to line 18), where M[i,j] is the value of matrix M in column i and row j and M is the score matrix of LBS-Alignment. In this procedure, if the locations of two transactions are the same (line 10), both the time penalty (line 11) and the service reward (line 12) are calculated to update the similarity score (line 13). Otherwise (line 14), the location penalty is applied to decrease the similarity score (line 15). Finally, M[|s1|,|s2|] is returned as the similarity score of the two mobile transaction sequences (line 19).

Fig. 4: LBS-Alignment algorithm

3.2 Cluster Affinity Search Technique (CAST)
The algorithm relies on the average similarity (affinity) between unassigned vertices and the current cluster seed to make its next decision. However, it differs from the theoretical algorithm in some aspects:
(1) The theoretical algorithm repeats the same process for many initial seeds. Here, cleaning steps are used to remove spurious elements from cluster seeds and avoid the repetition.
(2) CAST adds (and removes) elements from the current seed one at a time (and not independently, as in the theoretical algorithm). Heuristically, this helps by strengthening the constructed seed, thus improving the decision base for the next step.
(3) CAST handles more general inputs. Namely, it allows the user to specify both a real-valued similarity matrix and a threshold parameter that determines what is considered significantly similar. This parameter controls the number and sizes of the produced clusters.

The input to the algorithm is a pair <S, t>, where S is an n-by-n similarity matrix and t is a similarity cutoff. The clusters are constructed one at a time. The cluster currently under construction is denoted by Copen. The affinity of an element x, denoted by a(x), is defined as the sum of the similarity values between x and the elements in Copen. An element x is of high affinity if a(x) >= t|Copen|; otherwise, x is of low affinity. An element's status (high or low affinity) therefore depends on Copen. Roughly speaking, CAST alternates between adding high-affinity elements to Copen and removing low-affinity elements from it. When this process stabilizes, Copen is closed and a new cluster is started. Pseudo-code of the algorithm is given in Fig. 5.

The "cleaning" steps in CAST serve to avoid a common shortcoming shared by many popular clustering techniques (such as single linkage, complete linkage, group average, and centroid): due to their "greedy" nature, once a decision to join two clusters is made, it cannot be reversed.

Fig 5: CAST Algorithm [8]
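The two clustering building blocks above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (the pseudo-code figures are not reproduced in this text). The tuple layout (time, location, services), the exact dynamic-programming recurrence, the choice to scale TP and SR by the location penalty, and the step budget in `cast` are all assumptions made for the sketch.

```python
from typing import FrozenSet, List, Sequence, Tuple

# Assumed record layout: (time, location, set of requested services).
Transaction = Tuple[int, str, FrozenSet[str]]


def lbs_alignment(s1: Sequence[Transaction], s2: Sequence[Transaction],
                  time_len: int) -> float:
    """Alignment-style similarity of two mobile transaction sequences."""
    n, m = len(s1), len(s2)
    p = 0.5 / (n + m)                      # location penalty from Section 3.1
    M = [[0.0] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.5                          # base similarity score
    for i in range(1, n + 1):
        M[i][0] = M[i - 1][0] - p
    for j in range(1, m + 1):
        M[0][j] = M[0][j - 1] - p
    for i in range(1, n + 1):
        t1, loc1, sv1 = s1[i - 1]
        for j in range(1, m + 1):
            t2, loc2, sv2 = s2[j - 1]
            best = max(M[i - 1][j], M[i][j - 1]) - p   # skip one transaction
            if loc1 == loc2:                           # same location: align
                tp = abs(t1 - t2) / time_len           # time penalty
                union = sv1 | sv2
                sr = len(sv1 & sv2) / len(union) if union else 0.0
                # Assumption: TP and SR are scaled by the penalty unit p.
                best = max(best, M[i - 1][j - 1] + p * (sr - tp))
            M[i][j] = best
    return M[n][m]


def cast(S: List[List[float]], t: float) -> List[List[int]]:
    """CAST-style clustering of indices 0..n-1 from similarity matrix S."""
    unassigned = set(range(len(S)))
    clusters = []
    while unassigned:
        c_open: List[int] = []
        affinity = {x: 0.0 for x in unassigned}        # a(x) w.r.t. c_open
        budget = 10 * len(unassigned) ** 2 + 10        # assumed oscillation guard
        changed = True
        while changed and budget > 0:
            changed = False
            budget -= 1
            outside = [x for x in unassigned if x not in c_open]
            if outside:                                # ADD step
                u = max(outside, key=lambda x: (affinity[x], -x))
                if affinity[u] >= t * len(c_open):
                    c_open.append(u)
                    for x in unassigned:
                        affinity[x] += S[x][u]
                    changed = True
                    continue
            if c_open:                                 # REMOVE (cleaning) step
                v = min(c_open, key=lambda x: (affinity[x], x))
                if affinity[v] < t * len(c_open):
                    c_open.remove(v)
                    for x in unassigned:
                        affinity[x] -= S[x][v]
                    changed = True
        if not c_open:                                 # defensive fallback
            c_open = [min(unassigned)]
        clusters.append(sorted(c_open))
        unassigned -= set(c_open)
    return clusters
```

With this scaling, two sequences with no common location score 0, matching the statement above, while two identical sequences score 0.75. A different scaling of TP and SR would change the upper end of the range.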

3.3 Segmentation of Mobile Transactions
In a mobile transaction database, similar mobile behaviors exist within certain time segments. Hence, it is important to choose suitable settings for time segmentation so as to discriminate the characteristics of mobile behaviors in different time segments. A GA-based method is proposed to obtain automatically the most suitable time segmentation table with common mobile behaviors.

Fig. 7 shows the procedure of the time segmentation method, named the Get Number of Time Segmenting Points (GetNTSP) algorithm. The input data are a mobile transaction database D and its time length T (line 01). The output is the number of time segmenting points (line 02). For each item, the total number of occurrences at each time point is accumulated (line 07 to line 11). Therefore, each item (location, service) yields a curve of count distribution, as shown in Fig. 8. For all curves, the time points with the largest change rate are found (line 13). The rate of change is defined as (c[i+1] - c[i]) / (1 + c[i]), where c[i] is the total number of occurrences of the item at time point i. The occurrences of all these time points are counted (line 15), and the time points whose counts are larger than or equal to the average of all these occurrences are taken as the time point sequence (TPS) (line 17). In the time point sequence, the average time distance a between two neighboring time points is calculated (line 18). Then the number of neighboring time point pairs whose time distance is higher than a is calculated (line 19 to line 23). The result is the time segmentation count (line 24).

Fig 7: GetNTSP Algorithm

Genetic Algorithm
Once the number of time segmenting points is obtained from the GetNTSP algorithm, a Genetic Algorithm (GA) is used to obtain the most suitable time intervals. A GA is a search heuristic based on the process of natural evolution and survival of the fittest. GA was initially proposed by John Holland in the early 70s while researching cellular automata. Evaluating better candidates (in this case, time segments) with a GA is an iterative process driven by a fitness function, which yields the fittest time segmentation for a given random input. The weakest chromosomes become obsolete at the end of each iteration as evolution proceeds through the GA operators described below. Fig 8 shows the steps in GA.

Fig 8: Steps in GA

Selection
For the selection operator, a proportion of the current time segments is selected in each iteration to produce the next population, resulting in a new generation. Individual chromosomes, i.e., time segments, are selected based on their fitness value. The fitness function measures the quality of the represented solution. The higher the fitness value of a candidate, the higher the probability that it is selected.

Crossover
The next step in producing the second generation of chromosomes is crossover, which is essentially recombination. One-point crossover, governed by a crossover probability, is applied. To breed the next generation, a crossover point on both parent chromosomes is selected at random. All time segments beyond the crossover point are interchanged between the two parent chromosomes.
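The selection and crossover operators described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a chromosome is a fixed-length list of segmenting time points, uses fitness-proportionate (roulette-wheel) selection as one common way to make selection probability grow with fitness, and sorts each child so its points stay in time order. The function names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Chromosome = List[int]  # assumed encoding: list of segmenting time points


def select_parents(population: Sequence[Chromosome],
                   fitness_values: Sequence[float],
                   k: int, rng: random.Random) -> List[Chromosome]:
    """Pick k chromosomes with probability proportional to their fitness."""
    return rng.choices(list(population), weights=list(fitness_values), k=k)


def one_point_crossover(parent_a: Chromosome, parent_b: Chromosome,
                        point: int) -> Tuple[Chromosome, Chromosome]:
    """Swap everything beyond the crossover point between the two parents."""
    child_a = sorted(parent_a[:point] + parent_b[point:])
    child_b = sorted(parent_b[:point] + parent_a[point:])
    return child_a, child_b


def breed(parent_a: Chromosome, parent_b: Chromosome,
          crossover_prob: float, rng: random.Random
          ) -> Tuple[Chromosome, Chromosome]:
    """Apply one-point crossover with the given probability."""
    if len(parent_a) < 2 or rng.random() >= crossover_prob:
        return list(parent_a), list(parent_b)      # parents pass unchanged
    point = rng.randrange(1, len(parent_a))        # random crossover point
    return one_point_crossover(parent_a, parent_b, point)
```

For example, crossing the hypothetical parents [5, 13, 20] and [7, 10, 28] at point 1 yields the children [5, 10, 28] and [7, 13, 20].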

The resulting chromosomes are the children, i.e., the next generation. Only the best time segments from the first generation are selected for breeding, so that a stronger generation is produced.

Fitness Function
A better time interval segmentation results in a higher standard deviation of the frequency table. Therefore, the fitness function of chromosome X is defined in (3.1).

$$\mathrm{Fitness}(X) = \frac{1}{\mathrm{Len}(X)+1} \sum_{i=1}^{\mathrm{Len}(X)+1} \sqrt{\frac{\sum_{c=1}^{N_c} \sum_{s=1}^{N_s} \left( T_i[c,s] - \overline{T_i} \right)^2}{N_c \times N_s}} \qquad (3.1)$$

Here N_c is the total number of cells, N_s is the total number of services, T_i[c,s] is the request count of cell c and service s in time interval T_i, and \overline{T_i} is the average service request count in T_i.

3.4 Mining mobile transactions
A mobile transaction database is complicated, since a huge number of mobile transaction logs is produced by users' mobile behaviors. Data mining is a widely used technique for discovering valuable information in a complex data set, and a number of studies have discussed mobile behavior mining. However, mobile behaviors vary among different user clusters and across time intervals. The prediction of mobile behavior will be more precise if we can find the corresponding mobile patterns in each user cluster and time interval. To provide precise location-based services for users, effective mobile behavior mining systems are urgently required. To mine the cluster-based temporal mobile sequential patterns efficiently, a novel method named CTMSP-Mine is proposed. The CTMSP-Mine algorithm can be divided into three main steps: 1) Frequent-Transaction Mining, 2) Mobile Transaction Database Transformation, and 3) CTMSP Mining.

Frequent Transaction Mining
In this phase, the frequent transactions (F-Transactions) in each user cluster and time interval are mined by applying a modified Apriori algorithm [2]. At first, the support of each cell and service is counted in each user cluster and time interval according to the user cluster table and the time interval table. The patterns whose support satisfies the user-specified minimal support threshold T_SUP, i.e., the frequent 1-transactions, are kept. A candidate 2-transaction is generated by joining two frequent 1-transactions if their user clusters, time intervals, and cells are the same. Then the next patterns, i.e., the frequent 2-transactions whose support is larger than T_SUP, are kept. Finally, the same procedure is repeated until no candidate transaction is generated. The frequent transactions are shown in Table 2. In addition, a service mapping table is constructed to transform services into the F-Transactions in Table 2. Each service set is represented by a contiguous and unique symbol LS_i (Large Service i). The mapping reduces the time required to check whether a mobile sequential pattern is contained in a mobile transaction sequence. After frequent transaction mapping, the frequent 1-CTMSPs can be obtained, as shown in Table 3.

Table 2. Frequent Transactions [1]

Mobile transaction database transformation
In this phase, F-Transactions are used to transform each mobile transaction sequence S into a frequent mobile transaction sequence. According to Table 2, if a transaction T in S is frequent, T is transformed into the corresponding F-Transaction. Otherwise, the cell of T is transformed into a part of the path. Table 4 shows the frequent mobile transaction database transformed from Table 1. The main objectives and advantages are: 1) service sets can be represented by symbols for efficient processing; and 2) transactions whose support is less than the minimal support threshold can be eliminated to reduce the size of the database.

Table 3. Frequent 1 Transactions [1]

CTMSPs Mining
In this phase, all the CTMSPs are mined from the frequent mobile transaction database. The frequent 1-CTMSPs are obtained in the frequent-transaction mining phase. The mining algorithm uses a two-level tree named the Cluster-based Temporal Mobile Sequential Pattern Tree (CTMSP-Tree). The internal nodes of the tree store the frequent mobile transactions, and the leaf nodes store the corresponding paths. Moreover, every parent node of a leaf node is designed as a hash table that stores the combinations of user cluster tables and time interval tables. The goal of the CTMSP-Tree is to generate candidate mobile sequential patterns efficiently, because the tree can quickly check whether two patterns have the same first and last transactions. The procedure of CTMSP-Tree generation is as follows:

Step 1: CTMSP-Mine generates candidate 2-CTMSPs by hashing each combination of frequent transactions from the frequent mobile transaction sequences in each pair of user cluster and time interval, and then stores the results in the CTMSP-Tree.
Step 2: To identify the frequent 2-CTMSPs, CTMSP-Mine keeps the candidate patterns whose support is larger than the minimal support threshold.
Step 3: CTMSP-Mine counts the support of the candidate 3-CTMSPs and identifies the frequent 3-CTMSPs.
Step 4: Step 3 is repeated until no more candidate patterns can be generated.

Table 4. Frequent Mobile Transaction Database

3.5 Prediction Strategies
Three prediction strategies for selecting the appropriate CTMSP are proposed to predict the mobile behaviors of users: 1) the patterns are selected only from the cluster the user belongs to; 2) the patterns are selected only from the time interval corresponding to the current time; and 3) the patterns are selected only from those that match the user's recent mobile behaviors. If more than one pattern satisfies the above conditions, the one with the maximal support is selected. The CTMSPs are thus selected from the corresponding user cluster and time interval.

IV. EXPERIMENTAL RESULTS
After the coding was completed, the implemented algorithms were tested on the sample dataset shown in TABLE 1. All work was done in C# on a 2.4 GHz machine with 4 GB of memory running Windows 7.
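Before the stepwise results, the pattern selection rule of Section 3.5 can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `CTMSP` record, its field names, the flat list of patterns (instead of a CTMSP-Tree lookup), and the reading of "match the user's recent mobile behaviors" as a suffix match on (cell, service) pairs are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Step = Tuple[str, str]  # assumed encoding of one behavior: (cell, service)


@dataclass(frozen=True)
class CTMSP:
    cluster: int                 # user cluster the pattern was mined in
    interval: int                # time interval the pattern was mined in
    premise: Tuple[Step, ...]    # behaviors that must have just occurred
    prediction: Step             # predicted next (cell, service)
    support: int


def predict_next(patterns: Sequence[CTMSP], cluster: int, interval: int,
                 recent: Sequence[Step]) -> Optional[Step]:
    """Apply the three filters, then pick the pattern with maximal support."""
    candidates: List[CTMSP] = []
    for p in patterns:
        n = len(p.premise)
        if p.cluster != cluster or p.interval != interval:
            continue                                   # strategies 1 and 2
        if n == 0 or n > len(recent):
            continue
        if tuple(recent[-n:]) == p.premise:            # strategy 3
            candidates.append(p)
    if not candidates:
        return None                                    # no pattern applies
    return max(candidates, key=lambda p: p.support).prediction
```

Returning `None` when no pattern passes all three filters is a design choice of this sketch; the source does not state what the system does in that case.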

This section is organized as follows. Section 1 gives the results for the dataset shown in TABLE 1. The results of LBS-Alignment are shown in subsection 1.1, and subsection 1.2 gives the results of the CAST algorithm. Subsection 1.3 gives the results of the time segmenting algorithm. Subsections 1.4, 1.5, and 1.6 give the results of the GA, of CTMSP mining, and of the prediction strategies, respectively.

1. Results for dataset shown in TABLE 1 (stepwise results)
The dataset in TABLE 1 is used for the first test of the algorithms, to check whether they give the correct results as provided in [1].

1.1 Results obtained for LBS Alignment Algorithm
LBS-Alignment is based on the consideration that two mobile transaction sequences are more similar when the orders and timestamps of their mobile transactions are more similar (refer to Section 3.1). Based on this concept, the time penalty (TP) and the service reward (SR) of LBS-Alignment are specifically designed. The base similarity score is set to 0.5. Two mobile transactions can be aligned if their locations are the same. Otherwise, a location penalty is applied to decrease their similarity score.

Input: Mobile transaction sequences from TABLE 1
Output: Similarity matrix

TABLE 5 shows the similarities between the 7 users. The first row and the first column show the user ids. From the second row and second column onward, each M[i,j] shows the similarity between the i-th and j-th users.

TABLE 5: Similarity Matrix for 7 users

1.2 Results of CAST algorithm
In the CAST algorithm, various threshold values (pi) ranging from 0 to 1 are applied (refer to Section 3.2). The main idea is to narrow down the range of pi effectively. Therefore, a total of four affinity threshold values are applied. The clustering results are shown in the figures below.

Input: N-by-N similarity matrix
Output: Clusters

1) For pi = 0.25, the obtained clusters are shown in Fig. 9. Two clusters are generated. The first cluster C1 contains four users with ids {1, 4, 2, 7} and the second cluster C2 contains three users with ids {3, 5, 6}.

Fig. 9: Clusters with pi=0.25

2) For pi = 0.50, the obtained clusters are shown in Fig. 10. Six clusters are generated. The first cluster C1 contains two users with ids {1, 4}, C2 contains the user with id {2}, C3 contains the user with id {3}, C4 contains the user with id {5}, C5 contains the user with id {6}, and C6 contains the user with id {7}.

Fig. 10: Clusters generated with pi=0.50

3) pi = 0.75 and pi = 1.0 give the same clusters, shown in Fig. 11. The number of clusters formed equals the number of users. As the similarity between any two users is not more than 0.75, every user forms its own cluster. The first cluster C1 contains the user with id {1}, C2 contains the user with id {2}, and so on up to C7, which contains the user with id {7}.

Fig. 11: Clusters generated with pi=0.75 and pi=1.0

In all four cases, clusters are obtained as the output of the CAST algorithm. The results depend on the value of the threshold. If the threshold is small, fewer clusters are formed. As the value of the threshold increases, more clusters are formed.

1.3 Results of Time Segmenting Algorithm
Input: Mobile transaction database and time length (refer to Section 3.3).
Output: Number of time segmenting points.

Fig. 12 shows the accumulative distributions for each location-service pair in the mobile transaction database of TABLE 1. There are 12 pairs of locations and services, and their time points with the largest change rates are 5, 10, 13, 30, 5, 28, 10, 7, 19, 30, 25, and 28. The same time points are shown in TABLE 6.

Fig. 12: Accumulative distributions [1]

TABLE 6: TIME POINTS WITH LARGEST CHANGE RATES
(Location, Service)    Maximum time change rate
(A, S1)                5
(D, S2)                10
(F, {S3, S4})          13
(K, S2)                30
(B, S1)                5
(D, S4)                28
(H, S3)                10
(C, S3)                7
(K, S5)                19
(A, S3)                30
(G, S1)                25
(F, S3)                28

These time points can be sorted as 5(2), 7(1), 10(2), 13(1), 20(1), 25(1), 28(2), and 30(2), where t(n) indicates that time point t occurs n times. The time points are arranged by frequency in TABLE 7.

TABLE 7: FREQUENCY OF TIME POINTS WITH LARGEST CHANGE RATE (columns: maximum time change rate, frequency)

The time point sequence whose frequency is equal to or greater than the average number of time points (refer to Section 3.3) is shown in TABLE 8. Here, the average number of time points is 2.

TABLE 8: TIME POINT SEQUENCE (columns: maximum time change rate, frequency)

Output: Time interval obtained = 1, between 10 and 28.

From the frequencies listed above, the time points with frequency 2 are 5, 10, 28, and 30. Their neighboring distances are 5, 18, and 2, with an average of about 8.33, and only the pair (10, 28) exceeds this average. This result gives the number of time slots in which users request services of the same type. In this example there are only 7 users and limited log entries, so only 1 time segmenting point is obtained.

1.4 Results of GA
Input: Number of time segmenting points and slot limits (refer to Section 3.3).
Output: Time segmenting points.

The frequency table obtained after selecting chromosomes is shown in Fig. 13.
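To make the fitness comparison between candidate segmenting points concrete, the following Python sketch computes one reading of the fitness function of Section 3.3: the population standard deviation of each interval's frequency table, averaged over all intervals. This is an assumption-laden illustration, not the authors' implementation; the helper names, the flat (time, cell, service) record layout, and the convention that a segmenting point p closes the interval ending at p (so {13} gives [1-13] and [14-32]) are hypothetical.

```python
from bisect import bisect_left
from math import sqrt
from typing import List, Sequence, Tuple

Record = Tuple[int, str, str]  # assumed layout: (time, cell, service)


def build_tables(records: Sequence[Record], points: Sequence[int],
                 cells: Sequence[str], services: Sequence[str]
                 ) -> List[List[List[int]]]:
    """One cell-by-service request-count table per time interval."""
    tables = [[[0] * len(services) for _ in cells]
              for _ in range(len(points) + 1)]
    for time, cell, service in records:
        i = bisect_left(sorted(points), time)   # index of interval T_i
        tables[i][cells.index(cell)][services.index(service)] += 1
    return tables


def fitness(tables: Sequence[Sequence[Sequence[int]]]) -> float:
    """Average over intervals of the std. deviation of each count table."""
    total = 0.0
    for table in tables:
        counts = [v for row in table for v in row]
        mean = sum(counts) / len(counts)
        variance = sum((v - mean) ** 2 for v in counts) / len(counts)
        total += sqrt(variance)
    return total / len(tables)
```

A GA loop would call `fitness(build_tables(records, chromosome, cells, services))` for each chromosome and keep the candidates with the higher values. The toy inputs used to exercise this sketch are not the TABLE 1 data, so they are not expected to reproduce the fitness values reported below.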

Fig. 13: Frequency tables of {13} and {20}. (a) T1: [1-13], T2: [14-32]. (b) T1: [1-20], T2: [21-32]

The fitness value obtained for {13} is 1.738. After repeated crossover, the best fitness is finally obtained for {20}.

1.5 Results of CTMSP Mining
Input: Frequent transaction table (refer to Section 3.4)
Output: CTMSP trees

The CTMSP trees generated step by step are shown in Fig. 14.

Fig. 14: Two CTMSP Tree

After the linear trimming technique is applied, the three-CTMSP tree is generated, as shown in Fig. 15.

Fig. 15: Final CTMSP Tree

1.6 Results of Prediction Strategies
The final output of the system is the prediction (refer to Section 3.5). Using the CTMSP tree, the framework gives the next probable location and the service that users may request. If the network provider asks for the next location of user 1, the system answers that the next location is A and the service is 1. In this way, predictions are obtained for all users.

V. CONCLUSION AND FUTURE WORK
In this paper, various algorithms for mining mobile transactions are proposed. The Smart CAST algorithm and the GA can be used for clustering and time segmentation, respectively. Along with these, the Cluster-Based Temporal Mobile Sequential Patterns mining technique can be used to predict the next probable location of a user. The performance of the algorithms will be tested using various evaluation measures. Further work can improve the time segmenting technique. The CTMSP mining method can also be applied to real datasets. In addition, CTMSP-Mine can be used with other applications, such as GPS navigation, with the aim of enhancing the precision of user behavior prediction.

REFERENCES
[1] Eric Hsueh-Chan Lu, Vincent S. Tseng, and Philip S. Yu, Mining Cluster-Based Temporal Mobile Sequential Patterns in Location-Based Service Environment, IEEE Trans. Knowledge and Data Eng., vol. 23, no. 6, June
[2] R. Agrawal, T. Imielinski, and A. Swami, Mining Association Rule between Sets of Items in Large Databases, Proc. ACM SIGMOD Conf. Management of Data, pp , May
[3] L. Chen, M. Tamer Özsu, and V. Oria, Robust and Fast Similarity Search for Moving Object Trajectories, Proc. ACM SIGMOD Conf. Management of Data, pp , June
[4] M.-S. Chen, J.-S. Park, and P.S. Yu, Efficient Data Mining for Path Traversal Patterns, IEEE Trans. Knowledge and Data Eng., vol. 10, no. 2, pp , Apr
[5] S.F. Altschul, W. Gish, W. Miller, E.W. Myers, and D.J. Lipman, Basic Local Alignment Search Tool, J. Molecular Biology, vol. 215, no. 3, pp , Oct
[6] L. Chen and R. Ng, On the Marriage of Lp-Norms and Edit Distance, Proc. 30th Int'l Conf. Very Large Databases, pp , Aug
[7] L. Chen, M. Tamer Özsu, and V. Oria, Robust and Fast Similarity Search for Moving Object Trajectories, Proc. ACM SIGMOD Conf. Management of Data, pp , June
[8] A. Ben-Dor and Z. Yakhini, Clustering Gene Expression Patterns, J. Computational Biology, vol. 6, no. 3, pp , July 1999.
[9] J. Han and M. Kamber, Data Mining: Concepts and Techniques, second ed., Morgan Kaufmann, Sept
[10] H. Jeung, Q. Liu, H.T. Shen, and X. Zhou, A Hybrid Prediction Model for Moving Objects, Proc. 24th Int'l Conf. Data Eng., pp , Apr
[11] L. Kaufman and P.J. Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, Mar
[12] S.C. Lee, J. Paik, J. Ok, I. Song, and U.M. Kim, Efficient Mining of User Behaviors by Temporal Mobile Access Patterns, Int'l J. Computer Science Security, vol. 7, no. 2, pp , Feb
[13] Y.B. Lin, GSM Network Signaling, ACM Mobile Computing and Comm., vol. 1, no. 2, pp , July
[14] A. Monreale, F. Pinelli, R. Trasarti, and F. Giannotti, WhereNext: A Location Predictor on Trajectory Pattern Mining, Proc. 15th Int'l Conf. Knowledge Discovery and Data Mining, pp , June 2009.
[15] W.C. Peng and M.S. Chen, Developing Data Allocation Schemes by Incremental Mining of User Moving Patterns in a Mobile Computing System, IEEE Trans. Knowledge and Data Eng., vol. 15, no. 1, pp , Feb
[16] Y.B. Lin, GSM Network Signaling, ACM Mobile Computing and Comm., vol. 1, no. 2, pp , July
[17] V.S. Tseng, H.C. Lu, and C.H. Huang, Mining Temporal Mobile Sequential Patterns in Location-Based Service Environments, Proc. 13th IEEE Int'l Conf. Parallel and Distributed Systems, pp. 1-8, Dec
[18] V.S. Tseng and C. Kao, Efficiently Mining Gene Expression Data via a Novel Parameterless Clustering Method, IEEE/ACM Trans. Computational Biology and Bioinformatics, vol. 2, no. 4, pp , Oct.-Dec
[19] J. Veijalainen, Transaction in Mobile Electronic Commerce, Proc. Eighth Int'l Workshop Foundations of Models and Languages for Data and Objects, pp , Sept
[20] V.S. Tseng and W.C. Lin, Mining Sequential Mobile Access Patterns Efficiently in Mobile Web Systems, Proc. 19th Int'l Conf. Advanced Information Networking and Applications, pp , Mar
[21] C.H. Yun and M.S. Chen, Mining Mobile Sequential Patterns in a Mobile Commerce Environment, IEEE Trans. Systems, Man, and Cybernetics, Part C, vol. 37, no. 2, pp , Mar
[22] Wen-Chih Peng and Ming-Syan Chen, Allocation of Shared Data based on Mobile User Movement, Proceedings of Third International Conference on Mobile Data Management, 2002


A New Technique to Optimize User s Browsing Session using Data Mining Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 3, March 2015,

More information

DISCOVERING ACTIVE AND PROFITABLE PATTERNS WITH RFM (RECENCY, FREQUENCY AND MONETARY) SEQUENTIAL PATTERN MINING A CONSTRAINT BASED APPROACH

DISCOVERING ACTIVE AND PROFITABLE PATTERNS WITH RFM (RECENCY, FREQUENCY AND MONETARY) SEQUENTIAL PATTERN MINING A CONSTRAINT BASED APPROACH International Journal of Information Technology and Knowledge Management January-June 2011, Volume 4, No. 1, pp. 27-32 DISCOVERING ACTIVE AND PROFITABLE PATTERNS WITH RFM (RECENCY, FREQUENCY AND MONETARY)

More information

An Application of Genetic Algorithm for Auto-body Panel Die-design Case Library Based on Grid

An Application of Genetic Algorithm for Auto-body Panel Die-design Case Library Based on Grid An Application of Genetic Algorithm for Auto-body Panel Die-design Case Library Based on Grid Demin Wang 2, Hong Zhu 1, and Xin Liu 2 1 College of Computer Science and Technology, Jilin University, Changchun

More information

Model for Load Balancing on Processors in Parallel Mining of Frequent Itemsets

Model for Load Balancing on Processors in Parallel Mining of Frequent Itemsets American Journal of Applied Sciences 2 (5): 926-931, 2005 ISSN 1546-9239 Science Publications, 2005 Model for Load Balancing on Processors in Parallel Mining of Frequent Itemsets 1 Ravindra Patel, 2 S.S.

More information

Analyzing Outlier Detection Techniques with Hybrid Method

Analyzing Outlier Detection Techniques with Hybrid Method Analyzing Outlier Detection Techniques with Hybrid Method Shruti Aggarwal Assistant Professor Department of Computer Science and Engineering Sri Guru Granth Sahib World University. (SGGSWU) Fatehgarh Sahib,

More information

Mining Fastest Path from Trajectories with Multiple. Destinations in Road Networks

Mining Fastest Path from Trajectories with Multiple. Destinations in Road Networks Mining Fastest Path from Trajectories with Multiple Destinations in Road Networks Eric Hsueh-Chan Lu 1, Wang-Chien Lee 2, and Vincent S. Tseng 1 1 Department of Computer Science and Information Engineering,

More information

Frequent Item Set using Apriori and Map Reduce algorithm: An Application in Inventory Management

Frequent Item Set using Apriori and Map Reduce algorithm: An Application in Inventory Management Frequent Item Set using Apriori and Map Reduce algorithm: An Application in Inventory Management Kranti Patil 1, Jayashree Fegade 2, Diksha Chiramade 3, Srujan Patil 4, Pradnya A. Vikhar 5 1,2,3,4,5 KCES

More information

COS 551: Introduction to Computational Molecular Biology Lecture: Oct 17, 2000 Lecturer: Mona Singh Scribe: Jacob Brenner 1. Database Searching

COS 551: Introduction to Computational Molecular Biology Lecture: Oct 17, 2000 Lecturer: Mona Singh Scribe: Jacob Brenner 1. Database Searching COS 551: Introduction to Computational Molecular Biology Lecture: Oct 17, 2000 Lecturer: Mona Singh Scribe: Jacob Brenner 1 Database Searching In database search, we typically have a large sequence database

More information

Web Data mining-a Research area in Web usage mining

Web Data mining-a Research area in Web usage mining IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661, p- ISSN: 2278-8727Volume 13, Issue 1 (Jul. - Aug. 2013), PP 22-26 Web Data mining-a Research area in Web usage mining 1 V.S.Thiyagarajan,

More information