Prediction of user navigation patterns by mining the temporal web usage evolution


Soft Comput (2008) 12:
DOI 10.1007/s y

FOCUS

Vincent S. Tseng, Kawuu Weicheng Lin, Jeng-Chuan Chang

Published online: 23 May 2007
Springer-Verlag 2007

Abstract Advances in data mining technologies have enabled intelligent Web capabilities in various applications by utilizing the hidden user behavior patterns discovered from Web logs. Intelligent methods for discovering and predicting users' patterns are important in supporting intelligent Web applications like personalized services. Although numerous studies have been done on Web usage mining, few of them consider the temporal evolution characteristic in discovering Web users' patterns. In this paper, we propose a novel data mining algorithm named Temporal N-Gram (TN-Gram) for constructing prediction models of Web user navigation by considering the temporality property in Web usage evolution. Moreover, three kinds of new measures are proposed for evaluating the temporal evolution of navigation patterns under different time periods. Through experimental evaluation on both real-life and simulated datasets, the proposed TN-Gram model is shown to outperform other approaches like N-gram modeling in terms of prediction precision, in particular when the Web users' navigating behavior changes significantly with temporal evolution.

Keywords Temporal patterns, Navigation patterns, Data mining, Personalized services

V. S. Tseng (corresponding author), K. W. Lin, J.-C. Chang
Institute of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan, ROC
V. S. Tseng e-mail: tsengsm@mail.ncku.edu.tw
K. W. Lin e-mail: linwc@idb.csie.ncku.edu.tw
J.-C. Chang e-mail: invers@idb.csie.ncku.edu.tw

1 Introduction

Advances in data mining technologies have enabled intelligent Web capabilities in various applications such as page recommendation, page prefetching and personalized navigation by utilizing the hidden user behavior patterns discovered from Web logs (Borges and Levene 1999; Nanopoulos et al. 2003; Padmanabhan and Mogul 1996). These behavior patterns contain much useful information because they directly reflect how users use a Web site, and thus form the basis of intelligent Web development. However, discovering the patterns from the large amount of Web logs is challenging, and it has recently become an important research topic in data mining, namely Web usage mining.

In research on Web mining, numerous studies have addressed the discovery of users' behavior patterns from various aspects. Tan and Kumar (2002) apply association rules to the discovery of associated page-views. An intuitive application, for example, is using the discovered associated pages to improve the Web site structure. For the linking characteristic of Web sites, several studies (Tan et al. 2000) discussed the indirect association relation. The sequential pattern (Agrawal and Srikant 1995), which reveals the sequential page-views of users, was a widely discussed topic. Moreover, some studies focused on developing clustering methods (Frias-Martinez and Karamcheti 2002; Wang and Zaiane 2002) to cluster users with similar behavior or to cluster Web pages.

Most past studies assumed that Web usage patterns are invariant with time (Frias-Martinez and Karamcheti 2002; Gündüz and Özsu 2003a, b; Pitkow and Pirolli 1999; Srivastava et al. 2000; Su et al. 2000; Tan and Kumar 2002), and few of them took into account the temporal characteristic or temporal evolution of Web usage. In fact, users' Web usage patterns may change with time, i.e., a Web visitor may have different behavior on the same Web site at different times. For instance, students may usually search and browse professional literature in the daytime and visit auction Web sites at night. Obviously, the temporal evolution of Web usage patterns is an important feature that should be considered in order to construct effective prediction models for user navigation patterns.

In this paper, we propose a novel method named Temporal N-Gram (TN-Gram) for constructing prediction models of Web user navigation by considering the important factor of temporal evolution of users' navigation patterns. Moreover, three kinds of measures, namely Support-based Fundamental Rule Changes, Confidence-based Fundamental Rule Changes and Prediction Rule Changes, are proposed for evaluating the temporal evolution of navigation patterns under different time periods by utilizing the concept of fundamental rule changes proposed in Liu et al. (2001). Through experimental evaluation on both real-life and simulated datasets, our method is shown to outperform other existing algorithms like N-gram modeling in terms of prediction precision when users' navigating behavior changes with time.

The rest of this paper is organized as follows. Section 2 briefly reviews the research motivation and the related work on prediction methods in Web usage mining. Section 3 presents the proposed prediction model. Section 4 describes three methods for measuring the change of temporality of the Web log. In Sect. 5, we give the experimental results. Finally, we conclude our work in Sect. 6.

2 Motivation and related work

Three main steps in constructing a prediction model for Web usage are presented in Srivastava et al. (2000). The first step is data preprocessing, which cleans the dataset and converts the Web log into a session file that contains a click-stream of page-views for each visitor. The second step is pattern discovery. A popular approach is the N-gram model (Su et al. 2000), which discovers user navigation patterns that hold the order, adjacency and recency information. The third step is pattern analysis, which predicts the user's next request.

Yang et al. (2004) systematically present different methods for building association rule-based prediction models from Web logs. They focus on three features of association rules, namely the order, adjacency, and recency. According to these features, five types of association rules are considered, namely subset rules, subsequence rules, latest subsequence rules, substring rules and latest substring rules, as shown in Table 1. The motivation of this research is to take into account the temporality, which is ignored in previous studies.

Table 1 Comparisons of different prediction models

Model               Order  Adjacency  Recency  Temporality
Subset              -      -          -        -
Subsequence         Y      -          -        -
Latest subsequence  Y      -          Y        -
Substring           Y      Y          -        -
Latest substring    Y      Y          Y        -
Our work            Y      Y          Y        Y

Some studies have also considered other features of rules. Frias-Martinez and Karamcheti (2002) propose a prediction model for user access sequences, called sequential association rules, for capturing the sequential properties and temporality of the visited Web pages. Note that the temporality discussed in Frias-Martinez and Karamcheti (2002) is different from the one we target in this work. The temporality defined in Frias-Martinez and Karamcheti (2002) is the time distance between the antecedent and the consequent pages, while our temporality indicates the starting access time of a user session. Gündüz and Özsu (2003a, b) also consider the feature of time, but there the feature denotes the time spent on the sequences of visited pages. Since the start time of a session is the access time of the first request, it is interesting to consider the time of the latest request. Zukerman et al. propose the Time and Second-order Time Markov models that consider temporal information (Nicholson et al. 1998).
The Time Markov model depends on the latest request, while the Second-order Time Markov model focuses on the latest request and the referring request before it. The Markov model is a well-known model for stochastic processes (Papoulis 1991) and has become a well-suited approach for modeling and predicting users' behavior in Web mining. A kth-order Markov model for navigation patterns is equivalent by definition to a k-gram model. Nanopoulos et al. (2003) collect four traditional models, namely Dependency Graph (DG) (Padmanabhan and Mogul 1996), m-order Prediction-by-Partial-Match (PPM) (Palpanas and Mendelzon 1999), Markov models, and Markov models for ordering (WM_o), where WM_o is a generalization of DG and first-order PPM. In addition to various Markov models, the longest repeating subsequences model was proposed for prediction in Pitkow and Pirolli (1999).

However, none of the existing studies took into consideration the factor of temporal evolution of Web usage patterns in constructing the prediction models. This motivates our research, which aims to fill this gap in the literature.

3 Proposed prediction model

The proposed model for predicting temporal navigation patterns consists of two main components, namely the calendar schema for representing the time dimension and the Temporal N-gram model for predicting users' temporal navigation patterns. In the following, we describe the calendar schema and the Temporal N-gram model, respectively.

As discussed in previous sections, the existing studies do not consider the temporality factor in constructing user navigation models. To support the dimension of temporality for navigation patterns, a calendar schema is needed. Here, we use the calendar schema for rule patterns proposed by Li et al. (2002) as the base. Specifically, a calendar schema C is defined as follows:

\[ C = (G_n : D_n,\ G_{n-1} : D_{n-1},\ \ldots,\ G_1 : D_1) \tag{1} \]

G_i is a granularity name like year or month, and D_i is a finite subset of the valid values corresponding to G_i. For example, if we want to investigate the temporal navigation patterns in units of one hour, we may design a calendar schema of the form (hour: 0), (hour: 1), ..., (hour: 23) with 24 segments.

Before building the N-gram model for pattern discovery, it is necessary to perform data preprocessing, i.e., converting a Web log into a session file. In our work, we eliminate multimedia files like gif or jpg and script files like js, and a session is considered an abnormal access pattern if its length L is such that L < 3 or L > (5 L_avg), where L_avg is the average session length. An abnormal access pattern might result from a Web spider or a visitor's incautious access through a search engine.

For the modeling and prediction of temporal navigation patterns, we propose a new approach named Temporal N-Gram (abbreviated as TN-Gram). The proposed approach is based on the N-gram model (Su et al. 2000), and the main difference is that we utilize the calendar schema to determine the temporality of a session. Moreover, all i-gram models, which are also known as all-Kth-order Markov models, are discovered by TN-Gram.
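The calendar schema and the session-cleaning rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `hourly_segment`, `strip_media_requests` and `drop_abnormal_sessions`, the representation of a session as a list of URL strings, and the choice to compute the average length over the sessions after media and script requests are removed are all hypothetical.

```python
from datetime import datetime
from statistics import mean

# File types removed during preprocessing (gif, jpg and js are named in the text).
FILTERED_EXTENSIONS = (".gif", ".jpg", ".js")


def hourly_segment(session_start: datetime) -> int:
    """Map a session start time to one of the 24 hourly calendar segments."""
    return session_start.hour


def strip_media_requests(pages):
    """Drop requests for multimedia and script files from one session."""
    return [p for p in pages if not p.lower().endswith(FILTERED_EXTENSIONS)]


def drop_abnormal_sessions(sessions, min_len=3, factor=5):
    """Keep sessions whose length L satisfies min_len <= L <= factor * L_avg.

    Assumption: L_avg is the mean length over the sessions passed in.
    """
    if not sessions:
        return []
    avg_len = mean(len(s) for s in sessions)
    return [s for s in sessions if min_len <= len(s) <= factor * avg_len]


# Hypothetical toy sessions, used only to exercise the functions.
raw_sessions = [
    ["/index.html", "/logo.gif", "/news.html", "/contact.html"],
    ["/index.html", "/menu.js"],
    ["/index.html", "/a.html", "/b.html", "/c.html"],
]
cleaned = drop_abnormal_sessions([strip_media_requests(s) for s in raw_sessions])
```

In this toy input the second session shrinks to a single page after filtering and is therefore discarded as abnormal, while the other two are kept.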
In the following, we list the two algorithms, namely TN-Gram_Building and TN-Gram_Predict, for building the models and performing predictions, respectively. Here, a hash table H[] is produced for each segment of the calendar schema (i.e., for each value of the granularity). For example, given a calendar schema with each hour as the granularity, twenty-four hash tables H[] will be produced. Besides, the term C(S) indicates the calendar schema segment that matches the starting time of the session S. After the TN-Gram model is built, we use a support-based mechanism to prune the mined patterns with low support.

Algorithm TN-Gram_Building
Input:  L    // the session file
        C    // the calendar schema
Output: H_1[], H_2[], ..., H_n[] where n = |C|
        // H_i[] is the result of the n-gram model corresponding to C[i]
Begin
    T[] := empty      // a hash table for counting
    H[] := empty      // the result table
    Max[] := empty    // records the maximum count
    For i := 1 to |L| Do
        S := L[i]
        For j := 1 to |S| Do
            P := substring(S, j, n)        // P is the antecedent
            R := substring(S, j + n, 1)    // R is the next click
            T_k[P, R] := T_k[P, R] + 1, where C(S) matches C_k
            If T_k[P, R] > Max_k[P] Then
                Max_k[P] := T_k[P, R]
                H_k[P] := R
            End If
        End For
    End For
    Return H[]
End

The following lists the algorithm for predicting a user's navigation pattern. The prediction algorithm takes as input the built TN-Gram model and the user's active session, and it returns the page the user is predicted to navigate to in the next action.

Algorithm TN-Gram_Predict
Input:  H[]    // generated by Algorithm TN-Gram_Building
        S      // the user's active session
Output: R      // the predicted item (or Web page)
Begin
    For i := |S| downto 1 Do
        If S is an index in hash table H_C(S) Then
            R := H_C(S)[S]
            Return R
        End If
        S := S after removing the first element
    End For
    Return "No Matched Prediction"
End

4 Detection for changes of temporality

In this research, we also propose new methods for discovering the evolution of navigation patterns. Since our prediction model is based on N-gram, we detect the evolution of the mined navigation models.
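Looking back at the TN-Gram_Building and TN-Gram_Predict algorithms of Sect. 3, the following Python sketch shows one way they might be realized. It is an illustration under assumptions rather than the authors' code: sessions are assumed to be (start time, page list) pairs, `three_part_day` is a hypothetical calendar function playing the role of C(S), one dictionary keyed by (segment, antecedent) stands in for the per-segment hash tables, support pruning is simplified to a raw count threshold `min_count`, and ties between equally frequent next pages are broken by first occurrence.

```python
from collections import defaultdict
from datetime import datetime


def three_part_day(session_start):
    """Hypothetical calendar function C(S): hours 0-7, 8-15, 16-23 -> 0, 1, 2."""
    return session_start.hour // 8


def build_tn_gram(sessions, calendar, max_n, min_count=1):
    """Build a next-page table per calendar segment for all 1..max_n grams.

    sessions: iterable of (start_time, [page, ...]) pairs (assumed layout).
    Returns {(segment, antecedent_tuple): most frequent next page}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for start, pages in sessions:
        segment = calendar(start)
        for n in range(1, max_n + 1):
            for j in range(len(pages) - n):
                antecedent = tuple(pages[j:j + n])
                counts[(segment, antecedent)][pages[j + n]] += 1
    model = {}
    for key, next_counts in counts.items():
        best_page = max(next_counts, key=next_counts.get)
        # Support-based pruning, simplified here to a raw count threshold.
        if next_counts[best_page] >= min_count:
            model[key] = best_page
    return model


def predict_next(model, calendar, session_start, active_pages, max_n):
    """Try the longest suffix of the active session first, then shorter ones."""
    segment = calendar(session_start)
    suffix = tuple(active_pages[-max_n:])
    while suffix:
        if (segment, suffix) in model:
            return model[(segment, suffix)]
        suffix = suffix[1:]  # remove the first (oldest) element
    return None  # no matched prediction


# Hypothetical toy data: the same antecedent leads to different pages by time of day.
day = datetime(1995, 8, 28, 9, 0)
night = datetime(1995, 8, 28, 22, 0)
sessions = [
    (day, ["A", "B", "C"]),
    (day, ["A", "B", "C"]),
    (day, ["A", "B", "D"]),
    (night, ["A", "B", "D"]),
    (night, ["A", "B", "D"]),
]
model = build_tn_gram(sessions, three_part_day, max_n=2)
```

With this toy data, the active session A, B is followed by C in the daytime segment and by D in the night segment, which is the kind of difference a model without a calendar schema would average away. The sketch only looks up suffixes of at most `max_n` pages, since longer antecedents cannot occur in the model.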
Considering the nature of temporality, we separate the original session file by calendar schemas. To find the evolutional changes, we perform the following two basic steps:

1. Rule generation. We first mine navigation rules in each sub-dataset that is separated from the original session file according to the calendar schema. We define R, the set of navigation rules, as given in Liu et al. (2001):

\[ R = \{\, r \mid r \in (R_1 \cup R_2) \,\} \tag{2} \]

Each rule in R is associated with a support and a confidence value. For different datasets under varied time periods, different rule sets will be discovered.

2. Identification of rule changes. After R is generated, we can identify the changes of rules between rule sets under different time periods. The two fundamental rule changes for support and confidence addressed in Liu et al. (2001) are used as our base. To address the precision of the prediction, we propose another measure to evaluate the prediction rule changes.

4.1 Support-based fundamental rule changes

Since the support is the ratio of the pattern consisting of the antecedent and the consequence in all sessions, we use the same formula to calculate the expected support of navigation rules as for association rules. Consider the rule AB → y, where both A and B are m-grams divided from the antecedent, and y is the consequence serving as the predictor. Let ABy be the pattern. We can intuitively assume that the support of ABy increases with the increase in support of both AB and By. The expected support is calculated by the following formulas, where t_1 and t_2 denote the two time periods:

\[ E_{r_{AB}}\big(\mathrm{sup}_{t_2}(r)\big) = \min\left( \frac{\mathrm{sup}_{t_1}(r)}{\mathrm{sup}_{t_1}(r_{AB})}\,\mathrm{sup}_{t_2}(r_{AB}),\ 1 \right) \tag{3} \]

\[ E_{r_{By}}\big(\mathrm{sup}_{t_2}(r)\big) = \min\left( \frac{\mathrm{sup}_{t_1}(r)}{\mathrm{sup}_{t_1}(r_{By})}\,\mathrm{sup}_{t_2}(r_{By}),\ 1 \right) \tag{4} \]

4.2 Confidence-based fundamental rule changes

For navigation rules, continuity is an important element although it is not considered in association rules. Hence, we have to modify the definition of expected confidences. As described above, considering the rule AB → y, we may investigate the relationship among A → B, B → y, and AB → y. Here, A → B means that A appears in front of B. Our assumption is that the confidence of AB → y increases if the confidence of B → y increases or the confidence of A → B decreases. It is proven as follows:

\[ \mathrm{conf}(AB \to y) = \frac{\mathrm{sup}(ABy)}{\mathrm{sup}(AB)} = \frac{\mathrm{sup}(ABy)}{\mathrm{sup}(B)\,\mathrm{conf}(A \to B)} = \frac{\mathrm{sup}(ABy)}{\mathrm{sup}(By)} \cdot \frac{\mathrm{conf}(B \to y)}{\mathrm{conf}(A \to B)} \tag{5} \]

We have the result:

\[ \mathrm{conf}(AB \to y) \propto \frac{\mathrm{conf}(B \to y)}{\mathrm{conf}(A \to B)} \tag{6} \]

According to (6), the expected confidences are computed as follows:

\[ E_{r_{A \to B}}\big(\mathrm{conf}_{t_2}(r)\big) = \min\left( \mathrm{conf}_{t_1}(r)\,\frac{\mathrm{conf}_{t_1}(r_{A \to B})}{\mathrm{conf}_{t_2}(r_{A \to B})},\ 1 \right) \tag{7} \]

\[ E_{r_{B \to y}}\big(\mathrm{conf}_{t_2}(r)\big) = \min\left( \frac{\mathrm{conf}_{t_1}(r)}{\mathrm{conf}_{t_1}(r_{B \to y})}\,\mathrm{conf}_{t_2}(r_{B \to y}),\ 1 \right) \tag{8} \]

4.3 Changes of prediction rule

For the two kinds of rule changes described above, a change is fundamental if it cannot be explained by other changes.
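The expected-value formulas of Sects. 4.1 and 4.2 can be made concrete with a short Python sketch. It assumes the following reading of those formulas: the expected period-2 value of a rule is its period-1 value scaled by the relative change of one of its sub-patterns, capped at 1, and for the A → B sub-rule the scaling is inverted because the rule's confidence is taken to fall when the confidence of A → B rises. The function and parameter names and all numeric inputs are hypothetical.

```python
def expected_support(sup_t1_rule, sup_t1_part, sup_t2_part):
    """Expected period-2 support of a rule, given how one sub-pattern
    (AB or By) changed from period 1 to period 2; capped at 1."""
    return min(sup_t1_rule / sup_t1_part * sup_t2_part, 1.0)


def expected_conf_given_ab(conf_t1_rule, conf_t1_ab, conf_t2_ab):
    """Expected period-2 confidence of AB -> y from the change in A -> B.

    The ratio is inverted: a higher confidence of A -> B in period 2
    lowers the expected confidence of the rule.
    """
    return min(conf_t1_rule * conf_t1_ab / conf_t2_ab, 1.0)


def expected_conf_given_by(conf_t1_rule, conf_t1_by, conf_t2_by):
    """Expected period-2 confidence of AB -> y from the change in B -> y."""
    return min(conf_t1_rule / conf_t1_by * conf_t2_by, 1.0)


# Hypothetical values: the sub-pattern support grows from 0.20 to 0.30,
# so a rule with support 0.10 is expected to reach about 0.15.
exp_sup = expected_support(0.10, 0.20, 0.30)

# Hypothetical values: the confidence of A -> B halves from 0.50 to 0.25,
# so a rule confidence of 0.40 is expected to double to about 0.80.
exp_conf = expected_conf_given_ab(0.40, 0.50, 0.25)
```

Comparing such expected values with the support or confidence actually observed in the second period is what allows a change to be attributed to a sub-pattern or flagged as not explained by it.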
Even though we can determine the changes of rules between two time periods, we cannot conclude that they will influence the precision of prediction. For example, assume that we have a navigation rule a, b → y in one sub-dataset (time period 1) with support and confidence of 1 and 2%, respectively. For the same rule, assume that the support and confidence are 1 and 7%, respectively, in another sub-dataset (time period 2). In this case, we shall consider the changes significant. Notice that the fundamental rule changes significantly in both support and confidence, but the precision of prediction is not reduced if there exists no other rule with the same antecedent and a higher confidence. The key point here is that we look at the difference of the subsequence instead of the support or confidence of the rules. In rule generation, we replace R by P as

\[ P = \{\, p \mid p \in (P_1 \cup P_2) \,\} \tag{9} \]

Here, P denotes the set of prediction rules, where a prediction rule is the navigation rule with the highest confidence for one antecedent. Therefore, unlike in R, each antecedent corresponds to exactly one subsequence in P. If the subsequences of the prediction rule in both sub-datasets are the same, the prediction rule is not a change. Otherwise, we determine the change with a Chi-square test as follows. Consider a prediction rule in time period 1, A → x with confidence a_1%, and the other rule in time period 2, A → y with confidence b_2%. Meanwhile, consider the confidence of A → x in time period 2 and the confidence of A → y in time period 1. The former value is a_2% and the latter value is b_1%. Since both are prediction rules in their own periods, we have a_1 > b_1 and b_2 > a_2, as shown in Table 2. We utilize the Chi-square test to test the homogeneity of (a_1, b_1) and (a_2, b_2). If the null hypothesis is rejected, the difference is crucial for the request following A in both time periods. For this reason, we regard the rule as a change only when the homogeneity of either (a_1, b_1) or (a_2, b_2) is rejected.

Table 2 Example for changes of prediction rule

Confidence  Time period 1 (%)  Time period 2 (%)
A → x       a_1                a_2
A → y       b_1                b_2

5 Experimental evaluation

In our work, given an active session, the prediction model predicts one page as the next request of a user. We define two measures to evaluate the prediction model, namely precision and recall. The precision is defined as the ratio of requests predicted correctly to all recommendation requests. To determine the practicality, we use the recall measure as defined in Su et al. (2000) as the evaluation measure. Our definition of recall is the same as the applicability, which is the ratio of predicted requests to all actual requests. In the rest of this section, we describe the two real datasets and a simulated dataset, and then present the experimental results.

5.1 Datasets

The first Web log we tested is the NASA log from the NASA Kennedy Space Center server in Florida. It contains 1,569,898 requests and 72,198 IP addresses aggregated as 51,132 sessions involving 4,737 pages. The second log is the ClarkNet log. ClarkNet is a commercial Internet site provider for the Metro Baltimore-Washington DC area. The log contains 1,654,882 requests and 85,137 IP addresses from August 28th, 1995 to September 3rd. The session file contains 3,78 sessions and 13,988 pages. For these two logs, we use 3 min of access interval as the threshold to identify a session.

The third log we tested is simulated data. We assume a complete tree as the Web structure, with five branches for each node and a tree depth of seven. Considering the backtracking problem, we employ an exponential distribution on the depth for the probability of backtracking. After the tree is constructed, we set the property of each node as one of normal node, temporal node, or strong node. A normal node is a random node such that a user visiting it will also visit its children randomly. A temporal node is a node that a user visits according to the temporality we define. A strong node carries a strong temporal property, so that a user will visit some particular child with high probability, reflecting the temporal behavior.
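The two evaluation measures defined at the start of this section can be sketched in Python as follows. This is a minimal illustration under assumptions: the function name `precision_and_recall` is hypothetical, a prediction of `None` is assumed to mean that the model returned no matched prediction, and the toy inputs are made up.

```python
def precision_and_recall(predicted, actual):
    """Compute the two measures used in this section.

    predicted: list with one entry per actual request; None means that
               no prediction was made for that request (assumed encoding).
    actual:    list of the pages that were really requested next.
    precision = correct predictions / predictions made
    recall    = predictions made / all actual requests (applicability)
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    made = [(p, a) for p, a in zip(predicted, actual) if p is not None]
    correct = sum(1 for p, a in made if p == a)
    precision = correct / len(made) if made else 0.0
    recall = len(made) / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical example: four requests, three predictions, two of them right.
precision, recall = precision_and_recall(
    ["B", None, "C", "D"],
    ["B", "X", "C", "E"],
)
```

Under these definitions a model that answers rarely but correctly scores high precision and low recall, which is why both measures are reported side by side.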
Note that we construct the simulated user navigation model based on the node property. The simulated data contains 22,32 sessions with 19,531 pages. For all tested log data, a session file is divided into two parts for the experiments, namely the training part with 8% of the whole session file chosen randomly, and the testing part for the rest.

5.2 Experimental results

We define two simple calendar schemas in our experiments, named Weekdays and Hours. The Weekdays calendar sets each weekday as the unit of the schema, while Hours defines the schema (hour: 0-7), (hour: 8-15), (hour: 16-23) for three divisions of one day. Moreover, we use the term All to indicate the case in which no calendar schema is used (i.e., the traditional N-gram model). We denote Fundamental rule Changes by FC and Prediction rule Changes by PC.

Fig. 1 Changes on Hours in NASA and ClarkNet logs (y-axis: changes; x-axis: transitions between the Hours slots [0~7]-[8~15], [8~15]-[16~23], [16~23]-[0~7]; series: FC (conf), FC (sup) and PC for each log)

Fig. 2 Changes on Weekdays in NASA and ClarkNet logs (y-axis: changes; x-axis: Sun.-Mon., Mon.-Tue., Tue.-Wed., Wed.-Thu., Thu.-Fri., Fri.-Sat., Sat.-Sun.; series: FC (confidence), FC (support) and PC for each log)

Figures 1 and 2 show the results of changes for the NASA data (blue line) and the ClarkNet data (red line). Although the fundamental changes show the changes between each time slot in confidence and support, we focus on the prediction rule changes in the following discussions. Note that the prediction rule changes of Hours (solid line) are stable for both datasets. However, the changes are more unstable for Weekdays. In addition, the changes of Weekdays in the ClarkNet data are more distinct than in the NASA data. Therefore, it is interesting to note that the precision of Weekdays increases more in the ClarkNet data than in the NASA data compared to the All model. We show the precision for both datasets in Figs. 3 and 4. In Fig. 3, the largest gap among the three lines is 1.5%, while it becomes 4% as shown in Fig. 4 with confidence < 0.5. However, the precision of Weekdays is less than the others when the confidence > 0.5, as shown in the figure.

Fig. 3 The precision on the NASA log (series: All, Hours, Weekdays)

Fig. 4 The precision on the ClarkNet log (series: All, Hours, Weekdays)

This is because the number of rules with high confidence in Weekdays is more than in the All case. In other words, the Weekdays model can capture the inherent property that is implicit when the temporality is ignored. The recall clarifies this feature, as illustrated in Fig. 5.

Although we have investigated the precision for all page-views, some page-views may lack the property of temporality. For example, users may have temporal behavior when they visit a general homepage for various products. However, this may not be the case when users visit a detail page-view of particular products. Hence, we are interested in the precision of the page-views with prediction rule changes. Figure 6 shows the precision, and it is observed that the precision is more distinct than that for all page-views.

Finally, we simulate a special dataset with an evident Hours property in order to test different kinds of temporal properties. Figure 7 shows the precision on the simulated dataset under different settings of confidence. Although the simulated dataset carries the Hours property, it is not clear from Fig. 7 whether the Hours model is a good model. This is because the page-views with the Hours property take up only two percent of the total data in our simulated data. However, as shown in Fig. 8, the proposed TN-Gram model substantially outperforms the traditional N-gram model in terms of precision if we consider only the temporal page-views. The experimental results show that the average value of prediction rule changes is 4 and 48% for Weekdays and Hours, respectively (the figures are not shown here due to space limitation).

Fig. 5 The recall on the ClarkNet log (series: All, Hours, Weekdays)

Fig. 6 The precision of the temporal model and All (series: All, Temporal Model; groups: NASA-Weekdays, NASA-Hours, Clarknet-Weekdays, Clarknet-Hours)

Fig. 7 The precision on simulated dataset by varying confidence (series: all, 3-interval, weekdays)

Fig. 8 The precision on simulated data for different models (series: All, Temporal Model; groups: Sim.-Weekdays, Sim.-Hours)

6 Conclusions

Our work aims at exploring the temporality property for identifying the time periods in which users' navigation patterns change significantly, so as to improve the prediction. This can provide useful insight for intelligent Web sites in strategy planning such as personalized services and marketing promotion. In this paper, we have proposed a novel method named Temporal N-Gram (TN-Gram) for constructing prediction models of Web user navigation. After the prediction model is constructed, three kinds of new measures, namely Support-based Fundamental Rule Changes, Confidence-based Fundamental Rule Changes, and Changes of Prediction Rules, are used to evaluate the temporal evolution of navigation patterns. For empirical evaluation, we adopted two real datasets and also designed a simulator to generate a dataset that carries the temporal navigation characteristics of users. Through experimental evaluation on both the real-life and simulated datasets, the proposed TN-Gram method is shown to outperform other existing approaches like N-gram modeling in terms of prediction precision.

For future work, we will apply the TN-Gram model to different kinds of Web sites, like popular auction sites, so as to evaluate its performance and effectiveness in more detail. Moreover, we will also consider the user group issue and integrate it with TN-Gram to discover more interesting patterns. Besides, since the discovered temporal evolution can be exploited in a wide range of applications, we will apply the TN-Gram method to applications like personalized services, with the aim of enhancing the richness and quality of applications in Web systems.

Acknowledgments This research was supported by Ministry of Economic Affairs, Taiwan, ROC, under grant no. 93-EC-17-A , and by National Science Council, Taiwan, ROC, under grant no. NSC H

References

Agrawal R, Srikant R (1995) Mining sequential patterns. In: Proceedings of the international conference on data engineering (ICDE), Taipei, Taiwan, March 1995
Borges J, Levene M (1999) Data mining of user navigation patterns. In: Proceedings of the workshop on web usage analysis and user profiling (WEBKDD 99), San Diego, CA, August 15, 1999, pp 31-36

Frias-Martinez E, Karamcheti V (2002) A prediction model for user access sequences. In: Proceedings of the WEBKDD workshop: web mining for usage patterns and user profiles, ACM SIGKDD international conference on knowledge discovery and data mining, July 2002
Gündüz Ş, Özsu MT (2003a) A user interest model for web page navigation. In: Proceedings of international workshop on data mining for actionable knowledge (DMAK), Seoul, Korea, April 2003
Gündüz Ş, Özsu MT (2003b) A web page prediction model based on click-stream tree representation of user behavior. In: Proceedings of the ninth ACM international conference on knowledge discovery and data mining (KDD), Washington, DC, August 2003
Li Y, Ning P, Wang XS, Jajodia S (2002) Discovering calendar-based temporal association rules. J Data Knowl Eng (DKE) 44(2)
Liu B, Hsu W, Ma Y (2001) Discovering the set of fundamental rule changes. In: Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining (KDD-2001), San Francisco, CA, August 20-23, 2001
Nanopoulos A, Katsaros D, Manolopoulos Y (2003) A data mining algorithm for generalized web prefetching. IEEE Transactions on Knowledge and Data Engineering
Nicholson E, Zukerman I, Albrecht DW (1998) A decision-theoretic approach for pre-sending information on the WWW. In: Proceedings of the fifth Pacific Rim international conference on artificial intelligence, 1998
Padmanabhan V, Mogul J (1996) Using predictive prefetching to improve world wide web latency. ACM SIGCOMM Computer Comm Rev 26(3)
Palpanas T, Mendelzon A (1999) Web prefetching using partial match prediction. In: Proceedings of the fourth web caching workshop (WCW 99), March 1999
Papoulis A (1991) Probability, random variables, and stochastic processes. McGraw Hill, New York
Pitkow J, Pirolli P (1999) Mining longest repeating subsequences to predict world wide web surfing. In: Proceedings of the USENIX symposium on Internet technologies and systems (USITS 99), October 1999
Srivastava J, Cooley R, Deshpande M, Tan P (2000) Web usage mining: discovery and applications of usage patterns from web data. In: SIGKDD Explorations, ACM SIGKDD, January 2000
Su Z, Yang Q, Lu Y, Zhang H (2000) Whatnext: a prediction system for web requests using n-gram sequence models. In: Proceedings of the first international conference on web information systems and engineering conference, Hong Kong, June 2000, pp 2 27
Tan P, Kumar V (2002) Mining association patterns in web usage data. In: Proceedings of the international conference on advances in infrastructure for e-business, e-education, e-science, and e-medicine on the Internet
Tan P, Kumar V, Srivastava J (2000) Indirect association: mining higher order dependencies. In: Proceedings of the fourth European conference on principles and practice of knowledge discovery in databases, Lyon, France
Wang W, Zaiane OR (2002) Clustering web sessions by sequence alignment. In: Proceedings of the third international workshop on management of information on the web in conjunction with 13th international conference on database and expert systems applications DEXA 2002, Aix en Provence, France, September 2-6
Yang Q, Li T, Wang K (2004) Building association rule based sequential classifiers for web document prediction. J Data Min Knowl Discov 8(3)
Zukerman I, Albrecht DW, Nicholson AE (1999) Predicting user's request on the WWW. In: Proceedings of the seventh international conference on user modeling, 1999

WEB-LOG CLEANING FOR CONSTRUCTING SEQUENTIAL CLASSIFIERS

WEB-LOG CLEANING FOR CONSTRUCTING SEQUENTIAL CLASSIFIERS Applied Artificial Intelligence, 17:431 441, 2003 Copyright # 2003 Taylor & Francis 0883-9514/03 $12.00 +.00 DOI: 10.1080/08839510390219291 u WEB-LOG CLEANING FOR CONSTRUCTING SEQUENTIAL CLASSIFIERS QIANG

More information

SEQUENTIAL PATTERN MINING FROM WEB LOG DATA

SEQUENTIAL PATTERN MINING FROM WEB LOG DATA SEQUENTIAL PATTERN MINING FROM WEB LOG DATA Rajashree Shettar 1 1 Associate Professor, Department of Computer Science, R. V College of Engineering, Karnataka, India, rajashreeshettar@rvce.edu.in Abstract

More information

Web page recommendation using a stochastic process model

Web page recommendation using a stochastic process model Data Mining VII: Data, Text and Web Mining and their Business Applications 233 Web page recommendation using a stochastic process model B. J. Park 1, W. Choi 1 & S. H. Noh 2 1 Computer Science Department,

More information

Probability Measure of Navigation pattern predition using Poisson Distribution Analysis

Probability Measure of Navigation pattern predition using Poisson Distribution Analysis Probability Measure of Navigation pattern predition using Poisson Distribution Analysis Dr.V.Valli Mayil Director/MCA Vivekanandha Institute of Information and Management Studies Tiruchengode Ms. R. Rooba,

More information

Merging Data Mining Techniques for Web Page Access Prediction: Integrating Markov Model with Clustering

Merging Data Mining Techniques for Web Page Access Prediction: Integrating Markov Model with Clustering www.ijcsi.org 188 Merging Data Mining Techniques for Web Page Access Prediction: Integrating Markov Model with Clustering Trilok Nath Pandey 1, Ranjita Kumari Dash 2, Alaka Nanda Tripathy 3,Barnali Sahu

More information

Pattern Classification based on Web Usage Mining using Neural Network Technique

Pattern Classification based on Web Usage Mining using Neural Network Technique International Journal of Computer Applications (975 8887) Pattern Classification based on Web Usage Mining using Neural Network Technique Er. Romil V Patel PIET, VADODARA Dheeraj Kumar Singh, PIET, VADODARA

More information

Similarity Matrix Based Session Clustering by Sequence Alignment Using Dynamic Programming

Similarity Matrix Based Session Clustering by Sequence Alignment Using Dynamic Programming Similarity Matrix Based Session Clustering by Sequence Alignment Using Dynamic Programming Dr.K.Duraiswamy Dean, Academic K.S.Rangasamy College of Technology Tiruchengode, India V. Valli Mayil (Corresponding

A Study on User Future Request Prediction Methods Using Web Usage Mining

A Study on User Future Request Prediction Methods Using Web Usage Mining International Journal of Computational Engineering Research Vol, 03 Issue, 4 A Study on User Future Request Prediction Methods Using Web Usage Mining Dilpreet kaur 1, Sukhpreet Kaur 2 1, Master of Technology

Mining Web Logs for Personalized Site Maps

Mining Web Logs for Personalized Site Maps Mining Web Logs for Personalized Site Maps Fergus Toolan Nicholas Kushmerick Smart Media Institute, Computer Science Department, University College Dublin {fergus.toolan, nick}@ucd.ie Abstract. Navigating

Mohri, Kurukshetra, India

Mohri, Kurukshetra, India Volume 4, Issue 8, August 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com A Revised Two

INTRODUCTION. Chapter GENERAL

INTRODUCTION. Chapter GENERAL Chapter 1 INTRODUCTION 1.1 GENERAL The World Wide Web (WWW) [1] is a system of interlinked hypertext documents accessed via the Internet. It is an interactive world of shared information through which

Using Pattern-Join and Purchase-Combination for Mining Web Transaction Patterns in an Electronic Commerce Environment

Using Pattern-Join and Purchase-Combination for Mining Web Transaction Patterns in an Electronic Commerce Environment Using Pattern-Join and Purchase-Combination for Mining Web Transaction Patterns in an Electronic Commerce Environment Ching-Huang Yun and Ming-Syan Chen Department of Electrical Engineering National Taiwan

AN EFFICIENT GRADUAL PRUNING TECHNIQUE FOR UTILITY MINING. Received April 2011; revised October 2011

AN EFFICIENT GRADUAL PRUNING TECHNIQUE FOR UTILITY MINING. Received April 2011; revised October 2011 International Journal of Innovative Computing, Information and Control ICIC International c 2012 ISSN 1349-4198 Volume 8, Number 7(B), July 2012 pp. 5165 5178 AN EFFICIENT GRADUAL PRUNING TECHNIQUE FOR

Characterizing Web Usage Regularities with Information Foraging Agents

Characterizing Web Usage Regularities with Information Foraging Agents Characterizing Web Usage Regularities with Information Foraging Agents Jiming Liu 1, Shiwu Zhang 2 and Jie Yang 2 COMP-03-001 Released Date: February 4, 2003 1 (corresponding author) Department of Computer

A Customizable Behavior Model for Temporal Prediction of Web User Sequences

A Customizable Behavior Model for Temporal Prediction of Web User Sequences A Customizable Behavior Model for Temporal Prediction of Web User Sequences Enrique Frías-Martínez and Vijay Karamcheti Courant Institute of Mathematical Sciences, New York University 715 Broadway, New

FM-WAP Mining: In Search of Frequent Mutating Web Access Patterns from Historical Web Usage Data

FM-WAP Mining: In Search of Frequent Mutating Web Access Patterns from Historical Web Usage Data FM-WAP Mining: In Search of Frequent Mutating Web Access Patterns from Historical Web Usage Data Qiankun Zhao Nanyang Technological University, Singapore and Sourav S. Bhowmick Nanyang Technological University,

IJITKMSpecial Issue (ICFTEM-2014) May 2014 pp (ISSN )

IJITKMSpecial Issue (ICFTEM-2014) May 2014 pp (ISSN ) A Review Paper on Web Usage Mining and future request prediction Priyanka Bhart 1, Dr.SonaMalhotra 2 1 M.Tech., CSE Department, U.I.E.T. Kurukshetra University, Kurukshetra, India 2 HOD, CSE Department,

A Novel Method of Optimizing Website Structure

A Novel Method of Optimizing Website Structure A Novel Method of Optimizing Website Structure Mingjun Li 1, Mingxin Zhang 2, Jinlong Zheng 2 1 School of Computer and Information Engineering, Harbin University of Commerce, Harbin, 150028, China 2 School

Pre-processing of Web Logs for Mining World Wide Web Browsing Patterns

Pre-processing of Web Logs for Mining World Wide Web Browsing Patterns Pre-processing of Web Logs for Mining World Wide Web Browsing Patterns # Yogish H K #1 Dr. G T Raju *2 Department of Computer Science and Engineering Bharathiar University Coimbatore, 641046, Tamilnadu

A Comparative Study of Web Prefetching Techniques Focusing on User s Perspective

A Comparative Study of Web Prefetching Techniques Focusing on User s Perspective A Comparative Study of Web Prefetching Techniques Focusing on User s Perspective Josep Domènech Ana Pont Julio Sahuquillo José A. Gil Department of Computing Engineering (DISCA) Universitat Politècnica

CLASSIFICATION OF WEB LOG DATA TO IDENTIFY INTERESTED USERS USING DECISION TREES

CLASSIFICATION OF WEB LOG DATA TO IDENTIFY INTERESTED USERS USING DECISION TREES CLASSIFICATION OF WEB LOG DATA TO IDENTIFY INTERESTED USERS USING DECISION TREES K. R. Suneetha, R. Krishnamoorthi Bharathidasan Institute of Technology, Anna University krs_mangalore@hotmail.com rkrish_26@hotmail.com

Appropriate Item Partition for Improving the Mining Performance

Appropriate Item Partition for Improving the Mining Performance Appropriate Item Partition for Improving the Mining Performance Tzung-Pei Hong 1,2, Jheng-Nan Huang 1, Kawuu W. Lin 3 and Wen-Yang Lin 1 1 Department of Computer Science and Information Engineering National

Mining High Average-Utility Itemsets

Mining High Average-Utility Itemsets Proceedings of the 2009 IEEE International Conference on Systems, Man, and Cybernetics San Antonio, TX, USA - October 2009 Mining High Itemsets Tzung-Pei Hong Dept of Computer Science and Information Engineering

Hybrid Feature Selection for Modeling Intrusion Detection Systems

Hybrid Feature Selection for Modeling Intrusion Detection Systems Hybrid Feature Selection for Modeling Intrusion Detection Systems Srilatha Chebrolu, Ajith Abraham and Johnson P Thomas Department of Computer Science, Oklahoma State University, USA ajith.abraham@ieee.org,

Keywords Data alignment, Data annotation, Web database, Search Result Record

Keywords Data alignment, Data annotation, Web database, Search Result Record Volume 5, Issue 8, August 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Annotating Web

Fuzzy Cognitive Maps application for Webmining

Fuzzy Cognitive Maps application for Webmining Fuzzy Cognitive Maps application for Webmining Andreas Kakolyris Dept. Computer Science, University of Ioannina Greece, csst9942@otenet.gr George Stylios Dept. of Communications, Informatics and Management,

The Comparison of CBA Algorithm and CBS Algorithm for Meteorological Data Classification Mohammad Iqbal, Imam Mukhlash, Hanim Maria Astuti

The Comparison of CBA Algorithm and CBS Algorithm for Meteorological Data Classification Mohammad Iqbal, Imam Mukhlash, Hanim Maria Astuti Information Systems International Conference (ISICO), 2 4 December 2013 The Comparison of CBA Algorithm and CBS Algorithm for Meteorological Data Classification Mohammad Iqbal, Imam Mukhlash, Hanim Maria

Association-Rules-Based Recommender System for Personalization in Adaptive Web-Based Applications

Association-Rules-Based Recommender System for Personalization in Adaptive Web-Based Applications Association-Rules-Based Recommender System for Personalization in Adaptive Web-Based Applications Daniel Mican, Nicolae Tomai Babes-Bolyai University, Dept. of Business Information Systems, Str. Theodor

AN IMPROVISED FREQUENT PATTERN TREE BASED ASSOCIATION RULE MINING TECHNIQUE WITH MINING FREQUENT ITEM SETS ALGORITHM AND A MODIFIED HEADER TABLE

AN IMPROVISED FREQUENT PATTERN TREE BASED ASSOCIATION RULE MINING TECHNIQUE WITH MINING FREQUENT ITEM SETS ALGORITHM AND A MODIFIED HEADER TABLE AN IMPROVISED FREQUENT PATTERN TREE BASED ASSOCIATION RULE MINING TECHNIQUE WITH MINING FREQUENT ITEM SETS ALGORITHM AND A MODIFIED HEADER TABLE Vandit Agarwal 1, Mandhani Kushal 2 and Preetham Kumar 3

Recommendation Models for User Accesses to Web Pages (Invited Paper)

Recommendation Models for User Accesses to Web Pages (Invited Paper) Recommendation Models for User Accesses to Web Pages (Invited Paper) Ṣule Gündüz 1 and M. Tamer Özsu2 1 Department of Computer Science, Istanbul Technical University Istanbul, Turkey, 34390 gunduz@cs.itu.edu.tr

A Dynamic Clustering-Based Markov Model for Web Usage Mining

A Dynamic Clustering-Based Markov Model for Web Usage Mining A Dynamic Clustering-Based Markov Model for Web Usage Mining José Borges School of Engineering, University of Porto, Portugal, jlborges@fe.up.pt Mark Levene Birkbeck, University of London, U.K., mark@dcs.bbk.ac.uk

Effectiveness of Crawling Attacks Against Web-based Recommender Systems

Effectiveness of Crawling Attacks Against Web-based Recommender Systems Effectiveness of Crawling Attacks Against Web-based Recommender Systems Runa Bhaumik, Robin Burke, Bamshad Mobasher Center for Web Intelligence School of Computer Science, Telecommunication and Information

A Conflict-Based Confidence Measure for Associative Classification

A Conflict-Based Confidence Measure for Associative Classification A Conflict-Based Confidence Measure for Associative Classification Peerapon Vateekul and Mei-Ling Shyu Department of Electrical and Computer Engineering University of Miami Coral Gables, FL 33124, USA

Proxy Server Systems Improvement Using Frequent Itemset Pattern-Based Techniques

Proxy Server Systems Improvement Using Frequent Itemset Pattern-Based Techniques Proceedings of the 2nd International Conference on Intelligent Systems and Image Processing 2014 Proxy Systems Improvement Using Frequent Itemset Pattern-Based Techniques Saranyoo Butkote *, Jiratta Phuboon-op,

Temporal Support in Sequential Pattern Mining

Temporal Support in Sequential Pattern Mining Temporal Support in Sequential Pattern Mining Leticia I. Gómez 1 and Bart Kuijpers 2 and Alejandro A. Vaisman 3 1 Instituto Tecnólogico de Buenos Aires lgomez@itba.edu.ar 2 Buenos Aires University, Hasselt

Efficient Prediction of Web Accesses on a Proxy Server

Efficient Prediction of Web Accesses on a Proxy Server Efficient Prediction of Web Accesses on a Proxy Server Wenwu Lou Department of Computer Science Hong Kong University of Science and Technology Clear Water Bay, Kowloon, Hong Kong wwlou@cs.ust.hk Hongjun

UAPRIORI: AN ALGORITHM FOR FINDING SEQUENTIAL PATTERNS IN PROBABILISTIC DATA

UAPRIORI: AN ALGORITHM FOR FINDING SEQUENTIAL PATTERNS IN PROBABILISTIC DATA UAPRIORI: AN ALGORITHM FOR FINDING SEQUENTIAL PATTERNS IN PROBABILISTIC DATA METANAT HOOSHSADAT, SAMANEH BAYAT, PARISA NAEIMI, MAHDIEH S. MIRIAN, OSMAR R. ZAÏANE Computing Science Department, University

Effectively Capturing User Navigation Paths in the Web Using Web Server Logs

Effectively Capturing User Navigation Paths in the Web Using Web Server Logs Effectively Capturing User Navigation Paths in the Web Using Web Server Logs Amithalal Caldera and Yogesh Deshpande School of Computing and Information Technology, College of Science Technology and Engineering,

Web Usage Mining: How to Efficiently Manage New Transactions and New Clients

Web Usage Mining: How to Efficiently Manage New Transactions and New Clients Web Usage Mining: How to Efficiently Manage New Transactions and New Clients F. Masseglia 1,2, P. Poncelet 2, and M. Teisseire 2 1 Laboratoire PRiSM, Univ. de Versailles, 45 Avenue des Etats-Unis, 78035

A COMPARISON OF PREDICTION ALGORITHMS FOR PREFETCHING IN THE CURRENT WEB

A COMPARISON OF PREDICTION ALGORITHMS FOR PREFETCHING IN THE CURRENT WEB Journal of Web Engineering, Vol. 11, No. 1 (2012) 064 078 c Rinton Press A COMPARISON OF PREDICTION ALGORITHMS FOR PREFETCHING IN THE CURRENT WEB JOSEP DOMENECH, JULIO SAHUQUILLO, JOSE A. GIL, and ANA

INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET)

INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET) INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET) International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367(Print), ISSN 0976 6367(Print) ISSN 0976 6375(Online)

A Framework for Predictive Web Prefetching at the Proxy Level using Data Mining

A Framework for Predictive Web Prefetching at the Proxy Level using Data Mining IJCSNS International Journal of Computer Science and Network Security, VOL.8 No.6, June 2008 303 A Framework for Predictive Web Prefetching at the Proxy Level using Data Mining Jyoti Pandey 1, Amit Goel

Temporal Weighted Association Rule Mining for Classification

Temporal Weighted Association Rule Mining for Classification Temporal Weighted Association Rule Mining for Classification Purushottam Sharma and Kanak Saxena Abstract There are so many important techniques towards finding the association rules. But, when we consider

The influence of caching on web usage mining

The influence of caching on web usage mining The influence of caching on web usage mining J. Huysmans 1, B. Baesens 1,2 & J. Vanthienen 1 1 Department of Applied Economic Sciences, K.U.Leuven, Belgium 2 School of Management, University of Southampton,

UMCS. Annales UMCS Informatica AI 7 (2007) Data mining techniques for portal participants profiling. Danuta Zakrzewska *, Justyna Kapka

UMCS. Annales UMCS Informatica AI 7 (2007) Data mining techniques for portal participants profiling. Danuta Zakrzewska *, Justyna Kapka Annales Informatica AI 7 (2007) 153-161 Annales Informatica Lublin-Polonia Sectio AI http://www.annales.umcs.lublin.pl/ Data mining techniques for portal participants profiling Danuta Zakrzewska *, Justyna

An Automatic Reply to Customers Queries Model with Chinese Text Mining Approach

An Automatic Reply to Customers  Queries Model with Chinese Text Mining Approach Proceedings of the 6th WSEAS International Conference on Applied Computer Science, Hangzhou, China, April 15-17, 2007 71 An Automatic Reply to Customers E-mail Queries Model with Chinese Text Mining Approach

An Approximate Approach for Mining Recently Frequent Itemsets from Data Streams *

An Approximate Approach for Mining Recently Frequent Itemsets from Data Streams * An Approximate Approach for Mining Recently Frequent Itemsets from Data Streams * Jia-Ling Koh and Shu-Ning Shin Department of Computer Science and Information Engineering National Taiwan Normal University

CLASSIFICATION FOR SCALING METHODS IN DATA MINING

CLASSIFICATION FOR SCALING METHODS IN DATA MINING CLASSIFICATION FOR SCALING METHODS IN DATA MINING Eric Kyper, College of Business Administration, University of Rhode Island, Kingston, RI 02881 (401) 874-7563, ekyper@mail.uri.edu Lutz Hamel, Department

Log Information Mining Using Association Rules Technique: A Case Study Of Utusan Education Portal

Log Information Mining Using Association Rules Technique: A Case Study Of Utusan Education Portal Log Information Mining Using Association Rules Technique: A Case Study Of Utusan Education Portal Mohd Helmy Ab Wahab 1, Azizul Azhar Ramli 2, Nureize Arbaiy 3, Zurinah Suradi 4 1 Faculty of Electrical

Characterizing Home Pages 1

Characterizing Home Pages 1 Characterizing Home Pages 1 Xubin He and Qing Yang Dept. of Electrical and Computer Engineering University of Rhode Island Kingston, RI 881, USA Abstract Home pages are very important for any successful

A Hierarchical Document Clustering Approach with Frequent Itemsets

A Hierarchical Document Clustering Approach with Frequent Itemsets A Hierarchical Document Clustering Approach with Frequent Itemsets Cheng-Jhe Lee, Chiun-Chieh Hsu, and Da-Ren Chen Abstract In order to effectively retrieve required information from the large amount of

Link Prediction for Social Network

Link Prediction for Social Network Link Prediction for Social Network Ning Lin Computer Science and Engineering University of California, San Diego Email: nil016@eng.ucsd.edu Abstract Friendship recommendation has become an important issue

Efficiently Mining Positive Correlation Rules

Efficiently Mining Positive Correlation Rules Applied Mathematics & Information Sciences An International Journal 2011 NSP 5 (2) (2011), 39S-44S Efficiently Mining Positive Correlation Rules Zhongmei Zhou Department of Computer Science & Engineering,

Keywords Web Mining, Web Usage Mining, Web Structure Mining, Web Content Mining.

Keywords Web Mining, Web Usage Mining, Web Structure Mining, Web Content Mining. Volume 3, Issue 7, July 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com A Framework to

THE STUDY OF WEB MINING - A SURVEY

THE STUDY OF WEB MINING - A SURVEY THE STUDY OF WEB MINING - A SURVEY Ashish Gupta, Anil Khandekar Abstract over the year s web mining is the very fast growing research field. Web mining contains two research areas: Data mining and World

Web Usage Mining. Overview Session 1. This material is inspired from the WWW 16 tutorial entitled Analyzing Sequential User Behavior on the Web

Web Usage Mining. Overview Session 1. This material is inspired from the WWW 16 tutorial entitled Analyzing Sequential User Behavior on the Web Web Usage Mining Overview Session 1 This material is inspired from the WWW 16 tutorial entitled Analyzing Sequential User Behavior on the Web 1 Outline 1. Introduction 2. Preprocessing 3. Analysis 2 Example

Efficient Remining of Generalized Multi-supported Association Rules under Support Update

Efficient Remining of Generalized Multi-supported Association Rules under Support Update Efficient Remining of Generalized Multi-supported Association Rules under Support Update WEN-YANG LIN 1 and MING-CHENG TSENG 1 Dept. of Information Management, Institute of Information Engineering I-Shou

Survey Paper on Web Usage Mining for Web Personalization

Survey Paper on Web Usage Mining for Web Personalization ISSN 2278 0211 (Online) Survey Paper on Web Usage Mining for Web Personalization Namdev Anwat Department of Computer Engineering Matoshri College of Engineering & Research Center, Eklahare, Nashik University

APD-A Tool for Identifying Behavioural Patterns Automatically from Clickstream Data

APD-A Tool for Identifying Behavioural Patterns Automatically from Clickstream Data APD-A Tool for Identifying Behavioural Patterns Automatically from Clickstream Data I-Hsien Ting, Lillian Clark, Chris Kimble, Daniel Kudenko, and Peter Wright Department of Computer Science, The University

Mining of Web Server Logs using Extended Apriori Algorithm

Mining of Web Server Logs using Extended Apriori Algorithm International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) International Journal of Emerging Technologies in Computational

Maintenance of the Prelarge Trees for Record Deletion

Maintenance of the Prelarge Trees for Record Deletion 12th WSEAS Int. Conf. on APPLIED MATHEMATICS, Cairo, Egypt, December 29-31, 2007 105 Maintenance of the Prelarge Trees for Record Deletion Chun-Wei Lin, Tzung-Pei Hong, and Wen-Hsiang Lu Department of

A Survey on Web Personalization of Web Usage Mining

A Survey on Web Personalization of Web Usage Mining A Survey on Web Personalization of Web Usage Mining S.Jagan 1, Dr.S.P.Rajagopalan 2 1 Assistant Professor, Department of CSE, T.J. Institute of Technology, Tamilnadu, India 2 Professor, Department of CSE,

Effective Prediction of Web-user Accesses: A Data Mining Approach

Effective Prediction of Web-user Accesses: A Data Mining Approach Effective Prediction of Web-user Accesses: A Data Mining Approach Alexandros Nanopoulos Dimitris Katsaros Yannis Manolopoulos Data Engineering Lab, Department of Informatics Aristotle University, Thessaloniki

Improving the prediction of next page request by a web user using Page Rank algorithm

Improving the prediction of next page request by a web user using Page Rank algorithm Improving the prediction of next page request by a web user using Page Rank algorithm Claudia Elena Dinucă, Dumitru Ciobanu Faculty of Economics and Business Administration Cybernetics and statistics University

AccWeb Improving Web Performance via Prefetching

AccWeb Improving Web Performance via Prefetching AccWeb Improving Web Performance via Prefetching Qizhe Cai Wei Hu Yueyang Qiu {qizhec,huwei,yqiu}@cs.princeton.edu Abstract We present AccWeb (Accelerated Web), a web service that improves user experience

Semantic Clickstream Mining

Semantic Clickstream Mining Semantic Clickstream Mining Mehrdad Jalali 1, and Norwati Mustapha 2 1 Department of Software Engineering, Mashhad Branch, Islamic Azad University, Mashhad, Iran 2 Department of Computer Science, Universiti

EXTRACTION OF INTERESTING PATTERNS THROUGH ASSOCIATION RULE MINING FOR IMPROVEMENT OF WEBSITE USABILITY

EXTRACTION OF INTERESTING PATTERNS THROUGH ASSOCIATION RULE MINING FOR IMPROVEMENT OF WEBSITE USABILITY ISTANBUL UNIVERSITY JOURNAL OF ELECTRICAL & ELECTRONICS ENGINEERING YEAR VOLUME NUMBER : 2009 : 9 : 2 (1037-1046) EXTRACTION OF INTERESTING PATTERNS THROUGH ASSOCIATION RULE MINING FOR IMPROVEMENT OF WEBSITE

Popularity-Based PPM: An Effective Web Prefetching Technique for High Accuracy and Low Storage

Popularity-Based PPM: An Effective Web Prefetching Technique for High Accuracy and Low Storage Proceedings of 22 International Conference on Parallel Processing, (ICPP 22), Vancouver, Canada, August 18-21, 22. Popularity-Based : An Effective Web Prefetching Technique for High Accuracy and Low Storage

Yunfeng Zhang 1, Huan Wang 2, Jie Zhu 1 1 Computer Science & Engineering Department, North China Institute of Aerospace

Yunfeng Zhang 1, Huan Wang 2, Jie Zhu 1 1 Computer Science & Engineering Department, North China Institute of Aerospace ISSN : 0974-7435 Volume 10 Issue 20 BioTechnology 2014 An Indian Journal FULL PAPER BTAIJ, 10(20), 2014 [12526-12531] Exploration on the data mining system construction

HOT asax: A Novel Adaptive Symbolic Representation for Time Series Discords Discovery

HOT asax: A Novel Adaptive Symbolic Representation for Time Series Discords Discovery HOT asax: A Novel Adaptive Symbolic Representation for Time Series Discords Discovery Ninh D. Pham, Quang Loc Le, Tran Khanh Dang Faculty of Computer Science and Engineering, HCM University of Technology,

A Method of Identifying the P2P File Sharing

A Method of Identifying the P2P File Sharing IJCSNS International Journal of Computer Science and Network Security, VOL.10 No.11, November 2010 111 A Method of Identifying the P2P File Sharing Jian-Bo Chen Department of Information & Telecommunications

IJMIE Volume 2, Issue 9 ISSN:

IJMIE Volume 2, Issue 9 ISSN: WEB USAGE MINING: LEARNER CENTRIC APPROACH FOR E-BUSINESS APPLICATIONS B. NAVEENA DEVI* Abstract Emerging of web has put forward a great deal of challenges to web researchers for web based information

A New Web Usage Mining Approach for Website Recommendations Using Concept Hierarchy and Website Graph

A New Web Usage Mining Approach for Website Recommendations Using Concept Hierarchy and Website Graph A New Web Usage Mining Approach for Website Recommendations Using Concept Hierarchy and Website Graph T. Vijaya Kumar, H. S. Guruprasad, Bharath Kumar K. M., Irfan Baig, and Kiran Babu S. Abstract To have

A Novel Algorithm for Associative Classification

A Novel Algorithm for Associative Classification A Novel Algorithm for Associative Classification Gourab Kundu 1, Sirajum Munir 1, Md. Faizul Bari 1, Md. Monirul Islam 1, and K. Murase 2 1 Department of Computer Science and Engineering Bangladesh University

Using Petri Nets to Enhance Web Usage Mining 1

Using Petri Nets to Enhance Web Usage Mining 1 Using Petri Nets to Enhance Web Usage Mining 1 Shih-Yang Yang Department of Information Management Kang-Ning Junior College of Medical Care and Management Nei-Hu, 114, Taiwan Shihyang@knjc.edu.tw Po-Zung

Mining Quantitative Association Rules on Overlapped Intervals

Mining Quantitative Association Rules on Overlapped Intervals Mining Quantitative Association Rules on Overlapped Intervals Qiang Tong 1,3, Baoping Yan 2, and Yuanchun Zhou 1,3 1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China {tongqiang,

Purna Prasad Mutyala et al, / (IJCSIT) International Journal of Computer Science and Information Technologies, Vol. 2 (5), 2011,

Purna Prasad Mutyala et al, / (IJCSIT) International Journal of Computer Science and Information Technologies, Vol. 2 (5), 2011, Weighted Association Rule Mining Without Pre-assigned Weights PURNA PRASAD MUTYALA, KUMAR VASANTHA Department of CSE, Avanthi Institute of Engg & Tech, Tamaram, Visakhapatnam, A.P., India. Abstract Association

Prioritizing the Links on the Homepage: Evidence from a University Website Lian-lian SONG 1,a* and Geoffrey TSO 2,b

Prioritizing the Links on the Homepage: Evidence from a University Website Lian-lian SONG 1,a* and Geoffrey TSO 2,b 2017 3rd International Conference on E-commerce and Contemporary Economic Development (ECED 2017) ISBN: 978-1-60595-446-2 Prioritizing the Links on the Homepage: Evidence from a University Website Lian-lian

A Hybrid Web Personalization Model Based on Site Connectivity

A Hybrid Web Personalization Model Based on Site Connectivity A Hybrid Web Personalization Model Based on Site Connectivity Miki Nakagawa, Bamshad Mobasher {mnakagawa,mobasher}@cs.depaul.edu School of Computer Science, Telecommunication, and Information Systems DePaul

Mining Frequent Itemsets for data streams over Weighted Sliding Windows

Mining Frequent Itemsets for data streams over Weighted Sliding Windows Mining Frequent Itemsets for data streams over Weighted Sliding Windows Pauray S.M. Tsai Yao-Ming Chen Department of Computer Science and Information Engineering Minghsin University of Science and Technology

Handling Missing Values via Decomposition of the Conditioned Set

Handling Missing Values via Decomposition of the Conditioned Set Handling Missing Values via Decomposition of the Conditioned Set Mei-Ling Shyu, Indika Priyantha Kuruppu-Appuhamilage Department of Electrical and Computer Engineering, University of Miami Coral Gables,

Research Article Apriori Association Rule Algorithms using VMware Environment

Research Article Apriori Association Rule Algorithms using VMware Environment Research Journal of Applied Sciences, Engineering and Technology 8(2): 16-166, 214 DOI:1.1926/rjaset.8.955 ISSN: 24-7459; e-issn: 24-7467 214 Maxwell Scientific Publication Corp. Submitted: January 2,

A User Behavior Model for Web Page Navigation

A User Behavior Model for Web Page Navigation A User Behavior Model for Web Page Navigation Ṣule Gündüz and M. Tamer Özsu October 2002 on leaving from Department of Computer Science, Istanbul Technical University, Istanbul, Turkey. School Of Computer

TEMPORAL data mining is a research field of growing

TEMPORAL data mining is a research field of growing An Optimal Temporal and Feature Space Allocation in Supervised Data Mining S. Tom Au, Guangqin Ma, and Rensheng Wang, Abstract This paper presents an expository study of temporal data mining for prediction

Using Association Rules for Better Treatment of Missing Values

Using Association Rules for Better Treatment of Missing Values Using Association Rules for Better Treatment of Missing Values SHARIQ BASHIR, SAAD RAZZAQ, UMER MAQBOOL, SONYA TAHIR, A. RAUF BAIG Department of Computer Science (Machine Intelligence Group) National University

Sensor Based Time Series Classification of Body Movement

Sensor Based Time Series Classification of Body Movement Sensor Based Time Series Classification of Body Movement Swapna Philip, Yu Cao*, and Ming Li Department of Computer Science California State University, Fresno Fresno, CA, U.S.A swapna.philip@gmail.com,

Applications of Concurrent Access Patterns in Web Usage Mining

Applications of Concurrent Access Patterns in Web Usage Mining Applications of Concurrent Access Patterns in Web Usage Mining Jing Lu 1, Malcolm Keech 2, and Cuiqing Wang 3 1 Southampton Solent University, Southampton UK, SO14 0YN 2 University of Bedfordshire, Park

Video Inter-frame Forgery Identification Based on Optical Flow Consistency

Video Inter-frame Forgery Identification Based on Optical Flow Consistency Sensors & Transducers 24 by IFSA Publishing, S. L. http://www.sensorsportal.com Video Inter-frame Forgery Identification Based on Optical Flow Consistency Qi Wang, Zhaohong Li, Zhenzhen Zhang, Qinglong

Fast Discovery of Sequential Patterns Using Materialized Data Mining Views

Fast Discovery of Sequential Patterns Using Materialized Data Mining Views Fast Discovery of Sequential Patterns Using Materialized Data Mining Views Tadeusz Morzy, Marek Wojciechowski, Maciej Zakrzewicz Poznan University of Technology Institute of Computing Science ul. Piotrowo

Web Usage Mining Approaches for User s Request Prediction: A Survey

Web Usage Mining Approaches for User s Request Prediction: A Survey Web Usage Mining Approaches for User s : A Survey 1 Avneet Saluja, 2 Dr. Bhupesh Gour, 3 Lokesh Singh 1 M.Tech Scholar, 2 H.O.D., 3 Assistant Professor Department of Comp. Sci. & Engg. **Technocrats Institute

An Average Linear Time Algorithm for Web Usage Mining

An Average Linear Time Algorithm for Web. Usage Mining An Average Linear Time Algorithm for Web Usage Mining José Borges School of Engineering, University of Porto R. Dr. Roberto Frias, 4200 - Porto, Portugal jlborges@fe.up.pt Mark Levene School of Computer

WhatNext: A Prediction System for Web Requests using N-gram Sequence Models

WhatNext: A Prediction System for Web Requests using N-gram Sequence Models WhatNext: A Prediction System for Web Requests using N-gram Sequence Models Zhong Su* Qiang Yang*, Ye Lu* Hongjiang Zhang Department of Computing Science and Technology Tsinghua University Beijing 100084,

An Improved Markov Model Approach to Predict Web Page Caching

An Improved Markov Model Approach to Predict Web Page Caching An Improved Markov Model Approach to Predict Web Page Caching Meenu Brala Student, JMIT, Radaur meenubrala@gmail.com Mrs. Mamta Dhanda Asstt. Prof, CSE, JMIT Radaur mamtanain@gmail.com Abstract Optimization

Data Mining of Web Access Logs Using Classification Techniques

Data Mining of Web Access Logs Using Classification Techniques Data Mining of Web Logs Using Classification Techniques Md. Azam 1, Asst. Prof. Md. Tabrez Nafis 2 1 M.Tech Scholar, Department of Computer Science & Engineering, Al-Falah School of Engineering & Technology,

Mining Web Logs for Prediction Models in WWW Caching and Prefetching

Mining Web Logs for Prediction Models in WWW Caching and Prefetching Mining Web Logs for Prediction Models in WWW Caching and Prefetching Qiang Yang School of Computing Science Simon Fraser University Burnaby, BC, Canada V5A 1S6 qyang @ cs.sfu.ca Haining Henry Zhang IBM

More information

Leveraging Set Relations in Exact Set Similarity Join

Leveraging Set Relations in Exact Set Similarity Join Leveraging Set Relations in Exact Set Similarity Join Xubo Wang, Lu Qin, Xuemin Lin, Ying Zhang, and Lijun Chang University of New South Wales, Australia University of Technology Sydney, Australia {xwang,lxue,ljchang}@cse.unsw.edu.au,

More information

Context-based Navigational Support in Hypermedia

Context-based Navigational Support in Hypermedia Context-based Navigational Support in Hypermedia Sebastian Stober and Andreas Nürnberger Institut für Wissens- und Sprachverarbeitung, Fakultät für Informatik, Otto-von-Guericke-Universität Magdeburg,

More information

High Utility Web Access Patterns Mining from Distributed Databases

High Utility Web Access Patterns Mining from Distributed Databases High Utility Web Access Patterns Mining from Distributed Databases Md.Azam Hosssain 1, Md.Mamunur Rashid 1, Byeong-Soo Jeong 1, Ho-Jin Choi 2 1 Database Lab, Department of Computer Engineering, Kyung Hee

More information

Transforming Quantitative Transactional Databases into Binary Tables for Association Rule Mining Using the Apriori Algorithm

Transforming Quantitative Transactional Databases into Binary Tables for Association Rule Mining Using the Apriori Algorithm Transforming Quantitative Transactional Databases into Binary Tables for Association Rule Mining Using the Apriori Algorithm Expert Systems: Final (Research Paper) Project Daniel Josiah-Akintonde December

More information

AC-Close: Efficiently Mining Approximate Closed Itemsets by Core Pattern Recovery

AC-Close: Efficiently Mining Approximate Closed Itemsets by Core Pattern Recovery : Efficiently Mining Approximate Closed Itemsets by Core Pattern Recovery Hong Cheng Philip S. Yu Jiawei Han University of Illinois at Urbana-Champaign IBM T. J. Watson Research Center {hcheng3, hanj}@cs.uiuc.edu,

More information