Identification of Navigational Paths of Users Routed through Proxy Servers for Web Usage Mining
The web log file gives a detailed account of who accessed the web site, which pages were requested and in what order, and how long each page was viewed. However, log files are not only unstructured but also distorted in many cases. In particular, log files can be seriously distorted when web pages are requested by users routed through proxy servers. Therefore, preprocessing is necessary prior to the analysis and discovery of meaningful information. In this article, an algorithm is developed to identify the users and their navigational paths when users are routed through proxy servers. The proposed algorithm is then experimentally evaluated using a real website and ten groups of users, each with two or three people. The experimental results show that the average ratios of correct and incorrect page restoration are 78% and 4.1%, respectively, which indicates that the proposed algorithm can be used as a reasonable tool for identifying the navigational paths of users routed through proxy servers.

Keywords: Web Log File; Proxy Server; User Identification
Yong Soo Kim* and Bong-jin Yum**

* Department of Industrial Engineering, Korea Advanced Institute of Science and Technology, Gusung-Dong, Yusung-Gu, Taejon, Korea. E-mail: mmps@kaist.ac.kr
** Corresponding author. Department of Industrial Engineering, Korea Advanced Institute of Science and Technology, Gusung-Dong, Yusung-Gu, Taejon, Korea. E-mail: bjyum@kaist.ac.kr
1 Introduction

The World Wide Web (WWW) continues to grow at an astounding rate in terms of traffic volume, size and complexity. Along with this growth, the complexity of such tasks as web site design and web server design has also increased. An important input to these design tasks is the analysis of how a web site is being used, and therefore, it is necessary for web designers to analyze and discover relevant information from the WWW [1].

Log files (access, agent, error, and referrer log files), which are recorded by web servers, contain information on all incoming requests, the user's browser type, operating system, server error type and the referrer header. Log files, however, are unstructured as well as distorted in many cases. In particular, log files can be seriously distorted when web pages are requested by users routed through proxy servers.

A proxy server acts as an intermediary between the users of a local network and the internet so that the local network can ensure security, administrative control and caching service. A proxy server is associated with or constitutes a part of the gateway server that separates the local network from outside networks, and also serves as a firewall that protects the local network from outside intrusion. When a web page is requested by a user in the local network, the proxy server uses its own IP address to request the page, which is called IP masquerading. Therefore, the IP address of the proxy server, and not that of the user, is recorded in the log files [3]. Furthermore, a proxy server also functions as a cache server that looks into its local cache of previously downloaded web pages. If it finds the page, it returns the page to the user without forwarding the request to the internet. In this case, the request is not recorded in the log file [2].
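The two effects described above, IP masquerading and caching, can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: the proxy cache never expires, every page is cacheable, and the function name `simulate_server_log`, the IP addresses and the page labels are hypothetical.

```python
from typing import List, Set, Tuple

PROXY_IP = "203.0.113.1"  # hypothetical proxy address, for illustration only


def simulate_server_log(visits: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return the (ip, page) records that reach the web server's access log."""
    cache: Set[str] = set()           # pages the proxy has already downloaded
    log: List[Tuple[str, str]] = []
    for _user_ip, page in visits:
        if page in cache:
            continue                  # served from the proxy cache: never logged
        cache.add(page)
        log.append((PROXY_IP, page))  # IP masquerading: only the proxy IP is seen
    return log


# Two hypothetical users behind the same proxy, visiting one after the other.
visits = [("10.0.0.1", p) for p in "ABCD"] + [("10.0.0.2", p) for p in "ABF"]
server_log = simulate_server_log(visits)
logged_pages = "-".join(page for _, page in server_log)
logged_ips = {ip for ip, _ in server_log}
print(logged_pages, logged_ips)
```

In this sketch the server log contains five requests from a single IP address, although two users made seven page views between them.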
As mentioned above, log files can be distorted by the IP masquerading and cache service of the proxy server. For example, assume that two users who are routed through a proxy server visit the same web site. If one visits pages A-B-C-D and the other visits pages A-B-F later, then pages A-B-C-D will be requested from the web server for the first user. However, only page F will be requested for the second user because pages A-B have already been downloaded to the proxy server. Therefore, the log file will record requests for pages A-B-C-D-F instead of pages A-B-C-D-A-B-F. Such a distortion in a log file needs to be corrected in order to analyze and discover useful information from the WWW.

Cooley et al. [1] presented an algorithm for identifying users routed through a proxy server without cache service. In this paper, the work by Cooley et al. is extended to the case where cache service is also provided by the proxy server.

2 Algorithm

The proposed algorithm identifies the navigational paths of the users routed through a proxy server. The algorithm constructs the browsing path for each user by analyzing the access log in conjunction with the referrer log, agent log, and site topology. The access log shows who accessed the web site and what pages were requested. The referrer log contains the referrer header of an incoming request and the agent log shows what browser was used.

2.1 Overview of the Proposed Algorithm

The idea behind the proposed algorithm is as follows. The records with the same IP address and the same agent are assumed to belong to a single user. However, if the access log, referrer and site topology indicate otherwise, then the records are divided into paths of different users. In addition, a path completion procedure is provided for a newly generated user path. In the algorithm, the index page means the first page of the web site, and a parent page is the page immediately above the current page in the web site topology. A brief description of the algorithm is shown in Figure 1.

<Fig. 1>

2.2 Detailed Procedures

In Stage 1, records with the same IP address are sorted in chronological order. Procedures for Stages 2 and 3 are as follows.

Stage 2

Figure 2 shows a flowchart for the procedures in Stage 2 for each record with the same IP address. A tree represents the navigational path of a single user.

<Fig. 2>

First, in the case where the requested page is the first record, a tree is constructed with that page alone (see step 9 in Figure 2). Otherwise, it is decided whether the requested page should be assigned to a tree or reserved. If there exists a tree which meets condition 1 or 2 below, the requested page is assigned to that tree.
Condition 1: The agent of the tree is identical to that of the requested page, and the tree contains a page which is identical to the referrer of the requested page (i.e., when the record is assigned to an existing tree, the site topology and the referrer of the record are taken into consideration (see Figure 3)).

<Fig. 3>

Condition 2: There is only one tree whose agent is identical to that of the requested page, and in the path from the most recently assigned page of the tree to the requested page in the site topology, the intermediate pages appear in other trees.

The reason for assigning the requested page to the tree satisfying condition 2 is as follows. Due to the cache service of a proxy server, previously requested pages do not appear in the access log. Therefore, if the intermediate pages (including the referrer of the currently requested page) appear in other trees, then it is likely that the currently requested page was requested by the user through those intermediate pages. In Figure 4, the intermediate pages in the path from C to F in the site topology are B and D. Since B and D appear in Tree 2, F is assigned to Tree 1.

<Fig. 4>

In the case of condition 3 (i.e., there are multiple trees with the same agent as that of the requested page), the page is reserved for a later assignment based on additional decision criteria (see step 4 in Figure 2). If the page does not meet condition 1, 2 or 3, the page is regarded as belonging to a new user (see step 8 in Figure 2).

After the assignment of the page to a tree which meets condition 1 or 2 (step 2 in Figure 2), it is checked whether or not two-user navigational patterns are found in the tree. A time sequence of assignments in which an unlikely backtracking exists suggests a two-user navigational pattern. If the tree is suspected to contain the navigational patterns of two users, it is split into two trees. Of the two branches under consideration, the one whose first page has the more recent request time is taken off the tree. The condition for splitting the tree into two trees is as follows.

Condition 4: The referrer of the requested page is not identical to any of the pages in the path from the page assigned to the tree just prior to the requested page to the index page in the site topology.

As shown in Figure 5, the time sequence of the pages in the tree suggests a two-user navigational pattern since it is highly unlikely that a single user, after navigating in the order of A-B-C-D-F, would backtrack to C to access E. Therefore, the tree is split into two separate trees. As for deciding which branch to take off the tree, D-F is chosen since C was requested before D from B. Thus, it is most likely that A-B-C-E represents the navigational pattern of one user, and D-F can be regarded as the navigational pattern of the other.

<Fig. 5>
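The check in condition 4 and the choice of the branch to take off can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes a tree-shaped site topology stored as a child-to-parent map, and it assumes the page labels of the Figure 5 discussion (index page A, with B below A, C and D below B, E below C and F below D). The names `PARENT`, `path_to_index`, `satisfies_condition_4` and `branch_to_split` are hypothetical.

```python
from typing import Dict, List, Optional

# Assumed tree-shaped site topology (child -> parent); the index page has no parent.
PARENT: Dict[str, Optional[str]] = {
    "A": None, "B": "A", "C": "B", "D": "B", "E": "C", "F": "D",
}


def path_to_index(page: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Pages on the topology path from `page` up to the index page, inclusive."""
    path: List[str] = []
    current: Optional[str] = page
    while current is not None:
        path.append(current)
        current = parent[current]
    return path


def satisfies_condition_4(referrer: str, previous_page: str,
                          parent: Dict[str, Optional[str]]) -> bool:
    """True if the referrer is not on the path from the previously assigned page to the index."""
    return referrer not in path_to_index(previous_page, parent)


def branch_to_split(first_request_time: Dict[str, int]) -> str:
    """Pick the branch whose first page was requested most recently."""
    return max(first_request_time, key=first_request_time.get)


# Requested page E has referrer C; the page assigned just before E was F.
split_needed = satisfies_condition_4("C", "F", PARENT)
# Branch C-E starts at C (3rd assignment), branch D-F starts at D (4th assignment).
removed_branch_start = branch_to_split({"C": 3, "D": 4})
print(split_needed, removed_branch_start)
```

With these assumed inputs, the referrer C is not on the path F-D-B-A, so the tree is split, and the branch starting at D is the one taken off.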
In the case of tree splitting (step 3 in Figure 2) or assignment reservation (step 4 in Figure 2), the page in question can be reassigned according to its order within the time sequence. If the request time of the first page in the split branch or of the reserved page is later than that of the last assigned page in each of the existing trees and if there is only one such tree (steps 5 and 6 in Figure 2), then it is reassigned to that tree. For example, in Figure 6, the request time of D, which is the first page in the split branch, is later than that of B, which is the last assigned page in the tree, and therefore D-F is assigned to Tree 1. This reassignment rests on the assumption that records with the same agent belong to a single user.

<Fig. 6>

On the other hand, in the case where the condition in step 5 in Figure 2 is satisfied but the condition in step 6 is not, the assignment is reserved again. If the condition in step 5 in Figure 2 is not satisfied, the page or the split branch is regarded as belonging to a new user (step 8 in Figure 2).

Finally, the page or the split branch regarded as belonging to a new user in step 8 in Figure 2 needs path completion. Path completion is carried out by connecting the index page to the first page of the new user tree along the shortest path. This serves to restore pages which have been skipped in the access log file of the web server. For example, in Figure 7, the path is completed by connecting the index page to D, the first page of the new user tree, using the shortest path in the site topology (see Figure 4).

<Fig. 7>
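Path completion as described above amounts to a shortest-path search from the index page to the first page of the new user tree. The sketch below uses breadth-first search over a link map; the paper does not name a search method, so this is one possible choice. The link structure `LINKS` is an assumed reading of the site topology in Figure 4, and the function names are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List, Optional

# Assumed hyperlink structure (page -> pages it links to).
LINKS: Dict[str, List[str]] = {
    "A": ["B"], "B": ["C", "D"], "C": ["E"], "D": ["F"], "E": [], "F": [],
}


def shortest_path(links: Dict[str, List[str]], start: str, goal: str) -> List[str]:
    """Breadth-first search; returns the pages from start to goal, or [] if unreachable."""
    previous: Dict[str, Optional[str]] = {start: None}
    queue: Deque[str] = deque([start])
    while queue:
        page = queue.popleft()
        if page == goal:
            path: List[str] = []
            current: Optional[str] = page
            while current is not None:   # walk back to the start page
                path.append(current)
                current = previous[current]
            return path[::-1]
        for nxt in links.get(page, []):
            if nxt not in previous:      # visit each page at most once
                previous[nxt] = page
                queue.append(nxt)
    return []


def complete_path(links: Dict[str, List[str]], index_page: str,
                  branch: List[str]) -> List[str]:
    """Prepend the skipped pages between the index page and the branch's first page."""
    prefix = shortest_path(links, index_page, branch[0])
    return prefix[:-1] + branch if prefix else branch


completed = complete_path(LINKS, "A", ["D", "F"])
print(completed)
```

For the new user tree D-F, the sketch restores the skipped pages A and B in front of D.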
Stage 3

If there exist reserved branches after Stage 2 is completed for all records, these branches are assigned to an existing tree or regarded as a new user tree according to the site topology and their order within the time sequence. Detailed procedures for Stage 3 are as follows.

Stage 3: Assignment of a reserved branch (or a page) to another tree or regarding it as a new user tree.

Step 1. If the last page of a tree which has the same agent as the reserved branch has an earlier request time than the first page of the reserved branch, the following is carried out:
i) If such a tree is unique, assign the reserved branch to the tree.
ii) If either the first page or the parent page of the reserved branch is identical to the last assigned page of a tree or to its parent page, then the reserved branch is assigned to that tree.

Step 2. If the reserved branch is not assigned in Step 1, it is regarded as a new user tree. In this case, the shortest path from the index page to the reserved branch constitutes a new user path.

As an illustration of Step 1, see Figure 8. Since the parent page of E, the first page in the reserved branch, is C in the site topology (see Figure 4), and C is the last assigned page, the reserved branch E-G is assigned to Tree 1.

<Fig. 8>
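One possible reading of Steps 1 and 2 is sketched below. Several points are assumptions rather than statements from the paper: "the parent page of the reserved branch" is taken to mean the topology parent of the branch's first page, the first matching tree wins if several trees satisfy rule ii), and the request times, the agent string, the `Tree` record and the function `assign_reserved_branch` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Tree:
    name: str
    agent: str
    last_page: str   # last assigned page of the tree
    last_time: int   # request time of that page


def assign_reserved_branch(trees: List[Tree], agent: str, first_page: str,
                           first_time: int, parent: Dict[str, str]) -> Optional[str]:
    """Return the name of the tree that receives the branch, or None for a new user tree."""
    # Step 1: same agent, and the tree's last page is older than the branch's first page.
    candidates = [t for t in trees if t.agent == agent and t.last_time < first_time]
    if len(candidates) == 1:                       # rule i): the tree is unique
        return candidates[0].name
    branch_pages = {first_page, parent.get(first_page)} - {None}
    for tree in candidates:                        # rule ii): topology match
        tree_pages = {tree.last_page, parent.get(tree.last_page)} - {None}
        if branch_pages & tree_pages:
            return tree.name
    return None                                    # Step 2: regarded as a new user tree


# Assumed topology (child -> parent) and request times for the Figure 8 illustration.
PARENT = {"B": "A", "C": "B", "D": "B", "E": "C"}
trees = [Tree("Tree 1", "agent-x", "C", 3), Tree("Tree 2", "agent-x", "D", 4)]
target = assign_reserved_branch(trees, "agent-x", "E", 5, PARENT)
print(target)
```

Here both trees pass the time check, so rule i) does not apply; rule ii) selects Tree 1 because the parent of E is C, the last assigned page of Tree 1.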
2.3 Example

Assume that a web site (see Figure 9) is visited by three users who are routed through a proxy server and that the paths of the three users in chronological order are A-B-F-G, A-C-H and A-B-E-D, respectively. Suppose that the agent logs for A-B-F-G and A-C-H are identical while the agent log for A-B-E-D is a different one. In such a case, each of the repeatedly requested pages is recorded only once in the log file. Therefore, the recorded requested pages are A-B-F-C-E-G-D-H in this example.

<Fig. 9>

Based on the access log, referrer log, agent log and site topology, a hierarchical tree is constructed as shown in Figure 10 (by steps 9 and 2 in Figure 2). The tree A-B-F-C, as depicted in Figure 10, can be considered as a one-user navigational pattern since the tree does not satisfy condition 4.

<Fig. 10>

Since the agent of the fifth record, E, is different from that of the previous tree (A-B-F-C), it is classified as Others by step 1 in Figure 2 and is regarded as belonging to a new user. Then, the path is completed by connecting E to the index page along the shortest path (by Path completion in Figure 2) as follows.

<Fig. 11>

Since the agent of the sixth record, G, is identical to that of A-B-F-C and the referrer of G is F, the tree is expanded as shown in Figure 12 (by step 2 in Figure 2).

<Fig. 12>

However, A-B-F-C-G is not considered as a single-user navigational path by the proposed algorithm since the tree satisfies condition 4. Therefore, step 3 in Figure 2 is carried out by separating C from A-B-F-C-G and completing the path from A to C according to the referrer of C as follows.

<Fig. 13>

The seventh and the eighth records, D and H, respectively, are assigned to trees according to the agent and referrer of the records in step 2 in Figure 2. As shown in Figure 14, the proposed algorithm correctly identifies the three users and their navigational patterns.

<Fig. 14>

3. Performance Evaluation

Experiments were conducted to evaluate the algorithm proposed in Section 2. An experimental homepage was established on a web server and participants navigated the website through a proxy server. Then, the performance of the proposed algorithm was evaluated by analyzing the experimental results.

3.1 Experimental Environment

A real website of a company was used as the experimental website. The website is composed of fifty pages with the structure shown in Figure 15.

<Fig. 15>

IIS (Internet Information Server, Microsoft) and Midpoint [4] were used to host the web site and the proxy server, respectively. The constructed experimental environment is shown in Figure 16.

<Fig. 16>

The experiments were conducted using five groups of two users and another five groups of three users. Experiments were conducted twice for each group. That is, in the first experiment, each group was asked to navigate about ten pages (8 to 12 pages), and in the second experiment, about fifteen pages (13 to 17 pages). This results in ten samples from ten-page navigation tasks and another ten samples from fifteen-page navigation tasks.

In the experiments with two users, each computer had the same operating system and browser. That is, the users had identical agent logs. However, in the experiments with three users, two computers had the same operating system and browser but the third used a different browser to reflect the different market shares of the browsers. In 1999, Explorer and Netscape were used by 68.57% and 29.53% of the users, respectively, while others accounted for less than 1.9% [5], which indicates that about two thirds of the users used Explorer while one third used Netscape. Therefore, the experimental conditions were set up such that two users had identical agent logs while the third user had a different one.

3.2 Experimental Results

Correct and incorrect page restoration ratios were used to evaluate the proposed algorithm. The correct page restoration ratio is a relative measure of how well the navigated path was restored by the proposed algorithm, and the incorrect page restoration ratio is a relative measure of mistakenly assigned pages. For example, assume that, for an actual navigational path of eight pages that includes C-C1, the following path of seven pages was restored using the proposed algorithm.

<Fig. 17>

Since five pages of the restored path, including C-C1, were correctly restored, the correct page restoration ratio is

\[
\frac{\text{no. of correctly identified pages}}{\text{total no. of pages in the navigational path}} \times 100 = \frac{5}{8} \times 100 = 62.5\ (\%) \tag{1}
\]

Since the path D-D1 is mistakenly restored, the incorrect page restoration ratio is

\[
\frac{\text{total no. of pages not in the navigational path}}{\text{total no. of pages in the identified path}} \times 100 = \frac{2}{7} \times 100 = 28.6\ (\%) \tag{2}
\]
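The two ratios in equations (1) and (2) can be computed as in the sketch below. It assumes that each page occurs at most once in a path, so the paths can be compared as sets. The labels `p1` to `p6` are hypothetical placeholders for pages whose names are not recoverable here; only the counts (eight navigated pages, seven restored pages, five correct, two incorrect) follow the example above. The function name `restoration_ratios` is likewise hypothetical.

```python
from typing import Sequence, Tuple


def restoration_ratios(actual: Sequence[str], restored: Sequence[str]) -> Tuple[float, float]:
    """Return (correct ratio, incorrect ratio) in percent, as in equations (1) and (2)."""
    actual_pages = set(actual)
    restored_pages = set(restored)
    correct = len(actual_pages & restored_pages)   # restored pages that were really visited
    wrong = len(restored_pages - actual_pages)     # restored pages that were never visited
    return 100.0 * correct / len(actual_pages), 100.0 * wrong / len(restored_pages)


# Hypothetical page labels; only the counts match the worked example.
actual = ["p1", "p2", "p3", "p4", "p5", "p6", "C", "C1"]
restored = ["p1", "p2", "p3", "C", "C1", "D", "D1"]
correct_ratio, incorrect_ratio = restoration_ratios(actual, restored)
print(round(correct_ratio, 1), round(incorrect_ratio, 1))
```

Note that the two ratios use different denominators: the correct ratio is relative to the navigated path, while the incorrect ratio is relative to the restored path.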
The experimental results are shown in Table 1.

<Table 1>

4. Conclusion

An algorithm is proposed for identifying the navigational paths of users who are routed through proxy servers, and it is evaluated by conducting experiments. The experimental results show a correct page restoration ratio of 78% and an incorrect page restoration ratio of 4.1% on average, which indicates that the proposed algorithm can be used as a reasonable tool for identifying the navigational paths of users routed through proxy servers. Future work will include further tests to validate the algorithm as well as its improvement using information from cookies.

References

[1] Cooley R, Mobasher B, and Srivastava J (1999) Data preparation for mining world wide web browsing patterns. Knowledge and Information Systems, 1(1),
[2] Pitkow J (1997) In search of reliable usage data on the WWW. Computer Networks and ISDN Systems, 20,
[3]
[4]
[5]
Figure Captions

Figure 1 Stages of the proposed algorithm
Figure 2 Flowchart for Stage 2
Figure 3 Example of step 2 in Figure 2
Figure 4 Example of step 2 in Figure 2
Figure 5 Example of step 3 in Figure 2
Figure 6 Example of step 7 in Figure 2
Figure 7 Example of Path completion
Figure 8 Example of Stage 3
Figure 9 Structure of the web site
Figure 10 Algorithm execution example (1)
Figure 11 Algorithm execution example (2)
Figure 12 Algorithm execution example (3)
Figure 13 Algorithm execution example (4)
Figure 14 Algorithm execution example (5)
Figure 15 Structure of the experimental website
Figure 16 Constructed experimental environment
Figure 17 Restoration example

Table Titles

Table 1 Experimental results
<Figure 1>
Stage 1: Sorting records with the same IP address in chronological order
Stage 2: Making a hierarchical tree based on access, agent, referrer log and site topology
- Construction of a tree
- Assignment of a page to a tree
- Splitting of the constructed tree if two-user navigational patterns are found
- Path completion
Stage 3: Assignment of a reserved branch to another tree or regarding it as a new user tree
- Assignment of a reserved branch which cannot be assigned in Stage 2
<Figure 2> (flowchart; input from Stage 1, output to Stage 3)
Start: Is the requested page the first record? Yes: step 9. No: step 1.
Step 1. Page assignment condition? Condition 1 or 2: step 2. Condition 3: step 4. Others: step 8. (See Fig. 3 and 4)
Step 2. Assignment of the page to a tree which meets condition 1 or 2. (See Fig. 3 and 4)
Step 3. Are two-user navigational patterns found (condition 4 satisfied)? Yes: splitting of the constructed tree into two trees. (See Fig. 5)
Step 4. Reservation of the assignment of the page.
Step 5. Is the request time of the reserved page or the first page of the split branch later than that of the last assigned page in each of the existing trees? No: step 8. (See Fig. 6)
Step 6. Is there only one tree which satisfies the above condition? Yes: step 7. No: reservation of assigning the reserved page or the split branch to another tree. (See Fig. 6)
Step 7. Reassignment of the reserved page or split branch to another tree. (See Fig. 6)
Step 8. Regarded as belonging to a new user, followed by path completion. (See Fig. 7)
Step 9. Construction of a tree with that page alone.
<Figure 3> (tree diagrams of Trees 1 and 2 before and after assignment)
Trees 1 and 2 have the same agent. New record: E. The referrer of the new record: C. E is assigned to Tree 1.

<Figure 4> (site topology diagram; tree diagrams of Trees 1 and 2 before and after assignment)
The agents of Trees 1 and 2 are different. New record: F. The referrer of the new record: D. The agent of the new record, F, is identical to that of Tree 1. Most recently assigned page to Tree 1: C.
<Figure 5> (tree diagrams; subscripts indicate the order of assignment: A1, B2, C3, D4, F5, E6)
- Index page: A
- The requested page E was assigned to the tree in step 2 in Fig. 2.
- The referrer of the requested page: C
- Page assigned to the tree just prior to the requested page: F
- Since the referrer C of the current page E does not belong to the path A-B-D-F, the branch C-E or D-F is taken off the tree.
- Consider the request time of the first page of each branch. Since the request time of D is later than that of C, D-F is taken off the tree.

<Figure 6> (tree diagrams: the tree and the split branch D3-F4 before reassignment; Tree 1 with D-F attached after reassignment)
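<Figure 7> Example of Path completion (tree diagrams: the new user tree D-F before and after path completion)

<Figure 8> Example of Stage 3 (tree diagrams: Trees 1 and 2 and the reserved branch E-G before and after assignment)

<Figure 9> Structure of the web site (site topology diagram)

<Figure 10> Algorithm execution example (1) (tree diagram)

<Figure 11> Algorithm execution example (2) (tree diagrams)

<Figure 12> Algorithm execution example (3) (tree diagram)

<Figure 13> Algorithm execution example (4) (tree diagrams)

<Figure 14> Algorithm execution example (5) (tree diagrams)

<Figure 15> Structure of the experimental website (site structure diagram with Home at the root; recoverable page labels include C, D, C1 to C5 and D1 to D4)

<Figure 16> Constructed experimental environment (diagram: user computers, proxy server (Midpoint), web server (IIS))

<Figure 17> Restoration example (navigated path and restored path; the restored path ends in C-C1-D-D1)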
<Table 1> Experimental results

Experiment                                  Correct restoration ratio   Incorrect restoration ratio
Two users visiting about ten pages          86.8%                       3.8%
Two users visiting about fifteen pages      79.3%                       0.9%
Three users visiting about ten pages        72.5%                       4.1%
Three users visiting about fifteen pages    73.4%                       7.5%
More informationUser Session Identification Using Enhanced Href Method
User Session Identification Using Enhanced Href Method Department of Computer Science, Constantine the Philosopher University in Nitra, Slovakia jkapusta@ukf.sk, psvec@ukf.sk, mmunk@ukf.sk, jskalka@ukf.sk
More informationCharacterizing Home Pages 1
Characterizing Home Pages 1 Xubin He and Qing Yang Dept. of Electrical and Computer Engineering University of Rhode Island Kingston, RI 881, USA Abstract Home pages are very important for any successful
More informationFeatures of a proxy server: - Nowadays, by using TCP/IP within local area networks, the relaying role that the proxy