A Survey in Web Page Clustering Techniques


Antonio LaTorre, José M. Peña, Víctor Robles, María S. Pérez
Department of Computer Architecture and Technology, Technical University of Madrid, Madrid, Spain
{atorre, jmpena, vrobles,

Abstract. Web page clustering is one of the major preprocessing steps in web mining analysis. In data mining, preprocessing is a key task to ensure the reliability and quality of the knowledge extracted by the whole mining process. Because the amount of data to process is potentially infinite when dynamic web pages are considered, this information must be preprocessed to keep the problem computationally tractable. Moreover, analyzing pages individually may not provide additional information. To deal with this issue, this contribution proposes the use of web clustering techniques as a first step of the web usage mining process. We provide an overview of graph partitioning techniques and their application to the problem of web page clustering, considering the taxonomy and characteristics of different web site structures and features (e.g. frames, dynamic code, CMS-based sites). This study shows that the benefit of using these techniques depends considerably on the input data and on users' navigational habits.

Keywords. Web usage/structure mining, Web page clustering, Graph partitioning.

1 Introduction

Web page clustering is one of the most important preprocessing steps in web mining analysis. In this context (Web Usage/Context Mining) the items to be studied are web pages. Web page clustering puts web pages together in groups, based on similarity or other relationship measures. Tightly coupled pages, that is, pages in the same cluster, are treated as single items in the subsequent data analysis steps.
A complete data mining analysis could be performed using web page information as it appears in web logs, but when the number of pages to take into account increases (e.g., in a large-scale corporate web server or a server using dynamic web pages) this process becomes very hard or even infeasible. Web page clustering appears as a reasonable solution to this issue. These techniques group pages together based on some kind of relationship measure, and the pages in the same cluster are considered as a single item in further data analysis steps.

Recently, a number of approaches have been developed dealing with specific aspects of Web usage mining for the purpose of automatically discovering user profiles. For example, Perkowitz and Etzioni [18] proposed the idea of optimizing the structure of Web sites based on co-occurrence patterns of pages within usage data for the site. Schechter et al [20] have developed techniques for using path profiles of users to predict future HTTP requests, which can be used for network and proxy caching. Spiliopoulou et al [22], Cooley et al [19], and Buchner and Mulvenna [5] have applied data mining techniques to extract usage patterns from Web logs, for the purpose of deriving marketing intelligence. Shahabi et al [8], Yan et al [24], and Nasraoui et al [17] have proposed clustering of user sessions to predict future user behavior.

Several different techniques will be discussed in the next section, but in this study we have considered the graph partitioning approach. These techniques are based on the concept of weighted graphs. B. Hendrickson and R. Leland [12] define the graph partitioning problem as:

Given: a graph G = (V, E) with (possibly unitary) weights on the edges and/or vertices, and a parameter p.
Find: a partitioning of the vertices of G into p sets in such a way that the sums of the vertex weights in each set are as equal as possible, and the sum of the weights of edges crossing between sets is minimized.

Many techniques and algorithms have been proposed, using different similarity measures and strategies that have been shown to influence the quality of the results obtained [23]. Some of them have been implemented by B. Hendrickson and R. Leland. The most important are the inertial method, spectral partitioning, the Kernighan-Lin method and a multilevel variant of the Kernighan-Lin algorithm. A detailed description of them can be found in [13]. Other approaches to graph partitioning, like the Expectation-Maximization algorithm or the one proposed by Peña et al. [16], which uses optimization methods to obtain the best collection of clusters according to a cluster quality metric, are out of the scope of this study.

The main motivation for this contribution has already been mentioned: the need for techniques able to treat, in a reasonable way, the huge amount of data coming from web server logs.
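The two quantities in the graph partitioning definition quoted above, the weight of edges crossing between sets and the balance of vertex weights, can be made concrete with a small sketch. The toy graph, the function names `edge_cut` and `set_weights`, and the dictionary-based representation are illustrative assumptions and are not taken from [12] or [13].

```python
def edge_cut(edges, part):
    """Sum of the weights of edges whose endpoints lie in different sets."""
    return sum(w for (u, v), w in edges.items() if part[u] != part[v])


def set_weights(vertex_weights, part, p):
    """Total vertex weight assigned to each of the p sets (ids 0..p-1)."""
    totals = [0] * p
    for v, w in vertex_weights.items():
        totals[part[v]] += w
    return totals


# Hypothetical toy graph: two triangles joined by one light edge.
edges = {
    ("a", "b"): 3, ("b", "c"): 3, ("a", "c"): 3,
    ("d", "e"): 3, ("e", "f"): 3, ("d", "f"): 3,
    ("c", "d"): 1,
}
vertex_weights = {v: 1 for v in "abcdef"}  # unitary vertex weights

# Two balanced candidate partitions into p = 2 sets.
good = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
bad = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0, "f": 1}

print(edge_cut(edges, good), set_weights(vertex_weights, good, 2))
print(edge_cut(edges, bad), set_weights(vertex_weights, bad, 2))
```

Both candidate partitions are perfectly balanced, so under the definition the one with the smaller edge cut is preferred. A partitioning heuristic searches for such assignments without enumerating all of them.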
Since implementations of some of these techniques already exist, we wanted to test them with real data in order to draw some preliminary conclusions about their effect on the web mining result. The following section gives an overview of the most important page clustering techniques. Section 3 presents the study we have performed, while section 4 presents the main results obtained. Section 5 discusses these results and section 6 presents the main conclusions.

2 Most important Web Page Clustering techniques

Web page clustering takes the set of web pages hosted on a web server and produces a collection of web page sets (clusters). These clusters are used in the following steps of the mining process instead of the original pages. There are three web clustering criteria: semantic, structure based, and usage based.

2.1 Semantic Clustering

Cooley [9] suggests using semantic hints to benefit the mining of web data, arguing that several analyses cannot be performed without additional meta-information on structure.

Semantic web page clustering is based on the concept of web page hierarchies. The lowest-level leaves in these hierarchies are web pages, which are grouped into higher-level nodes based on semantic affinities. For example, product web pages are clustered into several product families that are later grouped into a cluster for all products; other clusters, for corporate or support information, can also be defined. Semantic hierarchies can be defined following many different criteria, depending on the objectives and strategies of the analysis, and hence many different collections of clusters can be produced. These web page clustering techniques require, in any case, some domain information, either from domain experts or retrieved from a semantic repository. In the latter case there is a range of possible paths, from META-like information provided in the page contents to Semantic Web principles, including CMS-based web sites.

2.2 Graph Partitioning for Web Page Clustering

Structure-based and usage-based page clustering are very similar. Both approaches build a web page graph, in which nodes are the different web pages and arcs are the links among these pages. These links can be defined by the actual web links, when only web structure is considered, or they may be weighted by the usage of these transitions. In the latter case, the web log file is scanned to analyze the frequency of the transitions. In all these cases the web clustering problem is translated into what is called graph partitioning.

The graph partitioning problem is NP-hard, and it remains NP-hard even when the number of subsets is 2 or when some imbalance is allowed [6]. For large graphs (with more than 100 vertices), heuristic algorithms that find suboptimal solutions are the only viable option.
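The usage-weighted page graph described in this section can be sketched as follows. This is a minimal illustration, assuming that sessions are already available as ordered lists of page URLs and that transitions are treated as undirected edges (the usual input for graph partitioners); the function name `build_usage_graph` and the sample sessions are hypothetical.

```python
from collections import Counter


def build_usage_graph(sessions):
    """Return {(u, v): weight}, one undirected edge per pair of pages
    visited consecutively, weighted by how often the transition occurs."""
    weights = Counter()
    for session in sessions:
        for src, dst in zip(session, session[1:]):
            if src != dst:  # ignore reloads of the same page
                weights[tuple(sorted((src, dst)))] += 1
    return dict(weights)


# Hypothetical sessions, each an ordered list of requested pages.
sessions = [
    ["/", "/courses", "/courses/math"],
    ["/", "/courses"],
    ["/courses", "/"],
    ["/news"],  # a single-page visit adds no edge
]
graph = build_usage_graph(sessions)
print(graph)
```

Note that single-page sessions contribute no transitions, so a log dominated by such visits yields a very sparse graph.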
The proposed strategies can be classified into combinatorial approaches [15, 10], approaches based on geometric representations [21, 11], multilevel schemes [25, 14], evolutionary optimization [3] and genetic algorithms [7]. There are also hybrid schemes [2] that combine different approaches. There are many graph partitioning algorithms and, as we cannot describe every single one, we have selected those that we consider most relevant. We start with the four graph partitioning heuristics used in this study and then briefly describe other clustering algorithms we find interesting.

Simple partitioning methods. These techniques are intended to generate initial partition sets to be refined by a local optimization strategy. The quality of these initial sets is very problem dependent, but they are often surprisingly good because data locality is often implicit in the vertex numbering [13]. In this category we find three main methods: (i) the linear scheme, which assigns vertices to sets in order (if we have n vertices and p sets, the first n/p vertices are assigned to set 1, and so on), (ii) the random scheme, where vertices are randomly assigned to sets while preserving balance, and (iii) the scattered method, where vertices are processed in order and the next vertex is assigned to the smallest set.

Spectral partitioning. Spectral methods use eigenvectors of a matrix constructed from the graph to decide how to partition the graph. The connection between eigenvectors and partitions may seem surprising, but it has been shown that these techniques are quite good at finding the right general area of the graph where cuts should be made. However, they often perform poorly on fine details. It is therefore advisable to use a local refinement algorithm to improve their results, such as the generalized Kernighan-Lin algorithm described next.

Kernighan-Lin. This is one of the most popular graph partitioning techniques. As it is quite old (it dates from the 1970s), many extensions and improvements have been proposed. The linear-time implementation by Fiduccia and Mattheyses is probably the best known of these improvements and is often credited as the original algorithm [13]. KL usually does not find good partitions unless it is given a good initial one. This is why it is generally used as a local optimization technique, a role in which it performs considerably better.

Multilevel-KL. This method was originally described in [12]. It is the most suitable option for very large problems where high quality partitions are needed. It works by creating a sequence of increasingly smaller graphs approximating the original one, partitioning the smallest graph, and projecting this partition back through the intermediate levels. Kernighan-Lin is invoked every few levels of projection to refine the partition [13].

3 Experimentation

To evaluate these different page clustering techniques we have selected a very well-known web mining problem, web page associations. Figure 1 shows the overall experimental scenario. Web page association analysis tries to extract the sets of pages that usually appear together in the same web session. Sessions are extracted from the web log by using a referrer-based technique [4]. The web page association analysis could be done at the web page level, but that would lead to a large set of association rules with very low support. Such a model is, in most cases, useless. Instead, we have performed a preprocessing step consisting of web page clustering.
We construct a graph from the session information, and this graph is partitioned by applying the algorithms described in the previous section. This is the step that we want to study, and therefore these are the algorithms that have been evaluated. The final result of the web page clustering step is a set of clusters, each containing the pages that belong to it. With this information, the original sessions are rewritten by translating each page into the cluster it belongs to. These rewritten sessions are then processed by the Apriori algorithm [1] to extract common associations.

The data sets we have used come from the log files of Universia. Universia is a community portal about Spanish-speaking higher education that involves 379 universities world-wide. It contains information about education, organized in different groups depending on the user profile. We have used the log files from four different days. The total size of the log files is 3 GBytes and they contain a total of user clicks in approx different pages.
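The session rewriting and association steps described above can be illustrated with a small sketch. It assumes a page-to-cluster mapping is already available from the partitioning step, and it counts itemset support by brute force rather than with the candidate pruning of Apriori [1]; all names (`rewrite_sessions`, `itemset_support`) and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def rewrite_sessions(sessions, page_to_cluster):
    """Replace each page by its cluster id; a session becomes a set of clusters."""
    return [frozenset(page_to_cluster[page] for page in s) for s in sessions]


def itemset_support(transactions, max_size=2):
    """Fraction of transactions containing each itemset of up to max_size items.

    Brute-force counting for illustration only; Apriori prunes candidates
    instead of enumerating every combination.
    """
    counts = Counter()
    for t in transactions:
        for k in range(1, max_size + 1):
            for items in combinations(sorted(t), k):
                counts[items] += 1
    n = len(transactions)
    return {items: c / n for items, c in counts.items()}


# Hypothetical clustering result and sessions.
page_to_cluster = {"/": 0, "/courses": 0, "/courses/math": 1, "/news": 1}
sessions = [
    ["/", "/courses"],
    ["/"],
    ["/", "/courses/math"],
    ["/", "/news"],
]
support = itemset_support(rewrite_sessions(sessions, page_to_cluster))
print(support)
```

In this toy input every session touches cluster 0, so the itemset containing only that cluster has support 1.0, while the pair of clusters appears in half of the sessions.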

Fig. 1. Experimental scenario

4 Results

This section presents the results of our experiment. Table 1 shows the number of discovered association rules and their support values for the clustered web page sets obtained with the graph partitioning algorithms described earlier in this paper. These results are not as satisfactory as we initially expected. The number of rules discovered using clustered data is even lower than the number obtained with raw data from the web logs. Most of the algorithms give only a single rule which, as we will discuss later, is the obvious one in every web usage analysis. The support of these rules is always 100%, as it represents the arrival of users at the web site. We do not find important differences between algorithms with and without local improvement. This can only be explained by the kind of data we are working with. In the next section we discuss why these results are not as good as expected and try to find an explanation for this behavior.

Algorithm        Local Improvement   Rules Discovered   Best Support   Mean Support
None             No                  4                  100%           29%
Linear           No                  1                  100%           100%
Linear           K-L                 1                  100%           100%
Spectral         No                  1                  100%           100%
Spectral         K-L                 1                  100%           100%
Random           No                  2                  100%           57.65%
Random           K-L                 1                  100%           100%
Scattered        No                  2                  100%           59.2%
Scattered        K-L                 1                  100%           100%
Multilevel K-L   No                  1                  100%           100%

Table 1. Clustering algorithms applied and rules discovered
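The Linear, Random and Scattered rows of Table 1 correspond to the simple partitioning schemes of Section 2.2. A minimal sketch of the three schemes follows, assuming vertices are numbered 0..n-1 and sets 0..p-1. The block boundaries, the tie-breaking rule and the function names are assumptions made for illustration and may differ from the implementation in [13].

```python
import random


def linear_partition(n, p):
    """Assign consecutive blocks of roughly n/p vertices to sets 0, 1, ..., p-1."""
    return [i * p // n for i in range(n)]


def random_partition(n, p, seed=0):
    """Shuffle a balanced label list, so set sizes differ by at most one."""
    labels = [i % p for i in range(n)]
    random.Random(seed).shuffle(labels)
    return labels


def scattered_partition(n, p):
    """Process vertices in order, adding each one to the currently smallest set."""
    sizes = [0] * p
    labels = []
    for _ in range(n):
        target = sizes.index(min(sizes))  # ties go to the lowest-numbered set
        labels.append(target)
        sizes[target] += 1
    return labels


print(linear_partition(7, 3))
print(scattered_partition(7, 3))
print(random_partition(7, 3))
```

Each function returns a list whose i-th entry is the set assigned to vertex i. Such a list can serve as the initial partition handed to a local refinement step like Kernighan-Lin.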

5 Discussion

With this study we wanted to show that a preliminary preprocessing step is necessary in the field of web usage mining. The huge amount of data to process makes it impossible to use all the information in web logs, so it seems very useful to group related data together. We have found that, even with this preprocessing step, the results obtained were not as satisfactory as one might expect.

The reason for these poor results becomes clear after a deeper analysis of the input data. We have observed that most visits to this web site come directly from popular search engines (Google and Yahoo, for example) and that most of them are single-page visits: the user searches for the information needed, accesses it and then leaves the site. This makes it impossible to obtain relevant results, and it could explain why the Apriori algorithm performed worse when web partitioning techniques were applied. These algorithms group tightly coupled pages together, which means that, if most people access the same pages (the main page, for example) and then leave, those accesses fall in the same cluster and the other clusters become irrelevant. All the information appears in one cluster, and the only rule that can be inferred is that every single visit to the site accesses one page of the main cluster (we call main cluster the one with the single-page visits).

It would be interesting to study other sites to find out whether this is the general situation in web servers world-wide or a particular case of Universia. Research on isolated web sites, such as corporate sites that can only be reached from inside the institution, would be interesting too.

6 Conclusions

In this paper we have presented the importance of a preliminary step of data preprocessing in every data mining process.
In this particular case, we have used graph partitioning techniques to speed up the process and to obtain better association rules by using only relevant information. We have then presented some of the most important graph partitioning algorithms and the experimental scenario we have worked with. The results of this experiment show that the improvement introduced by this preprocessing step depends dramatically on the quality of the input data. In this case, our study led us to the conclusion that our input data was not good enough. However, we cannot hold Universia responsible for this problem, as we have said before: search engines and users' navigational habits have much to do with these results. For future research, it would be interesting to develop the two lines mentioned before, in order to see whether this is the normal way web sites behave and nothing can be done in this respect or whether, on the contrary, there are situations where this preprocessing step would be useful.

7 Acknowledgements

We would like to give thanks for the possibility of working with their log files.

References

1. Rakesh Agrawal and Ramakrishnan Srikant. Fast algorithms for mining association rules. In Proc. 20th Int. Conf. Very Large Data Bases, VLDB.
2. R. Baños, C. Gil, J. Ortega, and F.G. Montoya. Multilevel heuristic algorithm for graph partitioning. In Proceedings of the 3rd European Workshop on Evolutionary Computation in Combinatorial Optimization. LNCS 2611.
3. R. Baños, C. Gil, J. Ortega, and F.G. Montoya. Partición de grafos mediante optimización evolutiva paralela. In Proceedings de las XIV Jornadas de Paralelismo.
4. B. Berendt, B. Mobasher, M. Spiliopoulou, and J. Wiltshire. Measuring the accuracy of sessionizers for web usage analysis. In Workshop on Web Mining at the First SIAM International Conference on Data Mining, pages 7-14.
5. A. Buchner and M. D. Mulvenna. Discovering internet marketing intelligence through online analytical Web usage mining. SIGMOD Record, 4(27).
6. T.N. Bui and C. Jones. Finding good approximate vertex and edge partitions is NP-hard. Information Processing Letters, 42.
7. T.N. Bui and B. Moon. Genetic algorithms and graph partitioning. IEEE Transactions on Computers, 45(7).
8. C. Shahabi, A. M. Zarkesh, J. Adibi, and V. Shah. Knowledge discovery from users Web-page navigation. In Workshop on Research Issues in Data Engineering, Birmingham, England.
9. Robert Cooley. The use of web structure and content to identify subjectively interesting web usage patterns. ACM Transactions on Internet Technology, 3(2).
10. C. Fiduccia and R. Mattheyses. A linear time heuristic for improving network partitions. In Proceedings of the 19th IEEE Design Automation Conference.
11. J. Gilbert, G. Miller, and S. Teng. Geometric mesh partitioning: Implementation and experiments. In Proceedings of the 9th International Parallel Processing Symposium.
12. B. Hendrickson and R. Leland. A multilevel algorithm for partitioning graphs. In Proc. Supercomputing '95, December.
13. B. Hendrickson and R. Leland. The Chaco User's Guide Version 2.0. pages 1-44.
14. G. Karypis and V. Kumar. Multilevel K-way partitioning scheme for irregular graphs. Journal of Parallel and Distributed Computing, 48(1):96-129.
15. B.W. Kernighan and S. Lin. An efficient heuristic procedure for partitioning graphs. The Bell System Technical Journal.
16. Jose M. Peña, Victor Robles, Oscar Marban, and Maria S. Pérez. Bayesian methods to estimate future load in web farms. In AWIC 2004.
17. O. Nasraoui, H. Frigui, A. Joshi, and R. Krishnapuram. Mining Web access logs using relational competitive fuzzy clustering. In Eighth International Fuzzy Systems Association World Congress, August.
18. M. Perkowitz and O. Etzioni. Adaptive Web sites: automatically synthesizing Web pages. In Fifteenth National Conference on Artificial Intelligence, Madison, WI.
19. R. Cooley, B. Mobasher, and J. Srivastava. Data preparation for mining World Wide Web browsing patterns. Journal of Knowledge and Information Systems, 1(1).
20. S. Schechter, M. Krishnan, and M. D. Smith. Using path profiles to predict HTTP requests. In 7th International World Wide Web Conference, Brisbane, Australia.
21. H.D. Simon and S. Teng. How good is recursive bisection? SIAM Journal of Scientific Computing, 18(5), 1997.
22. M. Spiliopoulou and L. C. Faulstich. WUM: A Web Utilization Miner. In EDBT Workshop WebDB98, Valencia, Spain. LNCS 1590, Springer Verlag.
23. A. Strehl, J. Ghosh, and R. Mooney. Impact of Similarity Measures on Web-Page Clustering. In Workshop for Artificial Intelligence for Web Search, July.
24. T. Yan, M. Jacobsen, H. Garcia-Molina, and U. Dayal. From user access patterns to dynamic hypertext linking. In 5th International World Wide Web Conference, Paris, France.
25. C. Walshaw and M. Cross. Mesh partitioning: a multilevel balancing and refinement algorithm. SIAM Journal of Science Computation, 22(1):63-80, 2000.


More information

PROBLEM STATEMENTS. Dr. Suresh Jain 2 2 Department of Computer Engineering, Institute of

PROBLEM STATEMENTS. Dr. Suresh Jain 2 2 Department of Computer Engineering, Institute of Efficient Web Log Mining using Doubly Linked Tree Ratnesh Kumar Jain 1, Dr. R. S. Kasana 1 1 Department of Computer Science & Applications, Dr. H. S. Gour, University, Sagar, MP (India) jratnesh@rediffmail.com,

More information

Heuristic Graph Bisection with Less Restrictive Balance Constraints

Heuristic Graph Bisection with Less Restrictive Balance Constraints Heuristic Graph Bisection with Less Restrictive Balance Constraints Stefan Schamberger Fakultät für Elektrotechnik, Informatik und Mathematik Universität Paderborn Fürstenallee 11, D-33102 Paderborn schaum@uni-paderborn.de

More information

TERM BASED WEIGHT MEASURE FOR INFORMATION FILTERING IN SEARCH ENGINES

TERM BASED WEIGHT MEASURE FOR INFORMATION FILTERING IN SEARCH ENGINES TERM BASED WEIGHT MEASURE FOR INFORMATION FILTERING IN SEARCH ENGINES Mu. Annalakshmi Research Scholar, Department of Computer Science, Alagappa University, Karaikudi. annalakshmi_mu@yahoo.co.in Dr. A.

More information

Outlier Detection Using Unsupervised and Semi-Supervised Technique on High Dimensional Data

Outlier Detection Using Unsupervised and Semi-Supervised Technique on High Dimensional Data Outlier Detection Using Unsupervised and Semi-Supervised Technique on High Dimensional Data Ms. Gayatri Attarde 1, Prof. Aarti Deshpande 2 M. E Student, Department of Computer Engineering, GHRCCEM, University

More information

Web Usage Mining: How to Efficiently Manage New Transactions and New Clients

Web Usage Mining: How to Efficiently Manage New Transactions and New Clients Web Usage Mining: How to Efficiently Manage New Transactions and New Clients F. Masseglia 1,2, P. Poncelet 2, and M. Teisseire 2 1 Laboratoire PRiSM, Univ. de Versailles, 45 Avenue des Etats-Unis, 78035

More information

A Modified Inertial Method for Loop-free Decomposition of Acyclic Directed Graphs

A Modified Inertial Method for Loop-free Decomposition of Acyclic Directed Graphs MACRo 2015-5 th International Conference on Recent Achievements in Mechatronics, Automation, Computer Science and Robotics A Modified Inertial Method for Loop-free Decomposition of Acyclic Directed Graphs

More information

Using Markov Models to define proactive action plans for users at multi-viewpoint websites

Using Markov Models to define proactive action plans for users at multi-viewpoint websites Using Markov Models to define proactive action plans for users at multi-viewpoint websites E. Menasalvas 1, S. Millán 2, P. Gonzalez 1 1 Facultad de Informática UPM. Madrid Spain 2 Universidad del Valle.

More information

Web Mining Using Cloud Computing Technology

Web Mining Using Cloud Computing Technology International Journal of Scientific Research in Computer Science and Engineering Review Paper Volume-3, Issue-2 ISSN: 2320-7639 Web Mining Using Cloud Computing Technology Rajesh Shah 1 * and Suresh Jain

More information

MODELLING DOCUMENT CATEGORIES BY EVOLUTIONARY LEARNING OF TEXT CENTROIDS

MODELLING DOCUMENT CATEGORIES BY EVOLUTIONARY LEARNING OF TEXT CENTROIDS MODELLING DOCUMENT CATEGORIES BY EVOLUTIONARY LEARNING OF TEXT CENTROIDS J.I. Serrano M.D. Del Castillo Instituto de Automática Industrial CSIC. Ctra. Campo Real km.0 200. La Poveda. Arganda del Rey. 28500

More information

I. Introduction II. Keywords- Pre-processing, Cleaning, Null Values, Webmining, logs

I. Introduction II. Keywords- Pre-processing, Cleaning, Null Values, Webmining, logs ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: An Enhanced Pre-Processing Research Framework for Web Log Data

More information

The influence of caching on web usage mining

The influence of caching on web usage mining The influence of caching on web usage mining J. Huysmans 1, B. Baesens 1,2 & J. Vanthienen 1 1 Department of Applied Economic Sciences, K.U.Leuven, Belgium 2 School of Management, University of Southampton,

More information

SEQUENTIAL PATTERN MINING FROM WEB LOG DATA

SEQUENTIAL PATTERN MINING FROM WEB LOG DATA SEQUENTIAL PATTERN MINING FROM WEB LOG DATA Rajashree Shettar 1 1 Associate Professor, Department of Computer Science, R. V College of Engineering, Karnataka, India, rajashreeshettar@rvce.edu.in Abstract

More information

CS 140: Sparse Matrix-Vector Multiplication and Graph Partitioning

CS 140: Sparse Matrix-Vector Multiplication and Graph Partitioning CS 140: Sparse Matrix-Vector Multiplication and Graph Partitioning Parallel sparse matrix-vector product Lay out matrix and vectors by rows y(i) = sum(a(i,j)*x(j)) Only compute terms with A(i,j) 0 P0 P1

More information

A Website Mining Model Centered on User Queries

A Website Mining Model Centered on User Queries A Website Mining Model Centered on User Queries Ricardo Baeza-Yates 1, 3, 2 and Barbara Poblete 2, 3 1 ICREA, Barcelona, Catalunya, Spain 2 Center for Web Research, CS Dept., University of Chile 3 Web

More information

Efficient Programming of Nanowire-based Sublithographic PLAs: A Multilevel Algorithm for Partitioning Graphs

Efficient Programming of Nanowire-based Sublithographic PLAs: A Multilevel Algorithm for Partitioning Graphs Efficient Programming of Nanowire-based Sublithographic PLAs: A Multilevel Algorithm for Partitioning Graphs Vivek Rajkumar (University of Washington CSE) Contact: California Institute

More information

IJMIE Volume 2, Issue 9 ISSN:

IJMIE Volume 2, Issue 9 ISSN: WEB USAGE MINING: LEARNER CENTRIC APPROACH FOR E-BUSINESS APPLICATIONS B. NAVEENA DEVI* Abstract Emerging of web has put forward a great deal of challenges to web researchers for web based information

More information

On-line Generation of Suggestions for Web Users

On-line Generation of Suggestions for Web Users On-line Generation of Suggestions for Web Users Fabrizio Silvestri Istituto ISTI - CNR Pisa Italy Ranieri Baraglia Istituto ISTI - CNR Pisa Italy Paolo Palmerini Istituto ISTI - CNR Pisa - Italy {fabrizio.silvestri,ranieri.baraglia,paolo.palmerini}@isti.cnr.it

More information

IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, ISSN:

IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, ISSN: IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, 20131 Improve Search Engine Relevance with Filter session Addlin Shinney R 1, Saravana Kumar T

More information

A New Web Usage Mining Approach for Website Recommendations Using Concept Hierarchy and Website Graph

A New Web Usage Mining Approach for Website Recommendations Using Concept Hierarchy and Website Graph A New Web Usage Mining Approach for Website Recommendations Using Concept Hierarchy and Website Graph T. Vijaya Kumar, H. S. Guruprasad, Bharath Kumar K. M., Irfan Baig, and Kiran Babu S. Abstract To have

More information

A Recursive Coalescing Method for Bisecting Graphs

A Recursive Coalescing Method for Bisecting Graphs A Recursive Coalescing Method for Bisecting Graphs The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters. Citation Accessed Citable

More information

Domain Specific Search Engine for Students

Domain Specific Search Engine for Students Domain Specific Search Engine for Students Domain Specific Search Engine for Students Wai Yuen Tang The Department of Computer Science City University of Hong Kong, Hong Kong wytang@cs.cityu.edu.hk Lam

More information

AN ALGORITHMIC APPROACH TO DATA PREPROCESSING IN WEB USAGE MINING

AN ALGORITHMIC APPROACH TO DATA PREPROCESSING IN WEB USAGE MINING International Journal of Information Technology and Knowledge Management July-December 2010, Volume 2, No. 2, pp. 279-283 AN ALGORITHMIC APPROACH TO DATA PREPROCESSING IN WEB USAGE MINING Navin Kumar Tyagi

More information

Fuzzy Cognitive Maps application for Webmining

Fuzzy Cognitive Maps application for Webmining Fuzzy Cognitive Maps application for Webmining Andreas Kakolyris Dept. Computer Science, University of Ioannina Greece, csst9942@otenet.gr George Stylios Dept. of Communications, Informatics and Management,

More information

Transforming Quantitative Transactional Databases into Binary Tables for Association Rule Mining Using the Apriori Algorithm

Transforming Quantitative Transactional Databases into Binary Tables for Association Rule Mining Using the Apriori Algorithm Transforming Quantitative Transactional Databases into Binary Tables for Association Rule Mining Using the Apriori Algorithm Expert Systems: Final (Research Paper) Project Daniel Josiah-Akintonde December

More information

An Improvement of Centroid-Based Classification Algorithm for Text Classification

An Improvement of Centroid-Based Classification Algorithm for Text Classification An Improvement of Centroid-Based Classification Algorithm for Text Classification Zehra Cataltepe, Eser Aygun Istanbul Technical Un. Computer Engineering Dept. Ayazaga, Sariyer, Istanbul, Turkey cataltepe@itu.edu.tr,

More information

Keywords Data alignment, Data annotation, Web database, Search Result Record

Keywords Data alignment, Data annotation, Web database, Search Result Record Volume 5, Issue 8, August 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Annotating Web

More information

Multilevel k-way Hypergraph Partitioning

Multilevel k-way Hypergraph Partitioning _ Multilevel k-way Hypergraph Partitioning George Karypis and Vipin Kumar fkarypis, kumarg@cs.umn.edu Department of Computer Science & Engineering, University of Minnesota, Minneapolis, MN 55455 Abstract

More information

Genetic Programming for Data Classification: Partitioning the Search Space

Genetic Programming for Data Classification: Partitioning the Search Space Genetic Programming for Data Classification: Partitioning the Search Space Jeroen Eggermont jeggermo@liacs.nl Joost N. Kok joost@liacs.nl Walter A. Kosters kosters@liacs.nl ABSTRACT When Genetic Programming

More information

A Hierarchical Document Clustering Approach with Frequent Itemsets

A Hierarchical Document Clustering Approach with Frequent Itemsets A Hierarchical Document Clustering Approach with Frequent Itemsets Cheng-Jhe Lee, Chiun-Chieh Hsu, and Da-Ren Chen Abstract In order to effectively retrieve required information from the large amount of

More information

A Privacy Preserving Web Recommender System

A Privacy Preserving Web Recommender System A Privacy Preserving Web Recommender System Ranieri Baraglia Massimo Serrano Claudio Lucchese Universita Ca Foscari Venezia, Italy Salvatore Orlando Universita Ca Foscari Venezia, Italy Fabrizio Silvestri

More information

Log Information Mining Using Association Rules Technique: A Case Study Of Utusan Education Portal

Log Information Mining Using Association Rules Technique: A Case Study Of Utusan Education Portal Log Information Mining Using Association Rules Technique: A Case Study Of Utusan Education Portal Mohd Helmy Ab Wahab 1, Azizul Azhar Ramli 2, Nureize Arbaiy 3, Zurinah Suradi 4 1 Faculty of Electrical

More information

Multilevel Algorithms for Multi-Constraint Hypergraph Partitioning

Multilevel Algorithms for Multi-Constraint Hypergraph Partitioning Multilevel Algorithms for Multi-Constraint Hypergraph Partitioning George Karypis University of Minnesota, Department of Computer Science / Army HPC Research Center Minneapolis, MN 55455 Technical Report

More information

An Improved Apriori Algorithm for Association Rules

An Improved Apriori Algorithm for Association Rules Research article An Improved Apriori Algorithm for Association Rules Hassan M. Najadat 1, Mohammed Al-Maolegi 2, Bassam Arkok 3 Computer Science, Jordan University of Science and Technology, Irbid, Jordan

More information

Conceptual Classification to Improve a Web Site Content

Conceptual Classification to Improve a Web Site Content Conceptual Classification to Improve a Web Site Content Sebastián A. Ríos 1,JuanD.Velásquez 2, Hiroshi Yasuda 1,andTerumasaAoki 1 1 Applied Information Engineering Laboratory, University of Tokyo, Japan

More information

Effectively Capturing User Navigation Paths in the Web Using Web Server Logs

Effectively Capturing User Navigation Paths in the Web Using Web Server Logs Effectively Capturing User Navigation Paths in the Web Using Web Server Logs Amithalal Caldera and Yogesh Deshpande School of Computing and Information Technology, College of Science Technology and Engineering,

More information

A Survey on Moving Towards Frequent Pattern Growth for Infrequent Weighted Itemset Mining

A Survey on Moving Towards Frequent Pattern Growth for Infrequent Weighted Itemset Mining A Survey on Moving Towards Frequent Pattern Growth for Infrequent Weighted Itemset Mining Miss. Rituja M. Zagade Computer Engineering Department,JSPM,NTC RSSOER,Savitribai Phule Pune University Pune,India

More information

Text clustering based on a divide and merge strategy

Text clustering based on a divide and merge strategy Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 55 (2015 ) 825 832 Information Technology and Quantitative Management (ITQM 2015) Text clustering based on a divide and

More information

Shape Optimizing Load Balancing for Parallel Adaptive Numerical Simulations Using MPI

Shape Optimizing Load Balancing for Parallel Adaptive Numerical Simulations Using MPI Parallel Adaptive Institute of Theoretical Informatics Karlsruhe Institute of Technology (KIT) 10th DIMACS Challenge Workshop, Feb 13-14, 2012, Atlanta 1 Load Balancing by Repartitioning Application: Large

More information

Knowledge Discovery from Web Usage Data: Complete Preprocessing Methodology

Knowledge Discovery from Web Usage Data: Complete Preprocessing Methodology IJCSNS International Journal of Computer Science and Network Security, VOL.8 No.1, January 2008 179 Knowledge Discovery from Web Usage Data: Complete Preprocessing Methodology G T Raju 1 and P S Satyanarayana

More information

C-NBC: Neighborhood-Based Clustering with Constraints

C-NBC: Neighborhood-Based Clustering with Constraints C-NBC: Neighborhood-Based Clustering with Constraints Piotr Lasek Chair of Computer Science, University of Rzeszów ul. Prof. St. Pigonia 1, 35-310 Rzeszów, Poland lasek@ur.edu.pl Abstract. Clustering is

More information

An Accomplishment of Web Personalization Using Web Mining Techniques

An Accomplishment of Web Personalization Using Web Mining Techniques An Accomplishment of Web Personalization Using Web Mining Techniques 1 Y.Raju B.Prashanth Kumar 2 Dr.D.Suresh Babu 3 1 Gitanjali Eng.College, Hederabad, 2 AVN Inst Of Engg & Tech, Ibrahimpatnam, Hyderabad.

More information

Seminar on. A Coarse-Grain Parallel Formulation of Multilevel k-way Graph Partitioning Algorithm

Seminar on. A Coarse-Grain Parallel Formulation of Multilevel k-way Graph Partitioning Algorithm Seminar on A Coarse-Grain Parallel Formulation of Multilevel k-way Graph Partitioning Algorithm Mohammad Iftakher Uddin & Mohammad Mahfuzur Rahman Matrikel Nr: 9003357 Matrikel Nr : 9003358 Masters of

More information

Hierarchical Document Clustering

Hierarchical Document Clustering Hierarchical Document Clustering Benjamin C. M. Fung, Ke Wang, and Martin Ester, Simon Fraser University, Canada INTRODUCTION Document clustering is an automatic grouping of text documents into clusters

More information

A Memetic Heuristic for the Co-clustering Problem

A Memetic Heuristic for the Co-clustering Problem A Memetic Heuristic for the Co-clustering Problem Mohammad Khoshneshin 1, Mahtab Ghazizadeh 2, W. Nick Street 1, and Jeffrey W. Ohlmann 1 1 The University of Iowa, Iowa City IA 52242, USA {mohammad-khoshneshin,nick-street,jeffrey-ohlmann}@uiowa.edu

More information

Lecture 2 Wednesday, August 22, 2007

Lecture 2 Wednesday, August 22, 2007 CS 6604: Data Mining Fall 2007 Lecture 2 Wednesday, August 22, 2007 Lecture: Naren Ramakrishnan Scribe: Clifford Owens 1 Searching for Sets The canonical data mining problem is to search for frequent subsets

More information

[Gidhane* et al., 5(7): July, 2016] ISSN: IC Value: 3.00 Impact Factor: 4.116

[Gidhane* et al., 5(7): July, 2016] ISSN: IC Value: 3.00 Impact Factor: 4.116 IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY AN EFFICIENT APPROACH FOR TEXT MINING USING SIDE INFORMATION Kiran V. Gaidhane*, Prof. L. H. Patil, Prof. C. U. Chouhan DOI: 10.5281/zenodo.58632

More information

Parallel Algorithm for Multilevel Graph Partitioning and Sparse Matrix Ordering

Parallel Algorithm for Multilevel Graph Partitioning and Sparse Matrix Ordering Parallel Algorithm for Multilevel Graph Partitioning and Sparse Matrix Ordering George Karypis and Vipin Kumar Brian Shi CSci 8314 03/09/2017 Outline Introduction Graph Partitioning Problem Multilevel

More information

Mining Association Rules in Temporal Document Collections

Mining Association Rules in Temporal Document Collections Mining Association Rules in Temporal Document Collections Kjetil Nørvåg, Trond Øivind Eriksen, and Kjell-Inge Skogstad Dept. of Computer and Information Science, NTNU 7491 Trondheim, Norway Abstract. In

More information

Dynamic Load Balancing of Unstructured Computations in Decision Tree Classifiers

Dynamic Load Balancing of Unstructured Computations in Decision Tree Classifiers Dynamic Load Balancing of Unstructured Computations in Decision Tree Classifiers A. Srivastava E. Han V. Kumar V. Singh Information Technology Lab Dept. of Computer Science Information Technology Lab Hitachi

More information

Correlation Based Feature Selection with Irrelevant Feature Removal

Correlation Based Feature Selection with Irrelevant Feature Removal Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,

More information

Frequent Item Set using Apriori and Map Reduce algorithm: An Application in Inventory Management

Frequent Item Set using Apriori and Map Reduce algorithm: An Application in Inventory Management Frequent Item Set using Apriori and Map Reduce algorithm: An Application in Inventory Management Kranti Patil 1, Jayashree Fegade 2, Diksha Chiramade 3, Srujan Patil 4, Pradnya A. Vikhar 5 1,2,3,4,5 KCES

More information

A Cube Model and Cluster Analysis for Web Access Sessions

A Cube Model and Cluster Analysis for Web Access Sessions A Cube Model and Cluster Analysis for Web Access Sessions Joshua Zhexue Huang 1, Michael Ng 2, Wai-Ki Ching 2, Joe Ng 1, and David Cheung 1 1 E-Business Technology Institute The University of Hong Kong

More information

A Web Page Recommendation system using GA based biclustering of web usage data

A Web Page Recommendation system using GA based biclustering of web usage data A Web Page Recommendation system using GA based biclustering of web usage data Raval Pratiksha M. 1, Mehul Barot 2 1 Computer Engineering, LDRP-ITR,Gandhinagar,cepratiksha.2011@gmail.com 2 Computer Engineering,

More information

Partitioning. Course contents: Readings. Kernighang-Lin partitioning heuristic Fiduccia-Mattheyses heuristic. Chapter 7.5.

Partitioning. Course contents: Readings. Kernighang-Lin partitioning heuristic Fiduccia-Mattheyses heuristic. Chapter 7.5. Course contents: Partitioning Kernighang-Lin partitioning heuristic Fiduccia-Mattheyses heuristic Readings Chapter 7.5 Partitioning 1 Basic Definitions Cell: a logic block used to build larger circuits.

More information

IJITKMSpecial Issue (ICFTEM-2014) May 2014 pp (ISSN )

IJITKMSpecial Issue (ICFTEM-2014) May 2014 pp (ISSN ) A Review Paper on Web Usage Mining and future request prediction Priyanka Bhart 1, Dr.SonaMalhotra 2 1 M.Tech., CSE Department, U.I.E.T. Kurukshetra University, Kurukshetra, India 2 HOD, CSE Department,

More information

Infrequent Weighted Itemset Mining Using SVM Classifier in Transaction Dataset

Infrequent Weighted Itemset Mining Using SVM Classifier in Transaction Dataset Infrequent Weighted Itemset Mining Using SVM Classifier in Transaction Dataset M.Hamsathvani 1, D.Rajeswari 2 M.E, R.Kalaiselvi 3 1 PG Scholar(M.E), Angel College of Engineering and Technology, Tiruppur,

More information

An Efficient Clustering for Crime Analysis

An Efficient Clustering for Crime Analysis An Efficient Clustering for Crime Analysis Malarvizhi S 1, Siddique Ibrahim 2 1 UG Scholar, Department of Computer Science and Engineering, Kumaraguru College Of Technology, Coimbatore, Tamilnadu, India

More information

A Naïve Soft Computing based Approach for Gene Expression Data Analysis

A Naïve Soft Computing based Approach for Gene Expression Data Analysis Available online at www.sciencedirect.com Procedia Engineering 38 (2012 ) 2124 2128 International Conference on Modeling Optimization and Computing (ICMOC-2012) A Naïve Soft Computing based Approach for

More information

Semantic Clickstream Mining

Semantic Clickstream Mining Semantic Clickstream Mining Mehrdad Jalali 1, and Norwati Mustapha 2 1 Department of Software Engineering, Mashhad Branch, Islamic Azad University, Mashhad, Iran 2 Department of Computer Science, Universiti

More information

ABSTRACT I. INTRODUCTION II. METHODS AND MATERIAL

ABSTRACT I. INTRODUCTION II. METHODS AND MATERIAL 2016 IJSRST Volume 2 Issue 4 Print ISSN: 2395-6011 Online ISSN: 2395-602X Themed Section: Science and Technology A Paper on Multisite Framework for Web page Recommendation Using Incremental Mining Mr.

More information

Combining Distributed Memory and Shared Memory Parallelization for Data Mining Algorithms

Combining Distributed Memory and Shared Memory Parallelization for Data Mining Algorithms Combining Distributed Memory and Shared Memory Parallelization for Data Mining Algorithms Ruoming Jin Department of Computer and Information Sciences Ohio State University, Columbus OH 4321 jinr@cis.ohio-state.edu

More information