Extraction of Frequent Subgraph from Graph Database


Sakshi S. Mandke, Sheetal S. Sonawane
Department of Computer Engineering, Pune Institute of Computer Engineering, Pune, India.

Abstract
Graphs are a promising abstraction for complex structured and semi-structured data. Graph mining techniques extract, analyze and summarize significant and useful information from graph databases. Finding frequent subgraphs in a graph database is the essence of graph mining. The mined subgraphs are sometimes so numerous that selecting the significant ones becomes difficult, and not every frequent subgraph is significant from the application's perspective. This paper proposes an approach that extracts significant subgraphs in two stages. In the first stage, frequent subgraphs are identified using a frequency threshold (ϴ), which is an input parameter. In the second stage, feature vectors of the subgraphs are generated and used to calculate their statistical significance, which is measured by the p-value.

Key terms: Frequent subgraph mining, feature selection, random walk on graph, statistical significance.

I. Introduction
Complex data can be represented effectively as graphs. Many application areas, such as social networking, web links, bioinformatics and chemistry, use graphs to represent complex data. A graph consists of a set of vertices and the edges connecting them. For example, in chemistry a molecule consists of atoms and bonds, which are represented in a graph as vertices and edges respectively. In a web link graph, users are represented as nodes and the communication links between them as edges.

Graph mining can be applied to a single graph or to a series of graphs. A graph database is a collection of many graphs. Let $G_B = \{G_1, G_2, \ldots, G_n\}$ be a graph dataset. Each graph $G_i = (V_i, E_i)$ consists of a set of vertices and a set of edges connecting them, with $V = \{v_1, v_2, v_3, \ldots, v_k\}$ and $E = \{(u, v) \mid u, v \in V\}$.
A graph g is a subgraph of G if there is a subgraph isomorphism from g to G. The support of g is the number of graphs in $G_B$ that contain g as a subgraph. A subgraph is said to be frequent if its support is greater than or equal to the user-defined frequency threshold ϴ.

Extraction of frequent substructures from a graph database is required in many applications. In chemistry, for example, frequent subgraph mining is used to analyze large collections of molecules and find regularities among the molecules of a specific class. Another application is the analysis of web log files, which are searched for sets of activities carried out by users, such as frequently accessed URLs and common group interactions.

Numerous methods have been developed to mine frequent subgraphs from a graph database. However, frequent subgraph mining faces a few challenges. The mined patterns may be very numerous, and not every subgraph is significant. Frequency alone is not always sufficient to categorize graphs effectively; other graph properties may also help. For example, benzene is a common frequent subgraph in chemical molecule datasets, but it is not informative because it does not indicate any biological or chemical activity.

The significance of a graph depends on the characteristics of the graph data. Domain-specific or topological features are therefore used as a reference point for finding significant graphs. Feature analysis helps to reduce the answer set and to find significant subgraphs. Extracting feature-based frequent subgraphs addresses the problem of selecting high-quality frequent subgraphs.
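The definitions of support and of a frequent subgraph given in this introduction can be illustrated with a small Python sketch. This is not the mining algorithm used in the paper; it is a brute-force check that assumes undirected graphs with node labels only (edge labels are ignored for brevity). The function names `is_subgraph`, `support` and `is_frequent`, and the toy database, are hypothetical.

```python
from itertools import permutations


def is_subgraph(pattern, graph):
    """Return True if `pattern` maps into `graph` through an injective,
    label-preserving node mapping that keeps every pattern edge.

    Each graph is a pair (labels, edges): labels maps node -> label,
    edges is a list of undirected (u, v) pairs. Assumed representation.
    """
    p_labels, p_edges = pattern
    g_labels, g_edges = graph
    p_nodes = list(p_labels)
    g_edge_set = {frozenset(e) for e in g_edges}
    # Try every injective assignment of pattern nodes to graph nodes.
    for image in permutations(g_labels, len(p_nodes)):
        mapping = dict(zip(p_nodes, image))
        if any(p_labels[n] != g_labels[mapping[n]] for n in p_nodes):
            continue
        if all(frozenset((mapping[u], mapping[v])) in g_edge_set
               for u, v in p_edges):
            return True
    return False


def support(pattern, database):
    """Number of graphs in the database that contain the pattern."""
    return sum(is_subgraph(pattern, g) for g in database)


def is_frequent(pattern, database, theta):
    """A pattern is frequent if its support is at least theta."""
    return support(pattern, database) >= theta


# Toy database of three small labelled graphs (illustrative data only).
database = [
    ({0: "C", 1: "C", 2: "O"}, [(0, 1), (1, 2)]),
    ({0: "C", 1: "O"}, [(0, 1)]),
    ({0: "C", 1: "N"}, [(0, 1)]),
]
c_o_bond = ({0: "C", 1: "O"}, [(0, 1)])
print(support(c_o_bond, database), is_frequent(c_o_bond, database, theta=2))
```

The exhaustive search is exponential in the pattern size, which is why practical miners rely on canonical forms and candidate pruning instead; the sketch only makes the definitions concrete.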

Figure 1: Overall Approach

Our work filters the answer set of frequent subgraphs by calculating their statistical significance. As shown in Figure 1, frequent subgraphs are extracted first and are then analyzed in the feature domain. Statistical significance indicates whether an observed difference between samples is real or exists merely by chance. The p-value is the measure of statistical significance: it is the probability of observing such a difference purely by chance. For a graph database, the p-value is defined as follows: given a subgraph g with observed frequency $\mu_0$, the p-value of g is the probability that g occurs in a random database with frequency $\mu \ge \mu_0$ [15].

The remainder of the paper is organized as follows. Section II describes related work. Section III presents the design of the proposed work. Datasets are discussed in Section IV and expected results in Section V. Section VI concludes.

II. Related Work
Many frequent subgraph mining (FSM) algorithms have been developed, such as AGM [6], FSG [10], gSpan [21], SUBDUE [4], FFSM [7], MoFa [1] and Gaston [13]. These algorithms are broadly classified into apriori-based algorithms and pattern-growth algorithms. In the apriori-based approach, the set of subgraphs of size k at one level is considered before the subgraphs of size k+1 at the next level are generated; the next level is explored in breadth-first order. The pattern-growth approach uses depth-first search to generate subgraph candidates: each subgraph g is extended recursively to find all the frequent subgraphs that contain it.

Various FSM algorithms have been developed over the past decades. In recent years, research has focused on optimizing the result set of FSM to improve its quality. In a survey on graph mining, C. Jiang [2] noted some issues related to FSM that are still open. The first is the need to reduce the size of the answer set generated by FSM algorithms.
In many cases the number of subgraphs in the result set is too large to analyze them individually, and in some cases a large result set contains redundant subgraphs. Approaches such as approximate frequent subgraphs, closed frequent subgraphs, maximal frequent subgraphs and discriminative subgraphs help to reduce the size of the result set. Defining a compact set of subgraphs without losing those that matter for a specific application is difficult.

The survey also noted that feature selection can be incorporated into the frequent subgraph mining process, which helps to achieve better classification with a frequent-subgraph-based classifier. Frequent subgraph mining can be made application specific by applying domain knowledge; in this case, features are used as mining parameters. Because many different features are available, it is difficult to select suitable parameters for a given application.

As a third issue, the survey suggested that different isomorphism tests can be applied when finding subgraphs; for example, approximate matching can be used instead of exact matching. The SUBDUE [4] algorithm uses a heuristic beam search with domain knowledge to reduce the search space. GREW [9], gApprox [3] and RAM [22] are algorithms that use approximate measures to generate the result set.

The first two issues can be addressed by applying feature analysis to graphs, but selecting the parameter for significance mining is difficult, because the significance parameter may change with the application. Page ranking, graph classification and frequent subgraph mining are areas in which feature-based analysis is being researched. Yan and Han [17] presented pattern-based indexing in gIndex to achieve fast graph search. He and Singh proposed GraphRank [5], which calculates the statistical significance of a subgraph: subgraphs are converted into feature vectors, and their statistical significance is calculated using the p-value. Li et al. [11] proposed a graph classification method based on topological and label attributes.
X. Yan [18] proposed using cluster components as a discriminative property for graph classification. CORK [12] uses the gSpan frequent subgraph mining algorithm to generate binary feature vectors for classification.

Few algorithms exist that mine significant subgraphs. Milo et al. [20] proposed an algorithm that mines motifs as graph patterns in randomized networks, using a p-value calculation to decide the significance of a pattern. Yan et al. [19] developed a framework for mining significant patterns using structural leap search and frequency-descending mining. The GraphSig [15, 16] method mines statistically significant subgraphs from the subgraphs found at a low frequency threshold. Using random walks on graphs, the graphs are converted into feature vectors, and the p-value of each subgraph is calculated to find its statistical significance in feature space.

In our work we combine the techniques of GraphRank [5] and GraphSig [15]. To preserve more structural information in the subgraph feature vector, we apply the random walk technique to subgraphs. The statistical significance of each subgraph feature vector is then calculated using the p-value.

III. Design of proposed work
Figure 2 outlines the proposed idea for finding significant frequent subgraphs. An existing algorithm, such as the Fast Frequent Subgraph Mining (FFSM) algorithm [7], is applied to extract frequent subgraphs.

Figure 2: Block Diagram
Figure 3a: Sample graphs
Figure 3b: Frequent subgraphs with frequency threshold 3

A sample graph database and its frequent subgraphs are illustrated in Figures 3a and 3b respectively. A random walk is applied to the mined subgraphs to convert them into feature vectors. The statistical significance of these feature vectors is then calculated using the p-value. Feature vector generation and significance calculation are described in the following subsections.

A. Feature Vector Generation
To find the feature vectors of the mined subgraphs, a random walk is applied to each of them. The random walk starts from one node and keeps jumping to other nodes within the graph. Each neighbour has an equal probability of being chosen for the next jump. A random walk of length L on a graph is a sequence of random variables $X_1, X_2, X_3, \ldots, X_L$, where $X_1$ is the root vertex and $X_{i+1}$ is a neighbouring vertex of $X_i$ chosen uniformly at random.
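The random walk just defined can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the graph is an adjacency dictionary, edge types are recorded in traversal direction as label-bond-label strings, one walk is started from every node, and only the listed feature columns are kept. The names `random_walk_features` and `subgraph_feature_matrix`, the walk length, and the example chain C-S-N-O are hypothetical.

```python
import random
from collections import Counter


def random_walk_features(labels, adjacency, start, length, rng):
    """Walk `length` steps from `start`, choosing each next vertex uniformly
    at random among the neighbours, and count the edge types traversed."""
    counts = Counter()
    current = start
    for _ in range(length):
        neighbours = adjacency[current]
        if not neighbours:  # isolated vertex: the walk cannot continue
            break
        nxt, bond = rng.choice(neighbours)
        counts[f"{labels[current]}-{bond}-{labels[nxt]}"] += 1
        current = nxt
    return counts


def subgraph_feature_matrix(labels, adjacency, feature_names, length, seed=0):
    """One count vector per starting node, so n nodes give n vectors."""
    rng = random.Random(seed)
    matrix = {}
    for start in labels:
        counts = random_walk_features(labels, adjacency, start, length, rng)
        matrix[start] = [counts.get(name, 0) for name in feature_names]
    return matrix


# Hypothetical chain C-S-N-O with single bonds (bond label 1).
labels = {"c": "C", "s": "S", "n": "N", "o": "O"}
adjacency = {
    "c": [("s", 1)],
    "s": [("c", 1), ("n", 1)],
    "n": [("s", 1), ("o", 1)],
    "o": [("n", 1)],
}
features = ["C-1-S", "S-1-N", "N-1-O"]
for node, row in subgraph_feature_matrix(labels, adjacency, features, 10).items():
    print(node, row)
```

A fixed seed keeps the sketch reproducible; counts for edge types outside the chosen columns (for example S-1-C) are simply dropped here, which is a simplification.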
During the random walk, features are captured while traversing from one node to a neighbouring node. Features may consist of nodes, edges, or small subgraphs; pharmacophoric features can also be used. Here, the edge type (NNP, node-to-node pair) is used as the feature [1]. For a subgraph with n nodes, n vectors are generated. Every edge type is a column of the feature vector; if a specific edge type is not present, 0 is entered for it in the row. After all NNP types have been counted during the random walk, the frequency of each NNP is calculated and recorded in the feature vector as:

Value of NNP = [formula lost in extraction]

The value of each NNP is truncated to make it more tractable.

Figure 4: Sample subgraph

Table I: Random walk on the graph shown in Figure 4. Columns: Starting Node, C-1-S, S-1-N, N-1-O; rows: C, S, N, O. [cell values lost in extraction]

The feature vectors extracted from the subgraphs are analyzed further. Each subgraph is represented by a single feature vector, obtained by taking the floor of the vectors stored in the subgraph's feature vector matrix. In the resulting vector, each column holds the frequency count of one NNP type. The floor of a matrix is the component-wise minimum of its rows:

$$\mathrm{Floor}([x_1, x_2, \ldots, x_n], [y_1, y_2, \ldots, y_n], [z_1, z_2, \ldots, z_n], \ldots) = [\min(x_i, y_i, z_i, \ldots)] \quad \text{for all } i = 1, \ldots, n$$

Example: Floor([2,4,2], [2,3,3], [2,2,4]) = [2,2,2]

Table II: Subgraph feature vector dataset. Columns: C-1-B, C-1-A, B-1-A, A-1-B; one row per subgraph g. [cell values lost in extraction]

B. Calculating significance of feature vector
This section explains the p-value calculation on the feature vector of a subgraph. The occurrences of each feature vector in a random graph database are calculated. First, the probability density function of the vector, P(x), is computed using the prior probabilities of the features. In the prior-probability matrix, each row represents one feature component (in our case, an NNP type). The element $X_{ij}$ of the matrix is the probability that feature i occurs at least j times in a vector of the subgraph feature vector dataset.

Table III: Prior-probability matrix. Rows (NNPs): C-1-B, C-1-A, B-1-A, A-1-B. [cell values lost in extraction]

The probability of a feature vector $x = [x_1, \ldots, x_n]$ occurring in a random vector database can be expressed as the joint probability:

$$P(x) = \prod_{i=1}^{n} P(y_i \ge x_i)$$

where $P(y_i \ge x_i)$ is the probability that feature i occurs at least $x_i$ times.

Example: P(7, 7, 6, 0) = P(C-1-B ≥ 7) · P(C-1-A ≥ 7) · P(B-1-A ≥ 6) · P(A-1-B ≥ 0)

The binomial distribution is used to measure the frequency of a feature vector in the database. A random histogram can be viewed as a trial, and x occurring in the histogram as a success. The number of trials m for vector x depends on the number of histograms in the database:

$$\text{p-value}(x, \mu_0) = \sum_{i=\mu_0}^{m} \mathrm{binomial}(P(x), i) \quad [1]$$

where $\mathrm{binomial}(P(x), i) = \binom{m}{i} P(x)^i (1 - P(x))^{m-i}$. The lower the p-value, the higher the significance.

Algorithm 1: CalSignificance(G, maxPvalue)
Input: G is a subgraph database with the support of each subgraph. maxPvalue is the p-value threshold.
Output: O is the answer set of all significant subgraphs.

    D ← ∅
    O ← ∅
    for each g ∈ G do
        for each node in g do
            D_g ← D_g + RWR(g)
        X ← X + Vector(floor(D_g))
    for each NNP-type nnp in G do
        for i = 1 to |G| do
            for k = 1 to m do
                P_nnp(k) ← {probability(nnp) | count of NNP at k-th position and Value(G_ik) ≥ k}
    for each g in G do
        Pval ← CalculatePvalue(x_g, g_support)
        if Pval ≤ maxPvalue then
            O ← O + g

IV. Datasets
In chemistry, molecules are represented as graphs and analyzed using graph mining techniques. Extraction of frequent substructures from a chemical database is required in many applications in the chemistry domain, such as drug discovery.

Figure 5: Cyclohexene (C6H10) compound as a graph. Hydrogens are implicit in the graph.

We are testing our method on chemical graph datasets. Three different datasets are used. The first is the DTP-AIDS Antiviral Screen¹ chemical compound dataset from the National Cancer Institute (NCI/NIH). Compounds are divided into three categories on the basis of their antiviral activity: compounds that provide at least 50% protection are classified as CM (confirmed moderately active), compounds that provide 100% protection are listed as CA (confirmed active), and the other compounds are listed as CI (confirmed inactive). The second dataset is an anticancer compound dataset from PubChem². Its compounds are classified into two classes, active and inactive. The third dataset is the PTE³ (Predictive Toxicology Evaluation) compound dataset by NIEHS. It contains a total of 340 chemical compounds.

¹ http://dtp.nci.nih.gov/docs/aids/aids_data.html

Chemical data is represented in several different formats, such as .sdf, .mol, .cml and .smile. Tools like JOELib [8] and OpenBabel [14] are useful for converting these files from one format into another.

V. Discussion about expected result
We are implementing our algorithm in Java. The experiments will be performed on a 3.2 GHz PC with 8 GB of memory running Fedora 17 Linux. We use the FFSM algorithm [7] to generate frequent subgraphs from the graph database.

The p-value threshold typically ranges from 0.01 to 0.1. A subgraph with a p-value below 0.01 is very strongly significant.
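The significance calculation of Section III-B (floor vector, prior probabilities, joint probability and binomial tail) can be sketched in Python as follows. This is a minimal sketch rather than the authors' Java implementation. It assumes that the number of trials m equals the number of vectors in the feature vector dataset and that the observed frequency $\mu_0$ is the support of the subgraph; all function names and the example data are hypothetical.

```python
from math import comb


def floor_vector(vectors):
    """Component-wise minimum of the vectors generated for one subgraph."""
    return [min(column) for column in zip(*vectors)]


def prior_probability(dataset, feature_index, value):
    """Fraction of dataset vectors whose feature is at least `value`."""
    hits = sum(1 for vec in dataset if vec[feature_index] >= value)
    return hits / len(dataset)


def vector_probability(x, dataset):
    """Joint probability P(x): product of the per-feature priors."""
    p = 1.0
    for i, value in enumerate(x):
        p *= prior_probability(dataset, i, value)
    return p


def p_value(x, dataset, mu0):
    """Binomial tail: probability of at least mu0 successes in m trials,
    with m assumed to be the number of vectors in the dataset."""
    m = len(dataset)
    p = vector_probability(x, dataset)
    return sum(comb(m, i) * p**i * (1 - p) ** (m - i)
               for i in range(mu0, m + 1))


def significant_subgraphs(vectors, supports, dataset, max_p_value):
    """Names of subgraphs whose p-value does not exceed the threshold."""
    return [name for name, x in vectors.items()
            if p_value(x, dataset, supports[name]) <= max_p_value]


# Floor example taken from the text above.
print(floor_vector([[2, 4, 2], [2, 3, 3], [2, 2, 4]]))  # [2, 2, 2]

# Illustrative dataset of four floor vectors (made-up numbers).
dataset = [[2, 2, 2], [1, 0, 3], [2, 3, 0], [0, 1, 1]]
vectors = {"g1": [2, 2, 2], "g4": [0, 1, 1]}
supports = {"g1": 4, "g4": 1}
print(significant_subgraphs(vectors, supports, dataset, max_p_value=0.01))
```

Computing the tail as an explicit sum is adequate for small m; for large databases a numerically stable routine for the binomial survival function would be the more practical choice.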
A subgraph with a p-value between 0.01 and 0.05 is strongly significant, and a subgraph is still considered significant if its p-value is up to 0.1. Statistical significance calculation will improve the result set: all insignificant subgraphs are filtered out on the basis of their p-values. The filtering is most effective when the number of frequent subgraphs is large. For example, as shown in Figure 6, if the number of frequent subgraphs is [value lost in extraction], the number of significant subgraphs will not exceed [value lost in extraction]; the result set will be reduced by 10%. Thus, subgraphs that exist just by chance are filtered out. Running time also increases linearly with the number of frequent subgraphs.

Figure 6: Frequent subgraphs vs. significant frequent subgraphs (series: MaxPvalue = 0.01 and MaxPvalue = 0.1).

³ /PTE

VI. Conclusion
Not every frequent subgraph is significant, so one more filtering step is needed. Feature analysis using random walks on graphs preserves more structural information. The p-value calculation provides the statistical significance of the feature vector of a graph. The quality and size of the result set will be improved by applying the proposed method. The significant feature vectors can further be given as input to a classifier.

References
[1] Borgelt, C., & Berthold, M. (November 2002). Mining Molecular Fragments: Finding Relevant Substructures of Molecules. IEEE International Conference on Data Mining. Maebashi City, Japan.
[2] C. Jiang, F. C. (2004). A Survey of Frequent Subgraph Mining Algorithms. The Knowledge Engineering Review, Cambridge University Press.
[3] C., C., Yan, X. Z., & Han, J. (2007). gApprox: Mining Frequent Approximate Patterns from Massive Network. 7th IEEE International Conference on Data Mining.
[4] Cook, D. J., & Holder, L. B. (1994). Substructure Discovery Using Minimum Description Length and Background Knowledge. Journal of Artificial Intelligence Research, 1.
[5] He, H., & Singh, A. (2006). GraphRank: Statistical Modeling and Mining of Significant Subgraphs in the Feature Space. 6th International Conference on Data Mining, IEEE Computer Society. Washington, DC, US.
[6] Inokuchi, A., Washio, T., & Motoda, H. (2000). An Apriori-based Algorithm for Mining Frequent Substructures from Graph Data. PKDD'00.
[7] Huan, J., Wang, W., & Prins, J. (2003). Efficient Mining of Frequent Subgraph in Presence of Isomorphism. International Conference on Data Mining.
[8] JOELib: A Java Based Computational Chemistry Package. (2009). Wilhelm-Schickard-Institute for Computer Science. Tübingen, Germany.
[9] Kuramochi, M., & Karypis, G. (2004). GREW: Scalable Frequent Subgraph Discovery Algorithm. 4th IEEE International Conference on Data Mining.
[10] Kuramochi, M., & Karypis, G. (2001). Frequent Subgraph Discovery. ICDM.
[11] Li, G., Semerci, M., Yenar, B., & Zaki, M. J. (2011, August). Graph Classification via Topological and Label Attributes. 9th Workshop on Mining and Learning with Graphs. SIGKDD.
[12] M. Thoma, H. C.-P. (October 2010). Discriminative Frequent Subgraph Mining with Optimality Guarantees. Statistical Analysis and Data Mining, 3(5).
[13] Nijssen, S., & Kok, J. N. (2004). The Gaston Tool for Frequent Subgraph Mining. International Workshop on Graph-Based Tools. Amsterdam, the Netherlands: Elsevier.
[14] OpenBabel: An open chemical toolbox.
[15] Ranu, S., & Singh, A. (April 2009). GraphSig: A Scalable Approach to Mining Significant Subgraphs in Large Databases. 25th IEEE International Conference on Data Engineering (ICDE).
[16] Ranu, S., & Singh, K. (2009). Mining Statistically Significant Molecular Substructures for Efficient Molecular Classification. Journal of Chemical Information and Modeling, 49.
[17] X. Yan, P. Y. (2004). Graph Indexing: A Frequent Structure-based Approach. ACM SIGMOD.
[18] Xifeng Yan, F. Z. (2006). Feature-based Similarity Search in Graph Structures. ACM Transactions on Database Systems, 31(4).
[19] Xifeng Yan, H. C. (2008). Mining Significant Graph Patterns by Leap Search. SIGMOD.
[20] Y. Chi, Y. Y. (2003). Indexing and Mining Free Trees. ICDM.
[21] Yan, X., & Han, J. (2002). gSpan: Graph-Based Substructure Pattern Mining. ICDM'02. Washington, DC, USA: IEEE Computer Society.
[22] Zhang, S., & Yang, J. (2008). RAM: Randomized Approximate Graph Mining. 20th International Conference on Scientific and Statistical Database Management.


More information

Iliya Mitov 1, Krassimira Ivanova 1, Benoit Depaire 2, Koen Vanhoof 2

Iliya Mitov 1, Krassimira Ivanova 1, Benoit Depaire 2, Koen Vanhoof 2 Iliya Mitov 1, Krassimira Ivanova 1, Benoit Depaire 2, Koen Vanhoof 2 1: Institute of Mathematics and Informatics BAS, Sofia, Bulgaria 2: Hasselt University, Belgium 1 st Int. Conf. IMMM, 23-29.10.2011,

More information
