Graph-based Learning. Larry Holder Computer Science and Engineering University of Texas at Arlington

Size: px
Start display at page:

Download "Graph-based Learning. Larry Holder Computer Science and Engineering University of Texas at Arlington"

Transcription

1 Graph-based Learning Larry Holder Computer Science and Engineering University of Texas at Arlingt 1

2 Graph-based Learning Multi-relatial data mining and learning SUBDUE graph-based relatial learner Discovery Clustering Graph grammar learning Supervised learning 2

3 Multi-Relatial Data Mining Looking for patterns involving multiple tables (relatis) in a relatial database Pers ID P1 P2 P3 Last Doe Doe Smith First Age John 30 Sally 29 Income Robert Married Pers1 P1 Pers2 RichCouple(X,Y) Pers(X,LastX,FirstX,AgeX,IncX) & Pers(Y,LastY,FirstY,AgeY,IncY) & Married(X,Y) & (IncX + IncY) > P3 P2 P7 3

4 Multi-Relatial Data Mining Approaches Transform to n-relatial problem First-order logic based Inductive Logic Programming (ILP) Graph based 4

5 Graph-based Data Mining Finding all subgraphs g within a set of graph transactis G such that freq( g) G > t where t is the minimum support 5

6 Graph-based Data Mining Systems Apriori-based Graph Mining (AGM) Inokuchi, Washio and Motoda,, 2003 Frequent Sub-Graph discovery (FSG) Kuramochi and Karypis,, 2001 Graph-based Substructure pattern mining (gspan) Yan and Han, 2002 Focus pruning and fast, code-based graph matching 6

7 Graph-based Relatial Learning Finding patterns in graph(s) Discovery Clustering Supervised learning Smith Robert Last First Married Pers Age Income Doe John Doe Sally Last First Married Pers Age Income Last First Pers Age Income

8 Graph-based Relatial Learning Graph-Based Inducti (GBI) Yoshida, Motoda and Indurkhya,, 1994 SUBstructure Discovery Using Examples (SUBDUE) Cook and Holder, 1994 Focus efficient subgraph generati and compressi-based heuristic search 8

9 SUBDUE Graph-based Discovery Graph representati Graph compressi and MDL Discovery algorithm Inexact graph match Background knowledge Parallel/distributed discovery 9

10 Graph Representati Input is a labeled (vertices and edges) directed graph A substructure is a cnected subgraph An instance of a substructure is an isomorphic subgraph of the input graph Input graph compressed by replacing instances with vertex representing substructure T1 S1 T2 S2 Input Database R1 T3 S3 C1 T4 S4 Substructure S1 (graph form) object object shape shape triangle square Compressed Database S1 S1 R1 S1 C1 S1 10

11 Graph Representati S 2 S 1 S 1 S 1 S 1 S 1 S 2 S 2 11

12 Graph Compressi and MDL Minimum Descripti Length (MDL) principle Best theory minimizes descripti length of theory and the data given theory Best substructure S minimizes descripti length of substructure definiti DL(S) and compressed graph DL(G S) min( DL( S) + S DL( G S)) 12

13 Discovery Algorithm 1. Create substructure for each unique vertex label triangle square rectangle triangle square circle triangle square triangle square Substructures: triangle (4), square (4), circle (1), rectangle (1) 13

14 Discovery Algorithm 2. Expand best substructures by an edge or edge+neighboring vertex triangle square rectangle triangle square circle triangle square triangle square Substructures: triangle square square rectangle circle rectangle rectangle triangle 14

15 Discovery Algorithm 3. Keep ly best beam-width substructures queue 4. Terminate when queue is empty or #discovered substructures > limit 5. Compress graph and repeat to generate hierarchical descripti 15

16 DNA Example 16

17 Sample SUBDUE Input sample.g: v 1 object v 2 object v 3 object v 4 object v 5 object v 6 object v 7 object v 8 object v 9 object v 10 object v 11 triangle v 12 triangle v 13 triangle v 14 triangle v 15 square v 16 square v 17 square v 18 square v 19 circle v 20 rectangle e 1 11 shape e 2 12 shape e 3 13 shape e 4 14 shape e 5 15 shape e 6 16 shape e 7 17 shape e 8 18 shape e 9 19 shape e shape e 1 5 e 2 6 e 3 7 e 4 8 e 5 10 e 9 10 e 10 2 e 10 3 e 10 4 T1 S1 T2 S2 R1 T3 S3 C1 T4 S4 17

18 Inexact Graph Match Some variatis may occur between instances Want to abstract over minor differences Difference = cost of transforming e graph to make it isomorphic to another Match if cost/size < threshold 18

19 Inexact Graph Match A a B 1 2 b B b A 3 4 a a b 5 B (1,3) 1 (1,4) 0 (1,5) 1 (1,λ) 1 (2,4) 7 (2,5) 6 (2,λ) 10 (2,3) 3 (2,5) 6 (2,λ) 9 (2,3) 7 (2,4) 7 (2,λ) 10 (2,3) 9 (2,4) 10 (2,5) 9 (2,λ) 11 Least-cost match is {(1,4), (2,3)} 19

20 Inexact Graph Match Vertices csidered by degree Polynomially cstrained Greedy after n k partial mappings csidered Suboptimal mappings rare for k>2 20

21 Background Knowledge User-defined substructures Two alternative uses Prime search queue Initial graph compressi Variant of discovery algorithm used to generate instances 21

22 Parallel/Distributed Discovery Divide graph into P partitis Distribute to P processors Each processor performs serial discovery local partiti Broadcast best substructures, evaluate other processors Master processor stores best global substructures 22

23 Graph-based Clustering Hierarchical, cceptual clustering Previous work defined classificati trees Inadequate in relatial domains Better hierarchical descripti: classificati lattice A cluster can have more than e parent A parent can be at any level (not ly e level above) Use iterative graph-based discovery 23

24 Clustering: DNA 24

25 Clustering: DNA C N C C DNA O O == P OH Coverage 61% C \ N C \ C O \ C / \ C C N C / \ O C C C \ O O O == P OH O CH 2 68% 71% 25

26 Learning Graph Grammars Graph grammar producti: S P S is a n-terminal P is a graph ctaining terminals and/or n- terminals S P 1 P 2 P n Recursive producti: S P S P P linked to S via a single edge Algorithm expential in number of linking edges 26

27 Example Graph Grammar S 2 a S 2 a b S 3 b S 3 S 3 c d e f 27

28 Graph Grammar Learning SUBDUE Extensis (SubdueGL( SubdueGL) Each iterati results in a graph grammar producti substructure Producti used to compress graph Replace instances of right-hand hand side with new vertex labeled with n-terminal left-hand side Iteratis ctinue until entire graph compressed to single n-terminal 28

29 SubdueGL Example Input graph Edge labels: t, s, next a a a a b c b d b e b f k x y x y r x y x y z q z q z q z q 29

30 SubdueGL Example First producti rule S 1 x y S 1 x y z q z q Input graph parsed by first producti a a a a b c b d b e b f k S 1 r S 1 30

31 SubdueGL Example Secd and third producti rules S 2 a S 2 a b S 3 b S 3 S 3 c d e f Input graph parsed by productis S 2 S 1 r S 1 k 31

32 Graph-Based Supervised Learning Input now a set of positive graphs and a set of negative graphs Input Hypothesis object object object shape shape triangle square 32

33 Graph-Based Supervised Learning Soluti 1 Find substructure compressing positive graphs, but not negative graphs Compress graphs and iterate until no further compressi Problem Compressing, instead of removing, partially- covered positive graphs leads to overly- specific hypotheses 33

34 Graph-Based Supervised Learning Soluti 2 Find substructure covering (i.e., subgraph of) positive graphs, but not negative graphs Remove covered positive graphs and iterate until all covered Substructure value = 1 - Error Error = # PosEgsNotCovered+ # NegEgsCovered # PosEgs+ # NegEgs 34

35 Supervised Learning: Cancer Chemical toxicity SUBDUE achieved 62% accuracy classifying carcinogenic vs. n-carcinogenic compounds six_ring p 6 cytogen_ca di227 in_group ctains compound in_group atom 7 element type charge c compound compound drosophila_slrl chromaberr p p positive ames ctains atom element type c 22 has_group amine charge -13 ashby_alert ashby_alert halide10 35

36 Applicati Domains Biochemical domains Protein data DNA data Toxicology (cancer) data Spatial-temporal temporal domains Earthquake data Aircraft Safety and Reporting System Web topology and search Social network analysis hyperlink web_page hyperlink home web_page hyperlink web_page 36

37 Summary Multi-relatial data mining and learning Graph-based relatial learning Discovery Clustering Graph grammar learning Supervised learning 37

38 Future Directis Efficient graph-based learning from incremental streaming data Supervised graphs All examples in e, cnected graph Graph-based anomaly detecti Improved scalability Graph and subgraph isomorphism 38

39 Further Informati Graph-based Data Mining SUBDUE Project ailab.uta.edu/subdue 39

Subdue: Compression-Based Frequent Pattern Discovery in Graph Data

Subdue: Compression-Based Frequent Pattern Discovery in Graph Data Subdue: Compression-Based Frequent Pattern Discovery in Graph Data Nikhil S. Ketkar University of Texas at Arlington ketkar@cse.uta.edu Lawrence B. Holder University of Texas at Arlington holder@cse.uta.edu

More information

Graph-based Relational Learning with Application to Security

Graph-based Relational Learning with Application to Security Fundamenta Informaticae 66 (2005) 1 19 1 IS Press Graph-based Relational Learning with Application to Security Lawrence Holder Λ, Diane Cook, Jeff Coble and Maitrayee Mukherjee Department of Computer Science

More information

Numeric Ranges Handling for Graph Based Knowledge Discovery Oscar E. Romero A., Jesús A. González B., Lawrence B. Holder

Numeric Ranges Handling for Graph Based Knowledge Discovery Oscar E. Romero A., Jesús A. González B., Lawrence B. Holder Numeric Ranges Handling for Graph Based Knowledge Discovery Oscar E. Romero A., Jesús A. González B., Lawrence B. Holder Reporte Técnico No. CCC-08-003 27 de Febrero de 2008 2008 Coordinación de Ciencias

More information

Graph-Based Concept Learning

Graph-Based Concept Learning Graph-Based Concept Learning Jesus A. Gonzalez, Lawrence B. Holder, and Diane J. Cook Department of Computer Science and Engineering University of Texas at Arlington Box 19015, Arlington, TX 76019-0015

More information

A Roadmap to an Enhanced Graph Based Data mining Approach for Multi-Relational Data mining

A Roadmap to an Enhanced Graph Based Data mining Approach for Multi-Relational Data mining A Roadmap to an Enhanced Graph Based Data mining Approach for Multi-Relational Data mining D.Kavinya 1 Student, Department of CSE, K.S.Rangasamy College of Technology, Tiruchengode, Tamil Nadu, India 1

More information

Data Mining in Bioinformatics Day 5: Graph Mining

Data Mining in Bioinformatics Day 5: Graph Mining Data Mining in Bioinformatics Day 5: Graph Mining Karsten Borgwardt February 25 to March 10 Bioinformatics Group MPIs Tübingen from Borgwardt and Yan, KDD 2008 tutorial Graph Mining and Graph Kernels,

More information

In Mathematics and computer science, the study of graphs is graph theory where graphs are data structures used to model

In Mathematics and computer science, the study of graphs is graph theory where graphs are data structures used to model ISSN: 0975-766X CODEN: IJPTFI Available Online through Research Article www.ijptonline.com A BRIEF REVIEW ON APPLICATION OF GRAPH THEORY IN DATA MINING Abhinav Chanana*, Tanya Rastogi, M.Yamuna VIT University,

More information

Innovative Study to the Graph-based Data Mining: Application of the Data Mining

Innovative Study to the Graph-based Data Mining: Application of the Data Mining Innovative Study to the Graph-based Data Mining: Application of the Data Mining Amit Kr. Mishra, Pradeep Gupta, Ashutosh Bhatt, Jainendra Singh Rana Abstract Graph-based data mining represents a collection

More information

Data Mining in Bioinformatics Day 3: Graph Mining

Data Mining in Bioinformatics Day 3: Graph Mining Graph Mining and Graph Kernels Data Mining in Bioinformatics Day 3: Graph Mining Karsten Borgwardt & Chloé-Agathe Azencott February 6 to February 17, 2012 Machine Learning and Computational Biology Research

More information

Pattern Mining in Frequent Dynamic Subgraphs

Pattern Mining in Frequent Dynamic Subgraphs Pattern Mining in Frequent Dynamic Subgraphs Karsten M. Borgwardt, Hans-Peter Kriegel, Peter Wackersreuther Institute of Computer Science Ludwig-Maximilians-Universität Munich, Germany kb kriegel wackersr@dbs.ifi.lmu.de

More information

Graph-Based Hierarchical Conceptual Clustering

Graph-Based Hierarchical Conceptual Clustering Journal of Machine Learning Research 2 (2001) 19-43 Submitted 7/01; Published 10/01 Graph-Based Hierarchical Conceptual Clustering Istvan Jonyer Diane J. Cook Lawrence B. Holder Department of Computer

More information

Data Mining in Bioinformatics Day 5: Frequent Subgraph Mining

Data Mining in Bioinformatics Day 5: Frequent Subgraph Mining Data Mining in Bioinformatics Day 5: Frequent Subgraph Mining Chloé-Agathe Azencott & Karsten Borgwardt February 18 to March 1, 2013 Machine Learning & Computational Biology Research Group Max Planck Institutes

More information

MINING GRAPH DATA EDITED BY. Diane J. Cook School of Electrical Engineering and Computei' Science Washington State University Puliman, Washington

MINING GRAPH DATA EDITED BY. Diane J. Cook School of Electrical Engineering and Computei' Science Washington State University Puliman, Washington MINING GRAPH DATA EDITED BY Diane J. Cook School of Electrical Engineering and Computei' Science Washington State University Puliman, Washington Lawrence B. Holder School of Electrical Engineering and

More information

RELATIONAL APPROACH TO MODELING AND IMPLEMENTING SUBTLE ASPECTS OF GRAPH MINING

RELATIONAL APPROACH TO MODELING AND IMPLEMENTING SUBTLE ASPECTS OF GRAPH MINING Department of Computer Science and Engineering University of Texas at Arlington Arlington, TX 76019 RELATIONAL APPROACH TO MODELING AND IMPLEMENTING SUBTLE ASPECTS OF GRAPH MINING Ramanathan Balachandran

More information

CS6220: DATA MINING TECHNIQUES

CS6220: DATA MINING TECHNIQUES CS6220: DATA MINING TECHNIQUES Mining Graph/Network Data: Part I Instructor: Yizhou Sun yzsun@ccs.neu.edu November 12, 2013 Announcement Homework 4 will be out tonight Due on 12/2 Next class will be canceled

More information

GRAPH-BASED HIERARCHICAL CONCEPTUAL CLUSTERING

GRAPH-BASED HIERARCHICAL CONCEPTUAL CLUSTERING GRAPH-BASED HIERARCHICAL CONCEPTUAL CLUSTERING ISTVAN JONYER, LAWRENCE B. HOLDER, and DIANE J. COOK University of Texas at Arlington Department of Computer Science and Engineering Box 19015 (416 Yates

More information

Extraction of Frequent Subgraph from Graph Database

Extraction of Frequent Subgraph from Graph Database Extraction of Frequent Subgraph from Graph Database Sakshi S. Mandke, Sheetal S. Sonawane Deparment of Computer Engineering Pune Institute of Computer Engineering, Pune, India. sakshi.mandke@cumminscollege.in;

More information

Mining Interesting Itemsets in Graph Datasets

Mining Interesting Itemsets in Graph Datasets Mining Interesting Itemsets in Graph Datasets Boris Cule Bart Goethals Tayena Hendrickx Department of Mathematics and Computer Science University of Antwerp firstname.lastname@ua.ac.be Abstract. Traditionally,

More information

RELATIONAL DATABASE ALGORITHMS AND THEIR OPTIMIZATION FOR GRAPH MINING

RELATIONAL DATABASE ALGORITHMS AND THEIR OPTIMIZATION FOR GRAPH MINING Department of Computer Science and Engineering University of Texas at Arlington Arlington, TX 76019 RELATIONAL DATABASE ALGORITHMS AND THEIR OPTIMIZATION FOR GRAPH MINING RAMJI BEERA MS Thesis CSE-2003-18

More information

Graph-Based Hierarchical Conceptual Clustering

Graph-Based Hierarchical Conceptual Clustering From: FLAIRS-00 Proceedings. opyright 2000, AAAI (www.aaai.org). All rights reserved. Graph-Based Hierarchical onceptual lustering Istvan Jonyer, Lawrence B. Holder and Diane J. ook Department of omputer

More information

Using a Hash-Based Method for Apriori-Based Graph Mining

Using a Hash-Based Method for Apriori-Based Graph Mining Using a Hash-Based Method for Apriori-Based Graph Mining Phu Chien Nguyen, Takashi Washio, Kouzou Ohara, and Hiroshi Motoda The Institute of Scientific and Industrial Research, Osaka University 8-1 Mihogaoka,

More information

Mining Minimal Contrast Subgraph Patterns

Mining Minimal Contrast Subgraph Patterns Mining Minimal Contrast Subgraph Patterns Roger Ming Hieng Ting James Bailey Abstract In this paper, we introduce a new type of contrast pattern, the minimal contrast subgraph. It is able to capture structural

More information

HDB-SUBDUE, A RELATIONAL DATABASE APPROACH TO GRAPH MINING AND HIERARCHICAL REDUCTION SRIHARI PADMANABHAN

HDB-SUBDUE, A RELATIONAL DATABASE APPROACH TO GRAPH MINING AND HIERARCHICAL REDUCTION SRIHARI PADMANABHAN HDB-SUBDUE, A RELATIONAL DATABASE APPROACH TO GRAPH MINING AND HIERARCHICAL REDUCTION by SRIHARI PADMANABHAN Presented to the Faculty of the Graduate School of The University of Texas at Arlington in Partial

More information

Graph Mining Sub Domains and a Framework for Indexing A Graphical Approach

Graph Mining Sub Domains and a Framework for Indexing A Graphical Approach Graph Mining Sub Domains and a Framework for Indexing A Graphical Approach K. Vivekanandan Professor BSMED A. Pankaj Moses Monickaraj (Correspoding author) Doctoral Scholar Department of Computer Science

More information

Symbolic AI. Andre Freitas. Photo by Vasilyev Alexandr

Symbolic AI. Andre Freitas. Photo by Vasilyev Alexandr Symbolic AI Andre Freitas Photo by Vasilyev Alexandr Acknowledgements These slides were based on the slides of: Peter A. Flach, Rule induction tutorial, IDA Spring School 2001. Anoop & Hector, Inductive

More information

gspan: Graph-Based Substructure Pattern Mining

gspan: Graph-Based Substructure Pattern Mining University of Illinois at Urbana-Champaign February 3, 2017 Agenda What motivated the development of gspan? Technical Preliminaries Exploring the gspan algorithm Experimental Performance Evaluation Introduction

More information

Approaches to Parallel Graph-Based Knowledge Discovery 1

Approaches to Parallel Graph-Based Knowledge Discovery 1 Journal of Parallel and Distributed Computing 61, 427446 (2001) doi:10.1006jpdc.2000.1696, available online at http:www.idealibrary.com on Approaches to Parallel Graph-Based Knowledge Discovery 1 Diane

More information

sift: Adapting Graph Mining Techniques for Classification

sift: Adapting Graph Mining Techniques for  Classification Department of Computer Science and Engineering University of Texas at Arlington Arlington, TX 76019 emailsift: Adapting Graph Mining Techniques for Email Classification Manu Aery and Sharma Chakravarthy

More information

Discovering Frequent Geometric Subgraphs

Discovering Frequent Geometric Subgraphs Discovering Frequent Geometric Subgraphs Michihiro Kuramochi and George Karypis Department of Computer Science/Army HPC Research Center University of Minnesota 4-192 EE/CS Building, 200 Union St SE Minneapolis,

More information

An Apriori-Based Algorithm for Mining Frequent Substructures from Graph Data

An Apriori-Based Algorithm for Mining Frequent Substructures from Graph Data An Apriori-Based Algorithm for Mining Frequent Substructures from Graph Data Akihiro Inokuchi, Takashi Washio, and Hiroshi Motoda I.S.I.R., Osaka University 8-1, Mihogaoka, Ibarakishi, Osaka, 567-0047,

More information

Data mining, 4 cu Lecture 8:

Data mining, 4 cu Lecture 8: 582364 Data mining, 4 cu Lecture 8: Graph mining Spring 2010 Lecturer: Juho Rousu Teaching assistant: Taru Itäpelto Frequent Subgraph Mining Extend association rule mining to finding frequent subgraphs

More information

A new Approach for Handling Numeric Ranges for Graph-Based. Knowledge Discovery.

A new Approach for Handling Numeric Ranges for Graph-Based. Knowledge Discovery. A new Approach for Handling Numeric Ranges for Graph-Based Knowledge Discovery Oscar E. Romero A. 1, Lawrence B. Holder 2, and Jesus A. Gonzalez B. 1 1 Computer Science Department, INAOE, San Andres Cholula,

More information

GDClust: A Graph-Based Document Clustering Technique

GDClust: A Graph-Based Document Clustering Technique GDClust: A Graph-Based Document Clustering Technique M. Shahriar Hossain, Rafal A. Angryk Department of Computer Science, Montana State University, Bozeman, MT 59715, USA E-mail: {mshossain, angryk}@cs.montana.edu

More information

Data Mining: Concepts and Techniques. Graph Mining. Graphs are Everywhere. Why Graph Mining? Chapter Graph mining

Data Mining: Concepts and Techniques. Graph Mining. Graphs are Everywhere. Why Graph Mining? Chapter Graph mining Data Mining: Concepts and Techniques Chapter 9 9.1. Graph mining Jiawei Han and Micheline Kamber Department of Computer Science University of Illinois at Urbana-Champaign www.cs.uiuc.edu/~hanj 2006 Jiawei

More information

MARGIN: Maximal Frequent Subgraph Mining Λ

MARGIN: Maximal Frequent Subgraph Mining Λ MARGIN: Maximal Frequent Subgraph Mining Λ Lini T Thomas Satyanarayana R Valluri Kamalakar Karlapalem enter For Data Engineering, IIIT, Hyderabad flini,satyag@research.iiit.ac.in, kamal@iiit.ac.in Abstract

More information

Efficient homomorphism-free enumeration of conjunctive queries

Efficient homomorphism-free enumeration of conjunctive queries Efficient homomorphism-free enumeration of conjunctive queries Jan Ramon 1, Samrat Roy 1, and Jonny Daenen 2 1 K.U.Leuven, Belgium, Jan.Ramon@cs.kuleuven.be, Samrat.Roy@cs.kuleuven.be 2 University of Hasselt,

More information

Graph-Based Anomaly Detection Applied to Homeland Security Cargo Screening

Graph-Based Anomaly Detection Applied to Homeland Security Cargo Screening Proceedings of the Twenty-Fifth International Florida Artificial Intelligence Research Society Conference Graph-Based Anomaly Detection Applied to Homeland Security Cargo Screening William Eberle Department

More information

Discovering Structural Anomalies in Graph-Based Data

Discovering Structural Anomalies in Graph-Based Data Discovering Structural nomalies in Graph-ased Data William Eberle Tennessee Technological University weberle@tntech.edu Lawrence Holder Washingt State University holder@wsu.edu bstract The ability to mine

More information

Diane J. Cook Department of Computer Science Engineering 303 Nedderman Hall University of Texas at Arlington Arlington, TX (817)

Diane J. Cook Department of Computer Science Engineering 303 Nedderman Hall University of Texas at Arlington Arlington, TX (817) Graph-Based Anomaly Detection Caleb C. Noble Department of Computer Science Engineering Nedderman Hall University of Texas at Arlington Arlington, TX 79 (87)7-9 noble@cse.uta.edu Diane J. Cook Department

More information

Learning Graph Grammars

Learning Graph Grammars Learning Graph Grammars 1 Aly El Gamal ECE Department and Coordinated Science Laboratory University of Illinois at Urbana-Champaign Abstract Discovery of patterns in a given graph - in the form of repeated

More information

GREW A Scalable Frequent Subgraph Discovery Algorithm

GREW A Scalable Frequent Subgraph Discovery Algorithm GREW A Scalable Frequent Subgraph Discovery Algorithm Michihiro Kuramochi and George Karypis Department of Computer Science and Engineering/ Army High Performance Computing Research Center/ Digital Technology

More information

Knowledge Discovery from Transportation Network Data

Knowledge Discovery from Transportation Network Data Knowledge Discovery from Transportation Network Data Paper Review Jiang, W., Vaidya, J., Balaporia, Z., Clifton, C., and Banich, B. Knowledge Discovery from Transportation Network Data. In ICDE, 2005 1

More information

Iliya Mitov 1, Krassimira Ivanova 1, Benoit Depaire 2, Koen Vanhoof 2

Iliya Mitov 1, Krassimira Ivanova 1, Benoit Depaire 2, Koen Vanhoof 2 Iliya Mitov 1, Krassimira Ivanova 1, Benoit Depaire 2, Koen Vanhoof 2 1: Institute of Mathematics and Informatics BAS, Sofia, Bulgaria 2: Hasselt University, Belgium 1 st Int. Conf. IMMM, 23-29.10.2011,

More information

9.1. Graph Mining, Social Network Analysis, and Multirelational Data Mining. Graph Mining

9.1. Graph Mining, Social Network Analysis, and Multirelational Data Mining. Graph Mining 9 Graph Mining, Social Network Analysis, and Multirelational Data Mining 9.1 We have studied frequent-itemset mining in Chapter 5 and sequential-pattern mining in Section 3 of Chapter 8. Many scientific

More information

Web Search Using a Graph-Based Discovery System

Web Search Using a Graph-Based Discovery System From: FLAIRS-01 Proceedings. Copyright 2001, AAAI (www.aaai.org). All rights reserved. Structural Web Search Using a Graph-Based Discovery System Nifish Manocha, Diane J. Cook, and Lawrence B. Holder Department

More information

Chapters 11 and 13, Graph Data Mining

Chapters 11 and 13, Graph Data Mining CSI 4352, Introduction to Data Mining Chapters 11 and 13, Graph Data Mining Young-Rae Cho Associate Professor Department of Computer Science Balor Universit Graph Representation Graph An ordered pair GV,E

More information

Graph-Based Approaches to Insider Threat Detection

Graph-Based Approaches to Insider Threat Detection Graph-Based Approaches to Insider Threat Detection William Eberle Department of Computer Science Tennessee Technological University Cookeville, TN USA weberle@tntech.edu Lawrence Holder School of Elec.

More information

Graph-Based Data Mining

Graph-Based Data Mining D A T A M I N I N G Graph-Based Data Mining Diane J. Cook and Lawrence B. Holder, University of Texas at Arlington THE LARGE AMOUNT OF DATA collected today is quickly overwhelming researchers abilities

More information

Data Mining. 3.5 Lazy Learners (Instance-Based Learners) Fall Instructor: Dr. Masoud Yaghini. Lazy Learners

Data Mining. 3.5 Lazy Learners (Instance-Based Learners) Fall Instructor: Dr. Masoud Yaghini. Lazy Learners Data Mining 3.5 (Instance-Based Learners) Fall 2008 Instructor: Dr. Masoud Yaghini Outline Introduction k-nearest-neighbor Classifiers References Introduction Introduction Lazy vs. eager learning Eager

More information

Canonical Forms for Frequent Graph Mining

Canonical Forms for Frequent Graph Mining Canonical Forms for Frequent Graph Mining Christian Borgelt Dept. of Knowledge Processing and Language Engineering Otto-von-Guericke-University of Magdeburg borgelt@iws.cs.uni-magdeburg.de Summary. A core

More information

EMPIRICAL COMPARISON OF GRAPH CLASSIFICATION AND REGRESSION ALGORITHMS. By NIKHIL S. KETKAR

EMPIRICAL COMPARISON OF GRAPH CLASSIFICATION AND REGRESSION ALGORITHMS. By NIKHIL S. KETKAR EMPIRICAL COMPARISON OF GRAPH CLASSIFICATION AND REGRESSION ALGORITHMS By NIKHIL S. KETKAR A dissertation submitted in partial fulfillment of the requirements for the degree of DOCTORATE OF PHILOSOPHY

More information

Temporal and Structural Analysis of Biological Networks in Combination with Microarray Data

Temporal and Structural Analysis of Biological Networks in Combination with Microarray Data Temporal and Structural Analysis of Biological Networks in Combination with Microarray Data Chang Hun You, Lawrence B. Holder and Diane J. Cook Abstract We introduce a graph-based relational learning approach

More information

FP-GROWTH BASED NEW NORMALIZATION TECHNIQUE FOR SUBGRAPH RANKING

FP-GROWTH BASED NEW NORMALIZATION TECHNIQUE FOR SUBGRAPH RANKING FP-GROWTH BASED NEW NORMALIZATION TECHNIQUE FOR SUBGRAPH RANKING E.R.Naganathan 1 S.Narayanan 2 K.Ramesh kumar 3 1 Department of Computer Applications, Velammal Engineering College Ambattur-Redhills Road,

More information

Privacy-Preserving Subgraph Discovery

Privacy-Preserving Subgraph Discovery Privacy-Preserving Subgraph Discovery Danish Mehmood 1, Basit Shafiq 1,2, Jaideep Vaidya 2, Yuan Hong 2, Nabil Adam 2, and Vijayalakshmi Atluri 2 1 Lahore University of Management Sciences, Pakistan 2

More information

Mining Top K Large Structural Patterns in a Massive Network

Mining Top K Large Structural Patterns in a Massive Network Mining Top K Large Structural Patterns in a Massive Network Feida Zhu Singapore Management University fdzhu@smu.edu.sg Xifeng Yan University of California at Santa Barbara xyan@cs.ucsb.edu Qiang Qu Peking

More information

GRAPH MINING AND GRAPH KERNELS

GRAPH MINING AND GRAPH KERNELS GRAPH MINING AND GRAPH KERNELS Part I: Graph Mining Karsten Borgwardt^ and Xifeng Yan* ^University of Cambridge *IBM T. J. Watson Research Center August 24, 2008 ACM SIG KDD, Las Vegas Graphs Are Everywhere

More information

Frequent Pattern-Growth Approach for Document Organization

Frequent Pattern-Growth Approach for Document Organization Frequent Pattern-Growth Approach for Document Organization Monika Akbar Department of Computer Science Virginia Tech, Blacksburg, VA 246, USA. amonika@cs.vt.edu Rafal A. Angryk Department of Computer Science

More information

Epilog: Further Topics

Epilog: Further Topics Ludwig-Maximilians-Universität München Institut für Informatik Lehr- und Forschungseinheit für Datenbanksysteme Knowledge Discovery in Databases SS 2016 Epilog: Further Topics Lecture: Prof. Dr. Thomas

More information

Dynamic Graph-based Relational Learning of Temporal Patterns in Biological Networks Changing over Time

Dynamic Graph-based Relational Learning of Temporal Patterns in Biological Networks Changing over Time Dynamic Graph-based Relational Learning of Temporal Patterns in Biological Networks Changing over Time Chang Hun You, Lawrence B. Holder and Diane J. Cook School of Electrical Engineering & Computer Science,

More information

Discovering Frequent Topological Structures from Graph Datasets

Discovering Frequent Topological Structures from Graph Datasets Discovering Frequent Topological Structures from Graph Datasets R. Jin C. Wang D. Polshakov S. Parthasarathy G. Agrawal Department of Computer Science and Engineering Ohio State University, Columbus OH

More information

Coupling Two Complementary Knowledge Discovery Systems *

Coupling Two Complementary Knowledge Discovery Systems * From: Proceedings of the Eleventh nternational FLARS Conference. Copyright 1998, AAA (www.aaai.org). All rights reserved. Coupling Two Complementary Knowledge Discovery Systems * Lawrence B. Holder and

More information

Survey on Graph Query Processing on Graph Database. Presented by FAN Zhe

Survey on Graph Query Processing on Graph Database. Presented by FAN Zhe Survey on Graph Query Processing on Graph Database Presented by FA Zhe utline Introduction of Graph and Graph Database. Background of Subgraph Isomorphism. Background of Subgraph Query Processing. Background

More information

Robust Automatic Traffic Signs Recognition Using Fast Polygonal Approximation of Digital Curves and Neural Network

Robust Automatic Traffic Signs Recognition Using Fast Polygonal Approximation of Digital Curves and Neural Network Robust Automatic Traffic Signs Recogniti Using Fast Polygal Approximati of Digital Curves and Neural Network Abderrahim SALHI, Brahim MINAOUI 2, Mohamed FAKIR 3 Informati Processing and Decisi Laboratory,

More information

Handling of Numeric Ranges for Graph-Based Knowledge Discovery

Handling of Numeric Ranges for Graph-Based Knowledge Discovery Handling of Numeric Ranges for Graph-Based Knowledge Discovery Abstract Discovering interesting patterns from structural domains is an important task in many real world domains. In recent years, graph-based

More information

MINING AND SEARCHING GRAPHS AND STRUCTURES

MINING AND SEARCHING GRAPHS AND STRUCTURES MINING AND SEARCHING GRAPHS AND STRUCTURES Jiawei Han Xifeng Yan Department of Computer Science University of Illinois at Urbana-Champaign Philip S. Yu IBM T. J. Watson Research Center http://ews.uiuc.edu/~xyan/tutorial/kdd06_graph.htm

More information

What is Learning? CS 343: Artificial Intelligence Machine Learning. Raymond J. Mooney. Problem Solving / Planning / Control.

What is Learning? CS 343: Artificial Intelligence Machine Learning. Raymond J. Mooney. Problem Solving / Planning / Control. What is Learning? CS 343: Artificial Intelligence Machine Learning Herbert Simon: Learning is any process by which a system improves performance from experience. What is the task? Classification Problem

More information

D. BOONE/CORBIS. Structural Web Search Using a Graph-Based Discovery System. 20 Spring 2001 intelligence

D. BOONE/CORBIS. Structural Web Search Using a Graph-Based Discovery System. 20 Spring 2001 intelligence D. BOONE/CORBIS Structural Web Search Using a Graph-Based Discovery System 20 Spring 2001 intelligence Cover Story Our structural search engine searches not only for particular words or topics, but also

More information

Dynamic Leadership Protocol for S-nets

Dynamic Leadership Protocol for S-nets Dynamic Leadership Protocol for S-nets Gregory J. Barlow 1, Thomas C. Henders 2, Andrew L. Nels 1, and Edward Grant 1 1 Center for Robotics and Intelligent Machines, Department of Electrical and Computer

More information

Les Cahiers du GERAD ISSN:

Les Cahiers du GERAD ISSN: Les Cahiers du GERAD ISSN: 0711 2440 SyGMA: Reducing Symmetry in Graph Mining C. Desrosiers, Ph. Galinier, P. Hansen, A. Hertz G 2007 12 February 2007 Revised: February 2008 Les textes publiés dans la

More information

Relational Learning. Jan Struyf and Hendrik Blockeel

Relational Learning. Jan Struyf and Hendrik Blockeel Relational Learning Jan Struyf and Hendrik Blockeel Dept. of Computer Science, Katholieke Universiteit Leuven Celestijnenlaan 200A, 3001 Leuven, Belgium 1 Problem definition Relational learning refers

More information

Graph Grammars for Super Mario Bros

Graph Grammars for Super Mario Bros Graph Grammars for Super Mario Bros Levels Santiago Londoño University of Bonn londono@cs.uni-bonn.de Olana Missura University of Bonn olana.missura@uni-bonn.de ABSTRACT We assume that the structure of

More information

9/29/13. Outline Data mining tasks. Clustering algorithms. Applications of clustering in biology

9/29/13. Outline Data mining tasks. Clustering algorithms. Applications of clustering in biology 9/9/ I9 Introduction to Bioinformatics, Clustering algorithms Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Outline Data mining tasks Predictive tasks vs descriptive tasks Example

More information

Chapter 4: Mining Frequent Patterns, Associations and Correlations

Chapter 4: Mining Frequent Patterns, Associations and Correlations Chapter 4: Mining Frequent Patterns, Associations and Correlations 4.1 Basic Concepts 4.2 Frequent Itemset Mining Methods 4.3 Which Patterns Are Interesting? Pattern Evaluation Methods 4.4 Summary Frequent

More information

Mining Significant Graph Patterns by Leap Search

Mining Significant Graph Patterns by Leap Search Mining Significant Graph Patterns by Leap Search Xifeng Yan (IBM T. J. Watson) Hong Cheng, Jiawei Han (UIUC) Philip S. Yu (UIC) Graphs Are Everywhere Magwene et al. Genome Biology 2004 5:R100 Co-expression

More information

Hierarchical Clustering of Process Schemas

Hierarchical Clustering of Process Schemas Hierarchical Clustering of Process Schemas Claudia Diamantini, Domenico Potena Dipartimento di Ingegneria Informatica, Gestionale e dell'automazione M. Panti, Università Politecnica delle Marche - via

More information

Association Rule Mining. Entscheidungsunterstützungssysteme

Association Rule Mining. Entscheidungsunterstützungssysteme Association Rule Mining Entscheidungsunterstützungssysteme Frequent Pattern Analysis Frequent pattern: a pattern (a set of items, subsequences, substructures, etc.) that occurs frequently in a data set

More information

Nesnelerin İnternetinde Veri Analizi

Nesnelerin İnternetinde Veri Analizi Nesnelerin İnternetinde Veri Analizi Bölüm 3. Classification in Data Streams w3.gazi.edu.tr/~suatozdemir Supervised vs. Unsupervised Learning (1) Supervised learning (classification) Supervision: The training

More information

COMPARATIVE STUDY OF EXISTING PLAGIARISM DETECTION SYSTEMS AND DESIGN OF A NEW WEB BASED SYSTEM NEERAJA JOSHUA SAMUEL. Bachelor of Commerce

COMPARATIVE STUDY OF EXISTING PLAGIARISM DETECTION SYSTEMS AND DESIGN OF A NEW WEB BASED SYSTEM NEERAJA JOSHUA SAMUEL. Bachelor of Commerce COMPARATIVE STUDY OF EXISTING PLAGIARISM DETECTION SYSTEMS AND DESIGN OF A NEW WEB BASED SYSTEM By NEERAJA JOSHUA SAMUEL Bachelor of Commerce University of Delhi New Delhi, India 1985 Submitted to the

More information

gprune: A Constraint Pushing Framework for Graph Pattern Mining

gprune: A Constraint Pushing Framework for Graph Pattern Mining gprune: A Constraint Pushing Framework for Graph Pattern Mining Feida Zhu Xifeng Yan Jiawei Han Philip S. Yu Computer Science, UIUC, {feidazhu,xyan,hanj}@cs.uiuc.edu IBM T. J. Watson Research Center, psyu@us.ibm.com

More information

A New Approach To Graph Based Object Classification On Images

A New Approach To Graph Based Object Classification On Images A New Approach To Graph Based Object Classification On Images Sandhya S Krishnan,Kavitha V K P.G Scholar, Dept of CSE, BMCE, Kollam, Kerala, India Sandhya4parvathy@gmail.com Abstract: The main idea of

More information

Enhancing Structure Discovery for Data Mining in Graphical Databases Using Evolutionary Programming

Enhancing Structure Discovery for Data Mining in Graphical Databases Using Evolutionary Programming From: FLAIRS-02 Proceedings. Copyright 2002, AAAI (www.aaai.org). All rights reserved. Enhancing Structure Discovery for Data Mining in Graphical Databases Using Evolutionary Programming Sanghamitra Bandyopadhyay,

More information

Edgar: the Embedding-baseD GrAph MineR

Edgar: the Embedding-baseD GrAph MineR Edgar: the Embedding-baseD GrAph MineR Marc Wörlein, 1 Alexander Dreweke, 1 Thorsten Meinl, 2 Ingrid Fischer 2, and Michael Philippsen 1 1 University of Erlangen-Nuremberg, Computer Science Department

More information

Edgar: the Embedding-baseD GrAph MineR

Edgar: the Embedding-baseD GrAph MineR Edgar: the Embedding-baseD GrAph MineR Marc Wörlein, 1 Alexander Dreweke, 1 Thorsten Meinl, 2 Ingrid Fischer 2, and Michael Philippsen 1 1 University of Erlangen-Nuremberg, Computer Science Department

More information

Knowledge Discovery from Transportation Network Data

Knowledge Discovery from Transportation Network Data Knowledge Discovery from Transportation Network Data Wei Jiang Purdue University 25 N. University St. W. Lafayette, IN 4797-266 wjiang@cs.purdue.edu Chris Clifton Purdue University 25 N. University St.

More information

Modularity CMSC 858L

Modularity CMSC 858L Modularity CMSC 858L Module-detection for Function Prediction Biological networks generally modular (Hartwell+, 1999) We can try to find the modules within a network. Once we find modules, we can look

More information

Graph Mining: Repository vs. Canonical Form

Graph Mining: Repository vs. Canonical Form Graph Mining: Repository vs. Canonical Form Christian Borgelt and Mathias Fiedler European Center for Soft Computing c/ Gonzalo Gutiérrez Quirós s/n, 336 Mieres, Spain christian.borgelt@softcomputing.es,

More information

Spectral Methods for Network Community Detection and Graph Partitioning

Spectral Methods for Network Community Detection and Graph Partitioning Spectral Methods for Network Community Detection and Graph Partitioning M. E. J. Newman Department of Physics, University of Michigan Presenters: Yunqi Guo Xueyin Yu Yuanqi Li 1 Outline: Community Detection

More information

CMPUT 391 Database Management Systems. Data Mining. Textbook: Chapter (without 17.10)

CMPUT 391 Database Management Systems. Data Mining. Textbook: Chapter (without 17.10) CMPUT 391 Database Management Systems Data Mining Textbook: Chapter 17.7-17.11 (without 17.10) University of Alberta 1 Overview Motivation KDD and Data Mining Association Rules Clustering Classification

More information

Directed Search for Generalized Plans Using Classical Planners

Directed Search for Generalized Plans Using Classical Planners Directed Search for Generalized Plans Using Classical Planners Siddharth Srivastava and Neil Immerman and Shlomo Zilberstein Department of Computer Science University of Massachusetts Amherst Tianjiao

More information

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. Springer

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. Springer Bing Liu Web Data Mining Exploring Hyperlinks, Contents, and Usage Data With 177 Figures Springer Table of Contents 1. Introduction 1 1.1. What is the World Wide Web? 1 1.2. A Brief History of the Web

More information

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p.

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. 6 What is Web Mining? p. 6 Summary of Chapters p. 8 How

More information

Pattern Mining. Knowledge Discovery and Data Mining 1. Roman Kern KTI, TU Graz. Roman Kern (KTI, TU Graz) Pattern Mining / 42

Pattern Mining. Knowledge Discovery and Data Mining 1. Roman Kern KTI, TU Graz. Roman Kern (KTI, TU Graz) Pattern Mining / 42 Pattern Mining Knowledge Discovery and Data Mining 1 Roman Kern KTI, TU Graz 2016-01-14 Roman Kern (KTI, TU Graz) Pattern Mining 2016-01-14 1 / 42 Outline 1 Introduction 2 Apriori Algorithm 3 FP-Growth

More information

Genetic Programming for Design Grammar Rule Induction

Genetic Programming for Design Grammar Rule Induction Genetic Programming for Design Grammar Rule Induction Julian R. Eichhoff and Dieter Roller Institute of Computer-aided Product Development Systems, University of Stuttgart Universitätsstr. 38, 70569 Stuttgart,

More information

DATA MINING II - 1DL460

DATA MINING II - 1DL460 DATA MINING II - 1DL460 Spring 2013 " An second class in data mining http://www.it.uu.se/edu/course/homepage/infoutv2/vt13 Kjell Orsborn Uppsala Database Laboratory Department of Information Technology,

More information

Graph Mining and Social Network Analysis

Graph Mining and Social Network Analysis Graph Mining and Social Network Analysis Data Mining and Text Mining (UIC 583 @ Politecnico di Milano) References q Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques", The Morgan Kaufmann

More information

Graph Pattern Mining

Graph Pattern Mining : Lecture VIII Graph Pattern Mining Computer Science Department Data Mining Research Nov 26, 2014 Announcement No Homework Slides available at www.cs.ucsb.edu/~xyan/classes/ns201 Two Quizzes (Dec 3, 10),

More information

Decision Tree CE-717 : Machine Learning Sharif University of Technology

Decision Tree CE-717 : Machine Learning Sharif University of Technology Decision Tree CE-717 : Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Some slides have been adapted from: Prof. Tom Mitchell Decision tree Approximating functions of usually discrete

More information

Web Data Extraction. Craig Knoblock University of Southern California. This presentation is based on slides prepared by Ion Muslea and Kristina Lerman

Web Data Extraction. Craig Knoblock University of Southern California. This presentation is based on slides prepared by Ion Muslea and Kristina Lerman Web Data Extraction Craig Knoblock University of Southern California This presentation is based on slides prepared by Ion Muslea and Kristina Lerman Extracting Data from Semistructured Sources NAME Casablanca

More information

Subgraph Frequencies:

Subgraph Frequencies: Subgraph Frequencies: The Empirical and Extremal Geography of Large Graph Collections Johan Ugander, Lars Backstrom, Jon Kleinberg World Wide Web Conference May 16, 2013 Graph collections Neighborhoods:

More information

Data Mining Course Overview

Data Mining Course Overview Data Mining Course Overview 1 Data Mining Overview Understanding Data Classification: Decision Trees and Bayesian classifiers, ANN, SVM Association Rules Mining: APriori, FP-growth Clustering: Hierarchical

More information