MINING GRAPH DATA EDITED BY. Diane J. Cook School of Electrical Engineering and Computei' Science Washington State University Puliman, Washington

Similar documents
MINING GRAPH DATA EDITED BY. Diane J. Cook. Lawrence B. Holder WILEY-INTERSCIENCE A JOHN WILEY & SONS, INC., PUBLICATION

Data Mining in Bioinformatics Day 3: Graph Mining

In Mathematics and computer science, the study of graphs is graph theory where graphs are data structures used to model

Graph-based Learning. Larry Holder Computer Science and Engineering University of Texas at Arlington

Subdue: Compression-Based Frequent Pattern Discovery in Graph Data

Data Mining in Bioinformatics Day 5: Graph Mining

Extraction of Frequent Subgraph from Graph Database

Pattern Mining in Frequent Dynamic Subgraphs

Managing and Mining Graph Data

Graph Mining Sub Domains and a Framework for Indexing A Graphical Approach

Using a Hash-Based Method for Apriori-Based Graph Mining

Data Mining in Bioinformatics Day 5: Frequent Subgraph Mining

Graph-based Relational Learning with Application to Security

A Roadmap to an Enhanced Graph Based Data mining Approach for Multi-Relational Data mining

Using Graphs to Improve Activity Prediction in Smart Environments based on Motion Sensor Data

Part I: Data Mining Foundations

Graph-Based Hierarchical Conceptual Clustering

EMPIRICAL COMPARISON OF GRAPH CLASSIFICATION AND REGRESSION ALGORITHMS. By NIKHIL S. KETKAR

Hierarchical Clustering of Process Schemas

Efficient homomorphism-free enumeration of conjunctive queries

A New Approach To Graph Based Object Classification On Images

Discovering Frequent Geometric Subgraphs

Contents. Preface to the Second Edition

Innovative Study to the Graph-based Data Mining: Application of the Data Mining

Mining Minimal Contrast Subgraph Patterns

AC-Close: Efficiently Mining Approximate Closed Itemsets by Core Pattern Recovery

Relational Learning. Jan Struyf and Hendrik Blockeel

Mining frequent Closed Graph Pattern

gspan: Graph-Based Substructure Pattern Mining

Discovering Frequent Topological Structures from Graph Datasets

MARGIN: Maximal Frequent Subgraph Mining Λ

TASK SCHEDULING FOR PARALLEL SYSTEMS

Monotone Constraints in Frequent Tree Mining

Mining Interesting Itemsets in Graph Datasets

High-Performance Parallel Database Processing and Grid Databases

Stochastic propositionalization of relational data using aggregates

Data Mining: Concepts and Techniques. Graph Mining. Graphs are Everywhere. Why Graph Mining? Chapter Graph mining

Graph-Based Concept Learning

GRAPH MINING AND GRAPH KERNELS

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. Springer

Numeric Ranges Handling for Graph Based Knowledge Discovery Oscar E. Romero A., Jesús A. González B., Lawrence B. Holder

Edgar: the Embedding-baseD GrAph MineR

Canonical Forms for Frequent Graph Mining

Edgar: the Embedding-baseD GrAph MineR

RELATIONAL APPROACH TO MODELING AND IMPLEMENTING SUBTLE ASPECTS OF GRAPH MINING

Contents. Foreword to Second Edition. Acknowledgments About the Authors

Finding Neighbor Communities in the Web using Inter-Site Graph

Min-Hashing for Probabilistic Frequent Subtree Feature Spaces

FP-GROWTH BASED NEW NORMALIZATION TECHNIQUE FOR SUBGRAPH RANKING

Epilog: Further Topics

Parallelizing Frequent Itemset Mining with FP-Trees

MINING AND SEARCHING GRAPHS AND STRUCTURES

Mining Significant Graph Patterns by Leap Search

Graph-Based Anomaly Detection Applied to Homeland Security Cargo Screening

Frequent Pattern-Growth Approach for Document Organization

Graph-Based Hierarchical Conceptual Clustering

GRAPH MINING AND GRAPH KERNELS

The Discovery and Retrieval of Temporal Rules in Interval Sequence Data

Exploring graph mining approaches for dynamic heterogeneous networks

Semantic Web Technologies Trends and Research in Ontology-based Systems

An Apriori-Based Algorithm for Mining Frequent Substructures from Graph Data

DETERMINISTIC OPERATIONS RESEARCH

Iliya Mitov 1, Krassimira Ivanova 1, Benoit Depaire 2, Koen Vanhoof 2

Self-Organization in Sensor and Actor Networks

Data Clustering in C++

Introduction to Statistical Relational Learning

Symbol Detection Using Region Adjacency Graphs and Integer Linear Programming

Semi-supervised Learning

International Journal of Advanced Research in Computer Science and Software Engineering

CODENSE v

Mining Frequent Closed Unordered Trees Through Natural Representations

CS6220: DATA MINING TECHNIQUES

Knowledge Discovery from Transportation Network Data

Fundamentals of Digital Image Processing

GDClust: A Graph-Based Document Clustering Technique

Manifold Learning Theory and Applications

GREW A Scalable Frequent Subgraph Discovery Algorithm

Preface MOTIVATION ORGANIZATION OF THE BOOK. Section 1: Basic Concepts of Graph Theory

Business Intelligence Roadmap HDT923 Three Days

Graph Mining and Social Network Analysis

Classification in Dynamic Streaming Networks

9.1. Graph Mining, Social Network Analysis, and Multirelational Data Mining. Graph Mining

Performance Based Study of Association Rule Algorithms On Voter DB

Privacy-Preserving. Introduction to. Data Publishing. Concepts and Techniques. Benjamin C. M. Fung, Ke Wang, Chapman & Hall/CRC. S.

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p.

Mining Query-Based Subnetwork Outliers in Heterogeneous Information Networks

A Graph-Based Approach for Mining Closed Large Itemsets

MINING FREQUENT MAX AND CLOSED SEQUENTIAL PATTERNS

GRAPH-BASED HIERARCHICAL CONCEPTUAL CLUSTERING

A Hierarchical Document Clustering Approach with Frequent Itemsets

Graph Matching. Filtering Databases of Graphs Using Machine Learning Techniques

Efficient Subtree Inclusion Testing in Subtree Discovering Applications

Dynamic Graph-based Relational Learning of Temporal Patterns in Biological Networks Changing over Time

Comparison of FP tree and Apriori Algorithm

Table Of Contents: xix Foreword to Second Edition

Link Mining & Entity Resolution. Lise Getoor University of Maryland, College Park

Pattern Recognition Using Graph Theory

gprune: A Constraint Pushing Framework for Graph Pattern Mining

Parallelization of Graph Isomorphism using OpenMP

Epipolar Geometry in Stereo, Motion and Object Recognition

Transcription:

MINING GRAPH DATA EDITED BY Diane J. Cook School of Electrical Engineering and Computei' Science Washington State University Puliman, Washington Lawrence B. Holder School of Electrical Engineering and Computer Science Washington State University Puliman, Washington BICINTlUNIfl Jl 1 1 8 O 7 l WILEY: ; 200 7 <n\ ftlcintinhial WILEY-INTERSC1ENCE A JOHN WILEY & SONS, INC., PUBLICATION

CONTENTS Preface Acknowledgments Contributors xiii xv xvn 1 INTRODUCTION 1 Lawrence B. Holder and Diane J. Cook 1.1 Terminology 2 1.2 Graph Databases 3 i.3 Book Overview 10 References 11 Part I GRAPHS 15 2 GRAPH MATCHING EXACT AND ERROR-TOLERANT METHODS AND THE AUTOMATIC LEARNING OF EDIT COSTS 17 Horst Bunke and Michel Neuhaus 2.1 Introductkm 17 2.2 Definitions and Graph Matching Methods 18 2.3 Learning Edit Costs 24 2.4 Experimental Evaluation 28 2.5 Discussion and Conelusions 31 References 32 3 GRAPH VISUALIZATION AND DATA MINING 35 Walter Didimo and Giuseppe Liotta 3.1 Introduction 35 3.2 Graph Drawing Techniques 38 3.3 Examples of Visualization Systems 48 vii

viii CONTENTS 3.4 Conclusions 55 Rer'erences 57 4 GRAPH PATTERNS AND THE R-MAT GENERATOR 65 Deepayan Chakrabarti and Christas Faioutsos 4.1 Introdüction 65 4.2 Background and Related Work 67 4.3 NetMine and R-MAT 79 4.4 Experiments 82 4.5 Conciusions 86 References 92 Part II MINING TECHNIQUES 97 5 DISCOVERY OF FREQUENT SUBSTRUCTURES 99 Xifeng Yan and Jiawei Han 5.1 Introdüction 99 5.2 Preliminary Concepts 100 5.3 Apriori4)ased Approach 101 5.4 Pattern Growth Approach 103 5.5 Variant Substrueture Patterns 107 5.6 Experiments and Performance Study 109 5.7 Conclusions 112 References 113 6 FINDING TOPOLOGICAL FREQUENT PATTERNS FROM GRAPH DATASETS 117 Michihiro Kuramochi and George Karypis 6.1 Introdüction 117 6.2 Background Definitions and Notation 118 6.3 Frequent Pattern Discovery from Graph Datasets Problem Definitions 122 6.4 FSG for the Graph-Transaction Setting 127 6.5 S-GKAM tbr the Single-Graph Setting 131 6.6 GREW Scalable Frequem Subgraph Discovery Algorithm 141 6.7 Related Research 149 6.8 Conclusions 151 References 154 7 UNSUPERVISED AND SUPERVISED PATTERN LEARNING IN GRAPH DATA 159 Diane J. Cook, Lawrence B. Holder, and Nikhil Keikar 7.1 Introdüction 159

CONTENTS ix 7.2 Mining Graph Data Using Subdue 160 7.3 Comparison to Other Graph-Based Mining Algorithms 165 7.4 Comparison to Frequent Substructure Mining Approaches 165 7.5 Comparison to ILP Approaches 170 7.6 Conclusions 179 References 179 8 GRAPH GRAMMAR LEARNING 183 Istvan Jonyer 8.1 Introduction 183 8.2 Related Work 184 8.3 Graph Grammar Learning 185 8.4 Empirical Evaluation 193 8.5 Conclusion 199 References 199 9 CONSTRUCTING DECISION TREE BASED ON CHUNKINGLESS GRAPH-BASED INDUCTION 203 Kouzou Ohara, Phu Chien Nguyen, Akira Mogi, Hiroshi Motoda, and Takashi Washio 9.1 Introduction 203 9.2 Graph-Based Induction Revisited 205 9.3 Problem Caused by Chunking in B-GBI 207 9.4 Chunkingless Graph-Based Induction (Cl-GBI) 208 9.5 Decision Tree Chunkingless Graph-Based Induction (DT-C1GBI) 214 9.6 Conclusions 224 References 224 10 SOWIE LINKS BETWEEN FORMAL CONCEPT ANALYSIS AND GRAPH MINING 227 Michel Liquiere 10.1 Presentation 227 10.2 Basic Concepts and Notation 228 10.3 Formal Concept Analysis 229 10.4 Extension Lattice and Description Lattice Give Concept Lattice 231 10.5 Graph Description and Galois Lattice 235 10.6 Graph Mining and Formal Propositionalization 240 10.7 Conclusion 249 References 250

X CONTENTS 11 KERNEL METHODS FOR GRAPHS 253 Thomas Gärtner, Tamäs Horvath, Quoc V, Le, Alex J. Smola, and Stefan Wrobel 11.1 Introduction 11.2 Graph Classification 11.3 Vertex Classification 11.4 Conclusions and Future Work References 253 254 266 279 280 12 KERNELS AS LINK ANALYSIS MEASURES Masashi Shimbo and Takahiko Ito 12.1 Introduction 12.2 Preliminaries 12.3 Kernel-based Unified Framework for Importance and Relatedness 12.4 Laplacian Kernels as a Relatedness Measure 12.5 Practical Issues 12.6 Related Work 12.7 Evaluation with Bibliographie Citation Data 12,8 Summary References 283 283 284 286 290 297 299 300 308 308 13 ENTITY RESOLUTION IN GRAPHS 311 Indrajit Bhattacharya and Lise Getoor 13.1 Introduction 311 13.2 Related Work 314 13.3 Motivating Example for Graph-Based Entity Resolution 318 13.4 Graph-Based Entity Resolution: Problem Formulation 322 13.5 Similarity Measures for Entity Resolution 325 13.6 Graph-Based Clustering for Entity Resolution 330 13.7 Experimental Evaluation 333!3.8 Conclusion 341 References 342 Part III APPLICATIONS 345 14 MINING FROM CHEMICAL GRAPHS 347 Takashi Okada 14.1 Introduction and Representation of Molecules 347 14.2 Issues for Mining 355 14.3 CASE: A Prototype Mining System in Chemistry 356 14.4 Quantitative Estimation Using Graph Mining 358 14.5 Extension of Linear Fragments to Graphs 362

CONTENTS xi 14.6 Combination of Conditions 366 14.7 Concluding Remarks 375 References 377 15 UNIFIED APPROACH TO ROOTED TREE MINING: ALGORITHMS AND APPLICATIONS 381 Mohammed Zaki 15.1 Introducüon 381 15.2 Preliminaries 382 15.3 Related Work 384 15.4 Generaüng Candidate Subtrees 385 15.5 Frequency Computation 392 15.6 Counting Distinct Occurrences 397 15.7 The SLEUTH Algorithm 399 15.8 Experimental Results 401 15.9 Tree Mining Applications in Bioinformatics 405 15.10 Conclusions 409 References 409 16 DENSE SUBGRAPH EXTRACTION 411 Andrew Tomkins and Ravi Kumar 6.1 6.2 6.3 6.4 6.5 6.6 6.7 Introduction Related Work Finding the densest subgraph Trawling Graph Shingling Connection Subgraphs Conclusions References 411 414 416 418 421 429 438 438 17 SOCIAL NETWORK ANALYSIS 443 Sherry E. Marcus, Melanie Moy, and Thayne Cojfman 17.1 Introduction 443 17.2 Social Network Analysis 443 17.3 Group Detection 452 17.4 Terrorist Modus Operandi Detection System 452 17.5 Computational Experiments 465 17.6 Conclusion 467 References 468 Index 469