Data Mining for Association Rules and Sequential Patterns

Size: px
Start display at page:

Download "Data Mining for Association Rules and Sequential Patterns"

Transcription

1 Data Mining for Association Rules and Sequential Patterns

2 Springer-Science+Business Media, LLC

3 Jean-Marc Adamo Data Mining for Association Rules and Sequential Patterns Sequential and Parallel Algorithms With 54 Illustrations, Springer

4 Jean-Marc Adamo Universile de Lyon 43 bd. II novembre 1918 Bat. 308, B.P Vi lleurbanne, cedelt France Ubrary ofcongress Cataloging-in-Publication Data Adamo, Jean-Marc, Data mining for association rule.s and sequential pancms : sequential and par.lllel algorithms I Jean-Marc Adamo p. cm. Includes bibliographical rcferem:e.s and index. ISBN ISBN (ebook) DOI / Data mining. 2. Col1ţlute r algorithms. 1. Title. QA76.9.D343A dc2\ (}O.{)S6267 Printed on acid-free paper. C 2001 Springe r Sciencc+ Busincss Media New York Originally publishcd by Springel"'-Ve rla g New YOI k, Ine in 2001 Softcover reprint ofthe hardcover l at edition 2001 AII rights reserved. This work may RO( be tmnslated or copied in whole or in pari without the wrinen permission ofme publisher (Springer Science+Business Media, LLC), except for brief excerpts in connection wim rcvicws Of scllolarly analysis. Use in connection with any form of infonnation stomge ancl retrieval, electronic adaptation, computer software, CIr by similar Of dissimilar metllodology now known or hercafter developcd is forbidden. The use ofgeneral descriptive names. U"ade names, trademaooi, etc., in mis publicat ion, even ifme former are RO( e.special1y idenlified, is noi to be laken IS I sign that such names. as unclerstood by the Trade Marks and Merchandise Marks Act, may aecordingly be uscd frecl y byanyonc:. Production manage<! by Steven Pisano; manufacturing supervised by Jeffrey Taub. Camcra-rcady pagts prcpared ITom the authors' Microsoft Word files ISUN

5 Preface Data mining includes a wide range of activities such as classification, clustering, similarity analysis, summarization, association rule and sequential pattern discovery, and so forth. The book focuses on the last two previously listed activities. It provides a unified presentation of algorithms for association rule and sequential pattern discovery. For both mining problems, the presentation relies on the lattice structure of the search space. All algorithms are built as processes running on this structure. Proving their properties takes advantage of the mathematical properties of the structure. Part of the motivation for writing this book was postgraduate teaching. One of the main intentions was to make the book a suitable support for the clear exposition of problems and algorithms as well as a sound base for further discussion and investigation. Since the book only assumes elementary mathematical knowledge in the domains of lattices, combinatorial optimization, probability calculus, and statistics, it is fit for use by undergraduate students as well. The algorithms are described in a C-like pseudo programming language. The computations are shown in great detail. This makes the book also fit for use by implementers: computer scientists in many domains as well as industry engineers. Mining for association rules and sequential patterns is known to be a problem with large computational complexity. Most mining algorithms typically take hours to complete when performed on large real-life datasets. The issue of designing efficient parallel algorithms should be considered as critical. Most algorithms in the book are devised for both sequential and parallel execution. Parallel algorithm design takes advantage of the lattice structure of the search space. Partitioning is performed via lattice recursive bisection. Database partitioning is also used as an additional source of parallelism. The book contains ten chapters including the introduction. Chapter 2 is dedicated to search space partitioning and to mining with partitioned search spaces. Chapter 3 contains a review of all rule-mining algorithms that have been presented thus far in the literature. Chapter 4 extends the search space partition-based algorithm so that it can also deal with taxonomies. Chapter 5 investigates the problem of rule mining under Boolean constraints. Chapter 6 presents a database partitioning method based

6 vi Preface on sampling. Methods for merging the search space and database partitioning are proposed, leading to new sequential and parallel algorithms. Chapter 7 investigates the problem of mining rules with categorical and metric attributes. The latter problem deals with exhaustive enumeration. Another way of drawing useful information from quantitative association rules leads to optimization problems. Chapter 8 proposes a unified presentation of the problems and solutions. Chapter 9 describes new measures aimed at improving the predictive ability of the rules and new algorithms aimed at limiting combinatorial explosion. Finally, Chapter 10 deals with sequential pattern mining. The problem is investigated by using a method similar to the one used for rule mining. The implementation of all algorithms presented in the book is currently under development. The work is carried out on a cluster of SMP machines and uses the ARCH library (run above MPI) as a development tool. Progress reports will be found at the address Acknowledgments The creation of this book benefited from the assistance of a number of people. I am grateful to Yves Kodratoff for reading the manuscript and making constructive comments. I am also grateful to my wife, Monique, for reading the manuscript so professionally and making many comments that helped improve the syntax significantly. Many thanks also go to my editors, Steven Pisano, Wayne Wheeler, and Wayne Yuhasz at Springer-Verlag, for their assistance. The Ecole Supeneure de Chimie Physique et Electronique de Lyon hosted me and provided logistic support and facilities. The cover photograph is by Olivier Lernout, CINES, Montpellier, France. Permission was granted by Olivier Lernout and Alain Quere, Director of CINES. The photograph represents the shot of a largescale sand counter designed by Jean Bernard Metais for counting the time between two eclipses: August 11, 1999 and June 22, The work of art is being exhibited at the Museum National d 'Histoire Naturelle de Paris. The sand flows down multiple holes, thereby creating a series of patterns, which triggers a striking evocation of parallel data mining--one of the main topics of this book. Jean-Marc Adamo

7 I Contents Preface... v 1. Introduction Search Space Partition-Based Rule Mining Problem Statement Canonical Attribute Sequences (cas) Database Support Association Rule Problem Statement Search Space Splitting Procedure Enumerating a-frequent Attribute Sets (cass) Sequential Enumeration Procedure Parallel Enumeration Procedure Initial Load Balancing Computing the Starting Sets Enumeration Procedure Dynamic Load Balancing Generating the Association Rules Sequential Generation Parallel Generation Apriori and Other Algorithms Early Algorithms AlS SETM The Apriori Algorithms Apriori AprioriTid Direct Hashing and Pruning Filtering Candidates... 41

8 viii Contents Database Trimming The DHP Algorithm Dynamic Set Counting Mining for Rules over Attribute Taxonomies Association Ru1es over Taxonomies Problem Statement and Algorithms Pruning Uninteresting Rules Measure o/interest Rule Pruning Algorithm Attribute Presence-Based Pruning Constraint-Based Rule Mining Boolean Constraints Syntax Semantics Propagation o/boolean Constraints Prime Implicants Problem Statement and Algorithms Data Partition-Based Rule Mining Data Partitioning Building a Probabilistic Model Bounding Large Deviations for One cas (Chernoff bounds) Bounding Large Deviations for Sets of cass cas Enumeration with Partitioned Data Data Partitioning Local CF-Frequent cas Generation Global CF-Frequent cas Generation Mining for Rules with Categorical and Metric Attributes Interval Systems and Quantitative Ru1es k-partial Completeness Pruning Uninteresting Rules Measure of Interest Attribute Presence-Based Pruning Enumeration Algorithms Optimizing Rules with Quantitative Attributes Solving I-I-Type Rule Optimization Problems Problem Statement MCISProblem MSIC Problem MG Problem

9 Data Miningfor Association Rules and Sequential Patterns ix 8.2 Solving d-l-type Rule Optimization Problems Solving l-q-type Rule Optimization Problems Problem Statement MSIC Problem MG Problem Solving d-q-type Rule Optimization Problems Problem Statement Basic Enumeration Enumeration with Pruning Pruning the Instantiation Set Beyond Support-Confidence Framework A Criticism ofthe Support-Confidence Framework Conviction Pruning Conviction-Based Rules Analyzing Conviction Transitivity-Based Pruning Improvement-Based Pruning One-Step Association Rule Mining Building a Procedure for One-Step Mining Building a Procedure for Improvement-Based Pruning Correlated Attribute-Set Mining Collective Strength Correlated Attribute-Set Enumeration Refining Conviction: Association Rule Intensity Measure Construction Properties Relating a-int(s.=;> u) to conv(s.=;> u) Mining with the Intensify Measure a-intensify Versus Intensity as Defined in [G96] Search Space Partition-Based Sequential Pattern Mining Problem Statement Sequences of cass Database Support Problem Statement Search Space Splitting the Search Space Splitting Procedure Sequence Enumeration Extending the Support Set Notion Join Operations Sequential Enumeration Procedure Parallel Enumeration Procedure

10 x Contents Appendix 1. Chernoff Bounds Appendix 2. Partitioning in Figure 10.5: Beyond 3rd Power Appendix 3. Partitioning in Figure 10.6: Beyond 3rd Power References Index

Yves Nievergelt. Wavelets Made Easy. Springer Science+Business Media, LLC

Yves Nievergelt. Wavelets Made Easy. Springer Science+Business Media, LLC Wavelets Made Easy Yves Nievergelt Wavelets Made Easy Springer Science+Business Media, LLC Yves Nievergelt Department of Mathematics Eastem Washington University Cheney, WA 99004-2431 USA Library of Congress

More information

THE VERILOG? HARDWARE DESCRIPTION LANGUAGE

THE VERILOG? HARDWARE DESCRIPTION LANGUAGE THE VERILOG? HARDWARE DESCRIPTION LANGUAGE THE VERILOGf HARDWARE DESCRIPTION LANGUAGE by Donald E. Thomas Carnegie Mellon University and Philip R. Moorby Cadence Design Systems, Inc. SPRINGER SCIENCE+BUSINESS

More information

ITSM: An Interactive Time Series Modelling Package for the pe

ITSM: An Interactive Time Series Modelling Package for the pe ITSM: An Interactive Time Series Modelling Package for the pe Peter J. Brockwell Richard A. Davis ITSM: An Interactive Time Series Modelling Package for the pe With 53 Illustrations and 3 Diskettes Written

More information

ARCHITECTURE AND CAD FOR DEEP-SUBMICRON FPGAs

ARCHITECTURE AND CAD FOR DEEP-SUBMICRON FPGAs ARCHITECTURE AND CAD FOR DEEP-SUBMICRON FPGAs THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE ARCHITECTURE AND CAD FOR DEEP-SUBMICRON FPGAs Vaughn Betz Jonathan Rose Alexander Marquardt

More information

Using MSC/NASTRAN: Statics and Dynamics

Using MSC/NASTRAN: Statics and Dynamics Using MSC/NASTRAN: Statics and Dynamics A.D. Cifuentes Using MSC/NASTRAN Statics and Dynamics With 94 Illustrations Springer-Verlag New York Berlin Heidelberg London Paris Tokyo Hong Kong Arturo O. Cifuentes

More information

A Structured Programming Approach to Data

A Structured Programming Approach to Data A Structured Programming Approach to Data Derek Coleman A Structured Programming Approach to Data Springer-Verlag New York Derek Coleman Department of Computation Institute of Science Technology University

More information

Topological Structure and Analysis of Interconnection Networks

Topological Structure and Analysis of Interconnection Networks Topological Structure and Analysis of Interconnection Networks Network Theory and Applications Volume 7 Managing Editors: Ding-Zhu Du, University of Minnesota, U.S.A. and Cauligi Raghavendra, University

More information

Guide to RISC Processors

Guide to RISC Processors Guide to RISC Processors Sivarama P. Dandamudi Guide to RISC Processors for Programmers and Engineers Sivarama P. Dandamudi School of Computer Science Carleton University Ottawa, ON K1S 5B6 Canada sivarama@scs.carleton.ca

More information

Fundamentals of Operating Systems. Fifth Edition

Fundamentals of Operating Systems. Fifth Edition Fundamentals of Operating Systems Fifth Edition Fundamentals of Operating Systems A.M. Lister University of Queensland R. D. Eager University of Kent at Canterbury Fifth Edition Springer Science+Business

More information

PERFORMANCE ANALYSIS OF REAL-TIME EMBEDDED SOFTWARE

PERFORMANCE ANALYSIS OF REAL-TIME EMBEDDED SOFTWARE PERFORMANCE ANALYSIS OF REAL-TIME EMBEDDED SOFTWARE PERFORMANCE ANALYSIS OF REAL-TIME EMBEDDED SOFTWARE Yau-Tsun Steven Li Monterey Design Systems, Inc. Sharad Malik Princeton University ~. " SPRINGER

More information

Contents. Preface to the Second Edition

Contents. Preface to the Second Edition Preface to the Second Edition v 1 Introduction 1 1.1 What Is Data Mining?....................... 4 1.2 Motivating Challenges....................... 5 1.3 The Origins of Data Mining....................

More information

Programming with Turing and Object Oriented Turing

Programming with Turing and Object Oriented Turing Programming with Turing and Object Oriented Turing Peter Grogono Programming with Turing and Object Oriented Turing Springer-Verlag New York Berlin Heidelberg London Paris Tokyo Hong Kong Barcelona Budapest

More information

Loop Tiling for Parallelism

Loop Tiling for Parallelism Loop Tiling for Parallelism THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE LOOP TILING FOR PARALLELISM JINGLING XUE School of Computer Science and Engineering The University of New

More information

MINING VERY LARGE DATABASES WITH PARALLEL PROCESSING

MINING VERY LARGE DATABASES WITH PARALLEL PROCESSING MINING VERY LARGE DATABASES WITH PARALLEL PROCESSING The Kluwer International Series on ADVANCES IN DATABASE SYSTEMS Series Editor Ahmed K. Elmagarmid Purdue University West Lafayette, IN 47907 Other books

More information

Computer Networks and Systems

Computer Networks and Systems Computer Networks and Systems Queueing Theory and Performance Evaluation Third Edition Springer Science+Business Media, LLC Thomas G. Robertazzi Computer Networks and Systems Queueing Theory and Performance

More information

LOGICAL DATA MODELING

LOGICAL DATA MODELING LOGICAL DATA MODELING INTEGRATED SERIES IN INFORMATION SYSTEMS Professor Ramesh Sharda Oklahoma State University Series Editors Prof. Dr. Stefan VoB Universitat Hamburg Expository and Research Monographs

More information

MULTIMEDIA DATABASE MANAGEMENT SYSTEMS

MULTIMEDIA DATABASE MANAGEMENT SYSTEMS MULTIMEDIA DATABASE MANAGEMENT SYSTEMS THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE MULTIMEDIA SYSTEMS AND APPLICATIONS Recently Published Titles: Consulting Editor Borko Furht Florida

More information

Data Structures. Notes for Lecture 14 Techniques of Data Mining By Samaher Hussein Ali Association Rules: Basic Concepts and Application

Data Structures. Notes for Lecture 14 Techniques of Data Mining By Samaher Hussein Ali Association Rules: Basic Concepts and Application Data Structures Notes for Lecture 14 Techniques of Data Mining By Samaher Hussein Ali 2009-2010 Association Rules: Basic Concepts and Application 1. Association rules: Given a set of transactions, find

More information

Scheduling in Distributed Computing Systems Analysis, Design & Models

Scheduling in Distributed Computing Systems Analysis, Design & Models Scheduling in Distributed Computing Systems Analysis, Design & Models (A Research Monograph) Scheduling in Distributed Computing Systems Analysis, Design & Models (A Research Monograph) by Deo Prakash

More information

Modeling and Simulation in Scilab/Scicos with ScicosLab 4.4

Modeling and Simulation in Scilab/Scicos with ScicosLab 4.4 Modeling and Simulation in Scilab/Scicos with ScicosLab 4.4 Stephen L. Campbell, Jean-Philippe Chancelier and Ramine Nikoukhah Modeling and Simulation in Scilab/Scicos with ScicosLab 4.4 Second Edition

More information

Part I: Data Mining Foundations

Part I: Data Mining Foundations Table of Contents 1. Introduction 1 1.1. What is the World Wide Web? 1 1.2. A Brief History of the Web and the Internet 2 1.3. Web Data Mining 4 1.3.1. What is Data Mining? 6 1.3.2. What is Web Mining?

More information

Graphics Programming in c++

Graphics Programming in c++ Graphics Programming in c++ Springer London Berlin Heidelberg New York Barcelona Budapest Hong Kong Milan Paris Santa Clara Singapore Tokyo Mark Walmsley Graphics Programming in c++ Writing Graphics Applications

More information

(All chapters begin with an Introduction end with a Summary, Exercises, and Reference and Bibliography) Preliminaries An Overview of Database

(All chapters begin with an Introduction end with a Summary, Exercises, and Reference and Bibliography) Preliminaries An Overview of Database (All chapters begin with an Introduction end with a Summary, Exercises, and Reference and Bibliography) Preliminaries An Overview of Database Management What is a database system? What is a database? Why

More information

THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE

THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE ONTOLOGY LEARNING FOR THE SEMANTIC WEB ONTOLOGY LEARNING FOR THE SEMANTIC WEB by Alexander Maedche University of Karlsruhe, Germany SPRINGER

More information

A Structured Programming Approach to Data

A Structured Programming Approach to Data A Structured Programming Approach to Data Macmillan Computer Science Series Consulting Editor: Professor F. H. Sumner, University of Manchester J. K. Buckle, The ICL 2900 Series Andrew J. T. Colin, Programming

More information

Research on Industrial Security Theory

Research on Industrial Security Theory Research on Industrial Security Theory Menggang Li Research on Industrial Security Theory Menggang Li China Centre for Industrial Security Research Beijing, People s Republic of China ISBN 978-3-642-36951-3

More information

Software Development for SAP R/3

Software Development for SAP R/3 Software Development for SAP R/3 Springer-Verlag Berlin Heidelberg GmbH Ulrich Mende Software Development for SAP R/3 Data Dictionary, ABAP/4, Interfaces With Diskette With 124 Figures and Many Example

More information

Functional Programming in R

Functional Programming in R Functional Programming in R Advanced Statistical Programming for Data Science, Analysis and Finance Thomas Mailund Functional Programming in R: Advanced Statistical Programming for Data Science, Analysis

More information

CMPUT 391 Database Management Systems. Data Mining. Textbook: Chapter (without 17.10)

CMPUT 391 Database Management Systems. Data Mining. Textbook: Chapter (without 17.10) CMPUT 391 Database Management Systems Data Mining Textbook: Chapter 17.7-17.11 (without 17.10) University of Alberta 1 Overview Motivation KDD and Data Mining Association Rules Clustering Classification

More information

The MATLAB 5 Handbook

The MATLAB 5 Handbook The MATLAB 5 Handbook Springer New York Berlin Heidelberg Barcelona Budapest Hong Kong London Milan Paris Singapore Tokyo Darren Redfern Colin Campbell The MATLAB 5 Handbook Springer Darren Redfern Practical

More information

Lecture Topic Projects 1 Intro, schedule, and logistics 2 Data Science components and tasks 3 Data types Project #1 out 4 Introduction to R,

Lecture Topic Projects 1 Intro, schedule, and logistics 2 Data Science components and tasks 3 Data types Project #1 out 4 Introduction to R, Lecture Topic Projects 1 Intro, schedule, and logistics 2 Data Science components and tasks 3 Data types Project #1 out 4 Introduction to R, statistics foundations 5 Introduction to D3, visual analytics

More information

Java Quick Syntax Reference. Second Edition. Mikael Olsson

Java Quick Syntax Reference. Second Edition. Mikael Olsson Java Quick Syntax Reference Second Edition Mikael Olsson Java Quick Syntax Reference Second Edition Mikael Olsson Java Quick Syntax Reference Mikael Olsson Hammarland, Länsi-Suomi, Finland ISBN-13 (pbk):

More information

Visualization in Supercomputing

Visualization in Supercomputing Visualization in Supercomputing Raul H. Mendez Editor Visualization in Supercomputing With 166 Illustrations, 25 in Color Springer-Verlag New York Berlin Heidelberg London Paris Tokyo Hong Kong Raul H.

More information

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p.

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. 6 What is Web Mining? p. 6 Summary of Chapters p. 8 How

More information

Open Geometry: OpenGL + Advanced Geometry

Open Geometry: OpenGL + Advanced Geometry Open Geometry: OpenGL + Advanced Geometry Springer-Science+Business Media, LLC Open Geometry: OpenGL + Advanced Geometry Georg Glaeser Hellmuth Stachel Springer Georg G laeser University of Applied Arts,

More information

Advanced Data Mining Techniques

Advanced Data Mining Techniques Advanced Data Mining Techniques David L. Olson Dursun Delen Advanced Data Mining Techniques Dr. David L. Olson Department of Management Science University of Nebraska Lincoln, NE 68588-0491 USA dolson3@unl.edu

More information

Chapter 4: Mining Frequent Patterns, Associations and Correlations

Chapter 4: Mining Frequent Patterns, Associations and Correlations Chapter 4: Mining Frequent Patterns, Associations and Correlations 4.1 Basic Concepts 4.2 Frequent Itemset Mining Methods 4.3 Which Patterns Are Interesting? Pattern Evaluation Methods 4.4 Summary Frequent

More information

PESIT- Bangalore South Campus Hosur Road (1km Before Electronic city) Bangalore

PESIT- Bangalore South Campus Hosur Road (1km Before Electronic city) Bangalore Data Warehousing Data Mining (17MCA442) 1. GENERAL INFORMATION: PESIT- Bangalore South Campus Hosur Road (1km Before Electronic city) Bangalore 560 100 Department of MCA COURSE INFORMATION SHEET Academic

More information

Stefan Waldmann. Topology. An Introduction

Stefan Waldmann. Topology. An Introduction Topology Stefan Waldmann Topology An Introduction 123 Stefan Waldmann Julius Maximilian University of Würzburg Würzburg Germany ISBN 978-3-319-09679-7 ISBN 978-3-319-09680-3 (ebook) DOI 10.1007/978-3-319-09680-3

More information

Communication Complexity and Parallel Computing

Communication Complexity and Parallel Computing Juraj Hromkovic Communication Complexity and Parallel Computing With 40 Figures Springer Table of Contents 1 Introduction 1 1.1 Motivation and Aims 1 1.2 Concept and Organization 4 1.3 How to Read the

More information

Algorithms for Discrete Fourier Transform and Convolution

Algorithms for Discrete Fourier Transform and Convolution Algorithms for Discrete Fourier Transform and Convolution Second Edition Springer Science+Business Media, LLC Signal Processing and Digital Filtering Synthetic Aperture Radar J.P. Fitch Multiplicative

More information

Preface. and Its Applications 81, ISBN , doi: / , Springer Science+Business Media New York, 2013.

Preface. and Its Applications 81, ISBN , doi: / , Springer Science+Business Media New York, 2013. Preface This book is for all those interested in using the GAMS technology for modeling and solving complex, large-scale, continuous nonlinear optimization problems or applications. Mainly, it is a continuation

More information

SYNTHESIS OF FINITE STATE MACHINES: LOGIC OPTIMIZATION

SYNTHESIS OF FINITE STATE MACHINES: LOGIC OPTIMIZATION SYNTHESIS OF FINITE STATE MACHINES: LOGIC OPTIMIZATION SYNTHESIS OF FINITE STATE MACHINES: LOGIC OPTIMIZATION Tiziano Villa University of California/Berkeley Timothy Kam Intel Corporation Robert K. Brayton

More information

Database Performance Tuning and Optimization. Using Oracle

Database Performance Tuning and Optimization. Using Oracle Database Performance Tuning and Optimization Using Oracle Springer New York Berlin Heidelberg Hong Kong London Milan Paris Tokyo Sitansu S. Mittra Database Performance Tuning and Optimization Using Oracle

More information

High-Performance Parallel Database Processing and Grid Databases

High-Performance Parallel Database Processing and Grid Databases High-Performance Parallel Database Processing and Grid Databases David Taniar Monash University, Australia Clement H.C. Leung Hong Kong Baptist University and Victoria University, Australia Wenny Rahayu

More information

INTRUSION DETECTION IN DISTRIBUTED SYSTEMS An Abstraction-Based Approach

INTRUSION DETECTION IN DISTRIBUTED SYSTEMS An Abstraction-Based Approach INTRUSION DETECTION IN DISTRIBUTED SYSTEMS An Abstraction-Based Approach Library of Congress Cataloging-in-Publication ISBN 978-1-4613-5091-0 ISBN 978-1-4615-0467-2 (ebook) DOI 10.1007/978-1-4615-0467-2

More information

The Discovery and Retrieval of Temporal Rules in Interval Sequence Data

The Discovery and Retrieval of Temporal Rules in Interval Sequence Data The Discovery and Retrieval of Temporal Rules in Interval Sequence Data by Edi Winarko, B.Sc., M.Sc. School of Informatics and Engineering, Faculty of Science and Engineering March 19, 2007 A thesis presented

More information

SCHEME OF COURSE WORK. Data Warehousing and Data mining

SCHEME OF COURSE WORK. Data Warehousing and Data mining SCHEME OF COURSE WORK Course Details: Course Title Course Code Program: Specialization: Semester Prerequisites Department of Information Technology Data Warehousing and Data mining : 15CT1132 : B.TECH

More information

LEGITIMATE APPLICATIONS OF PEER-TO-PEER NETWORKS

LEGITIMATE APPLICATIONS OF PEER-TO-PEER NETWORKS LEGITIMATE APPLICATIONS OF PEER-TO-PEER NETWORKS DINESH C. VERMA IBM T. J. Watson Research Center A JOHN WILEY & SONS, INC., PUBLICATION LEGITIMATE APPLICATIONS OF PEER-TO-PEER NETWORKS LEGITIMATE APPLICATIONS

More information

INTELLIGENT SUPERMARKET USING APRIORI

INTELLIGENT SUPERMARKET USING APRIORI INTELLIGENT SUPERMARKET USING APRIORI Kasturi Medhekar 1, Arpita Mishra 2, Needhi Kore 3, Nilesh Dave 4 1,2,3,4Student, 3 rd year Diploma, Computer Engineering Department, Thakur Polytechnic, Mumbai, Maharashtra,

More information

Relational Database Index Design and the Optimizers

Relational Database Index Design and the Optimizers Relational Database Index Design and the Optimizers DB2, Oracle, SQL Server, et al. Tapio Lahdenmäki Michael Leach A JOHN WILEY & SONS, INC., PUBLICATION Relational Database Index Design and the Optimizers

More information

Chunjie Duan Brock J. LaMeres Sunil P. Khatri. On and Off-Chip Crosstalk Avoidance in VLSI Design

Chunjie Duan Brock J. LaMeres Sunil P. Khatri. On and Off-Chip Crosstalk Avoidance in VLSI Design Chunjie Duan Brock J. LaMeres Sunil P. Khatri On and Off-Chip Crosstalk Avoidance in VLSI Design 123 On and Off-Chip Crosstalk Avoidance in VLSI Design Chunjie Duan Brock J. LaMeres Sunil P. Khatri On

More information

JavaScript Essentials for SAP ABAP Developers

JavaScript Essentials for SAP ABAP Developers JavaScript Essentials for SAP ABAP Developers A Guide to Mobile and Desktop Application Development Rehan Zaidi JavaScript Essentials for SAP ABAP Developers: A Guide to Mobile and Desktop Application

More information

Energy Efficient Microprocessor Design

Energy Efficient Microprocessor Design Energy Efficient Microprocessor Design Energy Efficient Microprocessor Design by Thomas D. Burd Robert W. Brodersen with Contributions Irom Trevor Pering Anthony Stratakos Berkeley Wireless Research Center

More information

Privacy-Preserving. Introduction to. Data Publishing. Concepts and Techniques. Benjamin C. M. Fung, Ke Wang, Chapman & Hall/CRC. S.

Privacy-Preserving. Introduction to. Data Publishing. Concepts and Techniques. Benjamin C. M. Fung, Ke Wang, Chapman & Hall/CRC. S. Chapman & Hall/CRC Data Mining and Knowledge Discovery Series Introduction to Privacy-Preserving Data Publishing Concepts and Techniques Benjamin C M Fung, Ke Wang, Ada Wai-Chee Fu, and Philip S Yu CRC

More information

WIRELESS ATM AND AD-HOC NETWORKS. Protocols and Architectures

WIRELESS ATM AND AD-HOC NETWORKS. Protocols and Architectures WIRELESS ATM AND AD-HOC NETWORKS Protocols and Architectures WIRELESS ATM AND AD-HOC NETWORKS Protocols and Architectures C-K Toh, Ph.D. University of Cambridge Cambridge, United Kingdom SPRINGER-SCIENCE+BUSINESS

More information

A mining method for tracking changes in temporal association rules from an encoded database

A mining method for tracking changes in temporal association rules from an encoded database A mining method for tracking changes in temporal association rules from an encoded database Chelliah Balasubramanian *, Karuppaswamy Duraiswamy ** K.S.Rangasamy College of Technology, Tiruchengode, Tamil

More information

SOFTWARE-IMPLEMENTED HARDWARE FAULT TOLERANCE

SOFTWARE-IMPLEMENTED HARDWARE FAULT TOLERANCE SOFTWARE-IMPLEMENTED HARDWARE FAULT TOLERANCE SOFTWARE-IMPLEMENTED HARDWARE FAULT TOLERANCE O. Goloubeva, M. Rebaudengo, M. Sonza Reorda, and M. Violante Politecnico di Torino - Dipartimento di Automatica

More information

Chapter 6: Basic Concepts: Association Rules. Basic Concepts: Frequent Patterns. (absolute) support, or, support. (relative) support, s, is the

Chapter 6: Basic Concepts: Association Rules. Basic Concepts: Frequent Patterns. (absolute) support, or, support. (relative) support, s, is the Chapter 6: What Is Frequent ent Pattern Analysis? Frequent pattern: a pattern (a set of items, subsequences, substructures, etc) that occurs frequently in a data set frequent itemsets and association rule

More information

Stock Message Boards

Stock Message Boards Stock Message Boards This page intentionally left blank Stock Message Boards A Quantitative Approach to Measuring Investor Sentiment Ying Zhang STOCK MESSAGE BOARDS Copyright Ying Zhang, 2014. Softcover

More information

Whitestein Series in software Agent Technologies. About whitestein Technologies

Whitestein Series in software Agent Technologies. About whitestein Technologies Whitestein Series in software Agent Technologies Series Editors: Marius Walliser Stefan Brantschen Monique Calisti Thomas Hempfling This series reports new developments in agent-based software technologies

More information

Computer-Aided Design in Magnetics

Computer-Aided Design in Magnetics Computer-Aided Design in Magnetics D. A. Lowther P. P. Silvester Computer-Aided Design in Magnetics With 84 illustrations Springer-Verlag Berlin Heidelberg New York Tokyo D. A. Lowther Associate Professor

More information

This content has been downloaded from IOPscience. Please scroll down to see the full text.

This content has been downloaded from IOPscience. Please scroll down to see the full text. This content has been downloaded from IOPscience. Please scroll down to see the full text. Download details: IP Address: 148.251.232.83 This content was downloaded on 22/11/2018 at 08:50 Please note that

More information

IMPLEMENTATION AND COMPARATIVE STUDY OF IMPROVED APRIORI ALGORITHM FOR ASSOCIATION PATTERN MINING

IMPLEMENTATION AND COMPARATIVE STUDY OF IMPROVED APRIORI ALGORITHM FOR ASSOCIATION PATTERN MINING IMPLEMENTATION AND COMPARATIVE STUDY OF IMPROVED APRIORI ALGORITHM FOR ASSOCIATION PATTERN MINING 1 SONALI SONKUSARE, 2 JAYESH SURANA 1,2 Information Technology, R.G.P.V., Bhopal Shri Vaishnav Institute

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to JULY 2011 Afsaneh Yazdani What motivated? Wide availability of huge amounts of data and the imminent need for turning such data into useful information and knowledge What motivated? Data

More information

Infrequent Weighted Itemset Mining Using SVM Classifier in Transaction Dataset

Infrequent Weighted Itemset Mining Using SVM Classifier in Transaction Dataset Infrequent Weighted Itemset Mining Using SVM Classifier in Transaction Dataset M.Hamsathvani 1, D.Rajeswari 2 M.E, R.Kalaiselvi 3 1 PG Scholar(M.E), Angel College of Engineering and Technology, Tiruppur,

More information

Enumerating Pseudo-Intents in a Partial Order

Enumerating Pseudo-Intents in a Partial Order Enumerating Pseudo-Intents in a Partial Order Alexandre Bazin and Jean-Gabriel Ganascia Université Pierre et Marie Curie, Laboratoire d Informatique de Paris 6 Paris, France Alexandre.Bazin@lip6.fr Jean-Gabriel@Ganascia.name

More information

A Roadmap to an Enhanced Graph Based Data mining Approach for Multi-Relational Data mining

A Roadmap to an Enhanced Graph Based Data mining Approach for Multi-Relational Data mining A Roadmap to an Enhanced Graph Based Data mining Approach for Multi-Relational Data mining D.Kavinya 1 Student, Department of CSE, K.S.Rangasamy College of Technology, Tiruchengode, Tamil Nadu, India 1

More information

Project Participants

Project Participants Annual Report for Period:10/2004-10/2005 Submitted on: 06/21/2005 Principal Investigator: Yang, Li. Award ID: 0414857 Organization: Western Michigan Univ Title: Projection and Interactive Exploration of

More information

A Study on Association Rule Mining Using ACO Algorithm for Generating Optimized ResultSet

A Study on Association Rule Mining Using ACO Algorithm for Generating Optimized ResultSet Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 2, Issue. 11, November 2013,

More information

Table Of Contents: xix Foreword to Second Edition

Table Of Contents: xix Foreword to Second Edition Data Mining : Concepts and Techniques Table Of Contents: Foreword xix Foreword to Second Edition xxi Preface xxiii Acknowledgments xxxi About the Authors xxxv Chapter 1 Introduction 1 (38) 1.1 Why Data

More information

LEGITIMATE APPLICATIONS OF PEER-TO-PEER NETWORKS DINESH C. VERMA IBM T. J. Watson Research Center A JOHN WILEY & SONS, INC., PUBLICATION

LEGITIMATE APPLICATIONS OF PEER-TO-PEER NETWORKS DINESH C. VERMA IBM T. J. Watson Research Center A JOHN WILEY & SONS, INC., PUBLICATION LEGITIMATE APPLICATIONS OF PEER-TO-PEER NETWORKS DINESH C. VERMA IBM T. J. Watson Research Center A JOHN WILEY & SONS, INC., PUBLICATION LEGITIMATE APPLICATIONS OF PEER-TO-PEER NETWORKS LEGITIMATE APPLICATIONS

More information

Association Pattern Mining. Lijun Zhang

Association Pattern Mining. Lijun Zhang Association Pattern Mining Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction The Frequent Pattern Mining Model Association Rule Generation Framework Frequent Itemset Mining Algorithms

More information

DISCOVERING INFORMATIVE KNOWLEDGE FROM HETEROGENEOUS DATA SOURCES TO DEVELOP EFFECTIVE DATA MINING

DISCOVERING INFORMATIVE KNOWLEDGE FROM HETEROGENEOUS DATA SOURCES TO DEVELOP EFFECTIVE DATA MINING DISCOVERING INFORMATIVE KNOWLEDGE FROM HETEROGENEOUS DATA SOURCES TO DEVELOP EFFECTIVE DATA MINING Ms. Pooja Bhise 1, Prof. Mrs. Vidya Bharde 2 and Prof. Manoj Patil 3 1 PG Student, 2 Professor, Department

More information

Evaluation of Seed Selection Strategies for Vehicle to Vehicle Epidemic Information Dissemination

Evaluation of Seed Selection Strategies for Vehicle to Vehicle Epidemic Information Dissemination Evaluation of Seed Selection Strategies for Vehicle to Vehicle Epidemic Information Dissemination Richard Kershaw and Bhaskar Krishnamachari Ming Hsieh Department of Electrical Engineering, Viterbi School

More information

Data Modeling: Beginning and Advanced HDT825 Five Days

Data Modeling: Beginning and Advanced HDT825 Five Days Five Days Prerequisites Students should have experience designing databases. Who Should Attend This course is targeted at database designers, data modelers, database analysts, and anyone else who needs

More information

Distance-based Outlier Detection: Consolidation and Renewed Bearing

Distance-based Outlier Detection: Consolidation and Renewed Bearing Distance-based Outlier Detection: Consolidation and Renewed Bearing Gustavo. H. Orair, Carlos H. C. Teixeira, Wagner Meira Jr., Ye Wang, Srinivasan Parthasarathy September 15, 2010 Table of contents Introduction

More information

Contents. Foreword to Second Edition. Acknowledgments About the Authors

Contents. Foreword to Second Edition. Acknowledgments About the Authors Contents Foreword xix Foreword to Second Edition xxi Preface xxiii Acknowledgments About the Authors xxxi xxxv Chapter 1 Introduction 1 1.1 Why Data Mining? 1 1.1.1 Moving toward the Information Age 1

More information

Frequent Pattern Mining. Based on: Introduction to Data Mining by Tan, Steinbach, Kumar

Frequent Pattern Mining. Based on: Introduction to Data Mining by Tan, Steinbach, Kumar Frequent Pattern Mining Based on: Introduction to Data Mining by Tan, Steinbach, Kumar Item sets A New Type of Data Some notation: All possible items: Database: T is a bag of transactions Transaction transaction

More information

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. Springer

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. Springer Bing Liu Web Data Mining Exploring Hyperlinks, Contents, and Usage Data With 177 Figures Springer Table of Contents 1. Introduction 1 1.1. What is the World Wide Web? 1 1.2. A Brief History of the Web

More information

Essential Angular for ASP.NET Core MVC

Essential Angular for ASP.NET Core MVC Essential Angular for ASP.NET Core MVC Adam Freeman Essential Angular for ASP.NET Core MVC Adam Freeman London, UK ISBN-13 (pbk): 978-1-4842-2915-6 ISBN-13 (electronic): 978-1-4842-2916-3 DOI 10.1007/978-1-4842-2916-3

More information

Performance Based Study of Association Rule Algorithms On Voter DB

Performance Based Study of Association Rule Algorithms On Voter DB Performance Based Study of Association Rule Algorithms On Voter DB K.Padmavathi 1, R.Aruna Kirithika 2 1 Department of BCA, St.Joseph s College, Thiruvalluvar University, Cuddalore, Tamil Nadu, India,

More information

FINITE FIELDS FOR COMPUTER SCIENTISTS AND ENGINEERS

FINITE FIELDS FOR COMPUTER SCIENTISTS AND ENGINEERS FINITE FIELDS FOR COMPUTER SCIENTISTS AND ENGINEERS THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE INFORMATION THEORY Consulting Editor Robert G. Gallager FINITE FIELDS FOR COMPUTER

More information

Lectures for the course: Data Warehousing and Data Mining (IT 60107)

Lectures for the course: Data Warehousing and Data Mining (IT 60107) Lectures for the course: Data Warehousing and Data Mining (IT 60107) Week 1 Lecture 1 21/07/2011 Introduction to the course Pre-requisite Expectations Evaluation Guideline Term Paper and Term Project Guideline

More information

The Information Retrieval Series. Series Editor W. Bruce Croft

The Information Retrieval Series. Series Editor W. Bruce Croft The Information Retrieval Series Series Editor W. Bruce Croft Sándor Dominich The Modern Algebra of Information Retrieval 123 Sándor Dominich Computer Science Department University of Pannonia Egyetem

More information

Comparison of FP tree and Apriori Algorithm

Comparison of FP tree and Apriori Algorithm International Journal of Engineering Research and Development e-issn: 2278-067X, p-issn: 2278-800X, www.ijerd.com Volume 10, Issue 6 (June 2014), PP.78-82 Comparison of FP tree and Apriori Algorithm Prashasti

More information

Groupware and the World Wide Web

Groupware and the World Wide Web Groupware and the World Wide Web Edited by Richard Bentley, Uwe Busbach, David Kerr & Klaas Sikkel German National Research Center for Information Technology, Institutefor Applied Information Technology

More information

Digital System Test and Testable Design

Digital System Test and Testable Design Digital System Test and Testable Design wwwwwwwwwwww Zainalabedin Navabi Digital System Test and Testable Design Using HDL Models and Architectures Zainalabedin Navabi Worcester Polytechnic Institute Department

More information

FUZZY LOGIC WITH ENGINEERING APPLICATIONS

FUZZY LOGIC WITH ENGINEERING APPLICATIONS FUZZY LOGIC WITH ENGINEERING APPLICATIONS Third Edition Timothy J. Ross University of New Mexico, USA A John Wiley and Sons, Ltd., Publication FUZZY LOGIC WITH ENGINEERING APPLICATIONS Third Edition FUZZY

More information

Measuring and Evaluating Dissimilarity in Data and Pattern Spaces

Measuring and Evaluating Dissimilarity in Data and Pattern Spaces Measuring and Evaluating Dissimilarity in Data and Pattern Spaces Irene Ntoutsi, Yannis Theodoridis Database Group, Information Systems Laboratory Department of Informatics, University of Piraeus, Greece

More information

Computer Science Workbench. Editor: Tosiyasu L. Kunii

Computer Science Workbench. Editor: Tosiyasu L. Kunii Computer Science Workbench Editor: Tosiyasu L. Kunii H. Kitagawa T.L. Kunii The U nnortnalized Relational Data Model F or Office Form Processor Design With 78 Figures Springer-Verlag Tokyo Berlin Heidelberg

More information

BIG CPU, BIG DATA. Solving the World s Toughest Computational Problems with Parallel Computing. Second Edition. Alan Kaminsky

BIG CPU, BIG DATA. Solving the World s Toughest Computational Problems with Parallel Computing. Second Edition. Alan Kaminsky Solving the World s Toughest Computational Problems with Parallel Computing Second Edition Alan Kaminsky Department of Computer Science B. Thomas Golisano College of Computing and Information Sciences

More information

University of Florida CISE department Gator Engineering. Data Preprocessing. Dr. Sanjay Ranka

University of Florida CISE department Gator Engineering. Data Preprocessing. Dr. Sanjay Ranka Data Preprocessing Dr. Sanjay Ranka Professor Computer and Information Science and Engineering University of Florida, Gainesville ranka@cise.ufl.edu Data Preprocessing What preprocessing step can or should

More information

This tutorial has been prepared for computer science graduates to help them understand the basic-to-advanced concepts related to data mining.

This tutorial has been prepared for computer science graduates to help them understand the basic-to-advanced concepts related to data mining. About the Tutorial Data Mining is defined as the procedure of extracting information from huge sets of data. In other words, we can say that data mining is mining knowledge from data. The tutorial starts

More information

Philip Andrew Simpson. FPGA Design. Best Practices for Team-based Reuse. Second Edition

Philip Andrew Simpson. FPGA Design. Best Practices for Team-based Reuse. Second Edition FPGA Design Philip Andrew Simpson FPGA Design Best Practices for Team-based Reuse Second Edition Philip Andrew Simpson San Jose, CA, USA ISBN 978-3-319-17923-0 DOI 10.1007/978-3-319-17924-7 ISBN 978-3-319-17924-7

More information

Network Performance Analysis

Network Performance Analysis Network Performance Analysis Network Performance Analysis Thomas Bonald Mathieu Feuillet Series Editor Pierre-Noël Favennec First published 2011 in Great Britain and the United States by ISTE Ltd and

More information

Data Preprocessing. Data Preprocessing

Data Preprocessing. Data Preprocessing Data Preprocessing Dr. Sanjay Ranka Professor Computer and Information Science and Engineering University of Florida, Gainesville ranka@cise.ufl.edu Data Preprocessing What preprocessing step can or should

More information

A Parallel Evolutionary Algorithm for Discovery of Decision Rules

A Parallel Evolutionary Algorithm for Discovery of Decision Rules A Parallel Evolutionary Algorithm for Discovery of Decision Rules Wojciech Kwedlo Faculty of Computer Science Technical University of Bia lystok Wiejska 45a, 15-351 Bia lystok, Poland wkwedlo@ii.pb.bialystok.pl

More information

A PROPOSED HYBRID BOOK RECOMMENDER SYSTEM

A PROPOSED HYBRID BOOK RECOMMENDER SYSTEM A PROPOSED HYBRID BOOK RECOMMENDER SYSTEM SUHAS PATIL [M.Tech Scholar, Department Of Computer Science &Engineering, RKDF IST, Bhopal, RGPV University, India] Dr.Varsha Namdeo [Assistant Professor, Department

More information

Materialized Data Mining Views *

Materialized Data Mining Views * Materialized Data Mining Views * Tadeusz Morzy, Marek Wojciechowski, Maciej Zakrzewicz Poznan University of Technology Institute of Computing Science ul. Piotrowo 3a, 60-965 Poznan, Poland tel. +48 61

More information