Mining Quantitative Association Rules on Overlapped Intervals



Qiang Tong 1,3, Baoping Yan 2, and Yuanchun Zhou 1,3

1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China {tongqiang, yczhou}
2 Computer Network Information Center, Chinese Academy of Sciences, Beijing, China
3 Graduate School of the Chinese Academy of Sciences, Beijing, China

X. Li, S. Wang, and Z.Y. Dong (Eds.): ADMA 2005, LNAI 3584, © Springer-Verlag Berlin Heidelberg 2005

Abstract. Mining association rules is an important problem in data mining. Algorithms for mining boolean data have been well studied and documented, but they cannot deal with quantitative and categorical data directly. For quantitative attributes, the general idea is to partition the domain of each attribute into intervals and to apply boolean algorithms to the intervals. However, the minimum support problem and the minimum confidence problem pull in opposite directions, and existing partitioning methods cannot avoid this conflict. Moreover, we expect the intervals to be meaningful. Clustering in data mining is a discovery process that groups a set of data such that the intracluster similarity is maximized and the intercluster similarity is minimized. The discovered clusters are used to explain the characteristics of the data distribution. This paper proposes a novel method that finds quantitative association rules by clustering the transactions of a database and projecting the clusters into the domains of the quantitative attributes to form meaningful intervals, which may overlap. Experimental results show that our approach can efficiently find quantitative association rules, and can find important association rules that may be missed by previous algorithms.

1 Introduction

Mining association rules is a key data mining problem and has been widely studied [1]. Finding association rules in binary data has been well investigated and documented [2, 3, 4]. Finding association rules in numeric or categorical data is harder. However, many real-world databases contain quantitative attributes, and current solutions for this case are so far inadequate.

An association rule is a rule of the form $X \Rightarrow Y$, where $X$ and $Y$ are sets of items. It states that when $X$ occurs in a database, so does $Y$ with a certain probability. $X$ is called the antecedent of the rule and $Y$ the consequent. There are two important parameters associated with an association rule: support and confidence. Support describes the importance of the rule, while confidence determines the occurrence probability of the rule. The most difficult part of an association rule mining algorithm is finding the frequent itemsets, a process governed by the support threshold chosen by the user.

A well-known application of association rules is market basket data analysis, which was introduced by Agrawal et al. in 1993 [2]. In market basket analysis the data are boolean, taking the values 1 or 0, and the classical association rule mining algorithms are designed for boolean data. However, quantitative and categorical attributes are widespread in current databases. In [5], Srikant and Agrawal proposed an algorithm that handles quantitative attributes by dividing them into equi-depth intervals and then combining adjacent partitions when necessary. In other words, for a depth d, the first d values of the attribute are placed in one interval, the next d in a second interval, and so on.

Current methods of partitioning into intervals face two problems: MinSup and MinConf [5]. If a quantitative attribute is divided into too many intervals, the support for a single interval can be low. When the support of an interval is below the minimum support, some rules involving the attribute may not be found. This is the minimum support problem. Conversely, some rules may reach the minimum confidence only when a small interval is in the antecedent, and the information loss increases as the interval size becomes larger. This is the minimum confidence problem.

The critical part of mining quantitative association rules is dividing the domains of the quantitative attributes into intervals. There are several classical dividing methods. The equi-width method divides the domain of a quantitative attribute into n intervals of the same length. In the equi-depth method, each interval contains an equal number of items. Both methods are so straightforward that the resulting partitions may not be meaningful, and neither can deal with the minimum confidence problem. In [5], Srikant and Agrawal introduced a measure of partial completeness, which quantifies the information lost due to partitioning, and developed an algorithm to partition quantitative attributes.

In [6], Miller and Yang pointed out the pitfalls of the equi-depth method and presented several guiding principles for quantitative attribute partitioning. In selecting intervals or groups of data to consider, they wanted a measure of interval quality that reflects the distance among data points. They took the distance among data into account because they believed that putting closer data together is more meaningful. To achieve this goal, they presented a more general form of an association rule, and used clustering to find subsets that make sense by containing a set of attribute values that are close enough. They proposed an algorithm that uses BIRCH [7] to find clusters in the quantitative attributes, forms items from the discovered clusters, and feeds those items into the classical boolean algorithm Apriori [3]. In their algorithm, clustering is used to determine sets of dense values in a single attribute, or over a set of attributes, that are to be treated as a whole.
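For reference, the two classical partitioning schemes described in this section, equi-width and equi-depth, can be sketched as follows. This is only an illustration under stated assumptions: the function names `equi_width` and `equi_depth` and the `ages` values are hypothetical, and the merging of adjacent partitions used in [5] is not included.

```python
def equi_width(values, n):
    """Split [min, max] of the attribute into n intervals of equal length."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n
    return [(lo + i * width, lo + (i + 1) * width) for i in range(n)]


def equi_depth(values, depth):
    """Put the first `depth` sorted values in one interval, the next `depth` in a second, and so on."""
    ordered = sorted(values)
    chunks = [ordered[i:i + depth] for i in range(0, len(ordered), depth)]
    return [(chunk[0], chunk[-1]) for chunk in chunks]


# Hypothetical ages: a tight group of young people plus a few far-apart older ones.
ages = [20, 21, 22, 23, 24, 25, 60, 61, 90]
width_intervals = equi_width(ages, 3)
depth_intervals = equi_depth(ages, 3)
```

On this toy data the equi-depth split places 60, 61 and 90 in one interval even though 90 is far from the other two, which is the kind of pitfall that motivates distance-aware partitioning.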

Although Miller and Yang took the distance among data into account and used a clustering method to make the intervals of quantitative attributes more meaningful, they clustered a quantitative attribute, or a set of quantitative attributes, alone, and so did not take the relations with the other attributes into account. We believe that their technique still falls short of a desirable goal.

Based on the above analysis, we find that the partitioning method can be further improved. On the one hand, since clustering an attribute or a set of attributes alone is not good enough, we believe that the relations among attributes should be taken into account. We therefore cluster all attributes together, and project the clusters into the domains of the quantitative attributes. On the other hand, the projections of the clusters on a specific attribute can overlap. We think this is reasonable. Moreover, it resolves the conflict between the minimum support problem and the minimum confidence problem. A small interval may cause the minimum support problem, while a large interval may cause the minimum confidence problem. When several overlapped intervals coexist in the domain of a quantitative attribute, and different intervals are used for different rules, the conflict that complicates the partitioning of quantitative attributes no longer arises.

In this paper, we propose an approach that first applies a clustering algorithm to all attributes, then projects the discovered clusters into the domains of all attributes to form intervals (which may overlap), and finally uses a boolean association rule mining algorithm to find association rules.

The rest of the paper is organized as follows. In Section 2, we introduce some definitions of the quantitative association rule mining problem and briefly review the background. In Section 3, we present our approach and our algorithm. In Section 4, we give the experimental results and our analysis. Finally, in Section 5, we give the conclusions and future work.

2 Problem Description

We now give a formal statement of the problem of mining quantitative association rules and introduce some definitions.

Let $I = \{i_1, i_2, \ldots, i_n\}$ be a set of attributes and $R$ the set of real numbers. Define $I_R = I \times R \times R$, that is, $I_R = \{(x, l, u) \mid x \in I,\ l \in R,\ u \in R,\ l \le u\}$. A triple $(x, l, u) \in I_R$ denotes either a quantitative attribute $x$ with a value interval $[l, u]$, or a categorical attribute with a value $l$ ($l = u$). Let $D$ be a set of transactions, where each transaction $T$ is a set of attribute values. For $X \subseteq I_R$, if for every $(x, l, u) \in X$ there is a pair $(x, v) \in T$ with $l \le v \le u$, we say that transaction $T$ supports $X$.

A quantitative association rule is an implication of the form $X \Rightarrow Y$, where $X \subset I_R$, $Y \subset I_R$, and $attribute(X) \cap attribute(Y) = \emptyset$. If $s$ percent of the transactions in $D$ support $X \cup Y$, and $c$ percent of the transactions that support $X$ also support $Y$, we say that the association rule has support $s$ and confidence $c$, respectively. The problem of mining quantitative association rules is to find the association rules that meet the minimum support and the minimum confidence in a given transaction database that contains quantitative and/or categorical attributes.
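The definitions of "supports", support and confidence given above can be made concrete with a short sketch. The names `supports` and `support_and_confidence`, the dictionary representation of a transaction, and the toy database `D` are assumptions made for illustration; they do not come from the paper.

```python
def supports(transaction, itemset):
    """True if every (attribute, l, u) triple in itemset is matched by the transaction."""
    return all(attr in transaction and l <= transaction[attr] <= u
               for attr, l, u in itemset)


def support_and_confidence(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent => consequent as fractions."""
    both = antecedent + consequent
    n_x = sum(supports(t, antecedent) for t in transactions)
    n_xy = sum(supports(t, both) for t in transactions)
    support = n_xy / len(transactions)
    confidence = n_xy / n_x if n_x else 0.0
    return support, confidence


# Hypothetical toy database: each transaction maps an attribute to its value.
D = [
    {"age": 25, "weight": 150},
    {"age": 45, "weight": 190},
    {"age": 50, "weight": 200},
    {"age": 60, "weight": 160},
]
X = [("age", 40, 74)]
Y = [("weight", 178, 216)]
s, c = support_and_confidence(D, X, Y)
```

Three of the four toy transactions support X and two of those also support Y, so the rule has support 2/4 and confidence 2/3. Widening or narrowing the interval in X changes both numbers, which is the tension behind the minimum support and minimum confidence problems.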

Clustering can be considered the most important unsupervised learning technique; it deals with finding structure in a collection of unlabeled data. A cluster is a collection of objects that are similar to each other and dissimilar to the objects belonging to other clusters [8]. In this paper, a cluster is a set of transactions.

An important component of a clustering algorithm is the distance measure among data points. In [6], Miller and Yang defined two thresholds, on the cluster size and on the diameter. First, the diameter of a cluster should be less than a specific value to ensure that the cluster is sufficiently dense. Second, the number of transactions contained in a cluster should be greater than the minimum support to ensure that the cluster is frequent. Since our clustering approach is different, our definition of the diameter of a cluster is also different.

Definition 1. We adopt Euclidean distance as the distance metric, and the distance between two transactions is defined as

\[ d(I_i, I_j) = \sqrt{\sum_{k=1}^{n} (I_{ik} - I_{jk})^2} \tag{1} \]

Definition 2. A cluster $C = \{I_1, I_2, \ldots, I_m\}$ is a set of transactions, and the gravity center of $C$ is defined as

\[ C_g = \frac{1}{m} \sum_{i=1}^{m} I_i \tag{2} \]

Definition 3. The diameter of a cluster $C = \{I_1, I_2, \ldots, I_m\}$ is defined as

\[ D_g(C) = \frac{1}{m} \sum_{i=1}^{m} (I_i - C_g)^T (I_i - C_g) \tag{3} \]

Definition 4. The number of transactions contained in a cluster $C$ is denoted $|C|$. Let $d_0$ and $s_0$ be thresholds for association rule mining; $|C|$ and $D_g(C)$ should satisfy

\[ |C| \ge s_0, \quad D_g(C) \le d_0 \tag{4} \]

The first criterion ensures that the cluster contains enough transactions to be frequent. The second criterion ensures that the cluster is dense enough. To ensure that clusters are isolated from each other, we rely on a clustering algorithm to discover clusters that are as isolated as possible.

3 The Proposed Approach

In this section, we describe our approach to mining quantitative association rules.
We divide the problem of mining quantitative association rules into several steps:

1. Map the attributes of the given database to $I_R = I \times R \times R$. For ordered categorical attributes, map the values of the attribute to a set of consecutive integers, such that the order of the values is preserved. For unordered categorical attributes, we define the distance between two different values as a constant. For boolean attributes, map the values to 0 and 1. For quantitative attributes, we keep the original values or transform them to a standard form, such as the Z-score. We adopt various mapping methods to fit the clustering algorithm, and may use different mapping methods for different data sets.

2. Apply a clustering algorithm to the new database produced by the first step. By treating the transactions as n-dimensional vectors, the clustering algorithm takes all attributes into account. In this paper, we adopt the common clustering algorithm k-means to identify transaction groups that are compact (the distance among transactions within a cluster is small) and isolated (relatively separable from other groups). By clustering all attributes together, the relations among all attributes are considered, and the clusters may be more meaningful. We also use Definition 4 as the criterion for evaluating the quality of the discovered clusters.

3. Project the clusters into the domains of the quantitative attributes. The projections of the clusters form overlapped intervals. We make each interval $x \in [l, u]$ a new boolean attribute. A two-dimensional example of the projection is shown in Figure 1.

4. Mine association rules by using a classical boolean algorithm. Since the quantitative attributes have been booleanized, we can use a boolean algorithm (such as Apriori) to find frequent itemsets, and then use the frequent itemsets to generate association rules.

[Figure 1: clusters C1, C2 and C3 plotted over attributes I1 and I2; each cluster projects onto both axes, giving interval endpoints l1, u1, l2, u2, l3, u3 on each axis.]

Fig. 1. Projecting the clusters into the domains of quantitative attributes to form intervals, which may be overlapped
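Steps 2 and 3 above, together with the cluster quality check of Definition 4, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it uses plain Lloyd iterations with caller-supplied initial centers in place of a library k-means, it skips the mapping of step 1, and it reads Definition 3 as the mean squared distance to the gravity center, which is an assumption about the extracted formula. The toy data and the thresholds `s0` and `d0` are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two transactions (Definition 1)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(cluster):
    """Gravity center of a cluster (Definition 2)."""
    m = len(cluster)
    return [sum(col) / m for col in zip(*cluster)]


def diameter(cluster):
    """Mean squared distance to the gravity center (Definition 3, as assumed here)."""
    g = centroid(cluster)
    return sum(distance(t, g) ** 2 for t in cluster) / len(cluster)


def k_means(data, centers, iterations=20):
    """Plain Lloyd iterations starting from caller-supplied initial centers."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for t in data:
            nearest = min(range(len(centers)), key=lambda i: distance(t, centers[i]))
            clusters[nearest].append(t)
        # Keep the previous center if a cluster received no transactions.
        centers = [centroid(c) if c else centers[i] for i, c in enumerate(clusters)]
    return clusters


def project(cluster):
    """Project a cluster onto every attribute: one interval (l, u) per attribute."""
    return [(min(col), max(col)) for col in zip(*cluster)]


# Hypothetical two-attribute transactions forming three groups.
data = [(1, 8), (2, 9), (3, 8), (2, 1), (3, 2), (4, 1), (8, 5), (9, 6), (10, 5)]
clusters = k_means(data, centers=[data[0], data[3], data[6]])

s0, d0 = 3, 2.0  # hypothetical thresholds for Definition 4
intervals = [project(c) for c in clusters if len(c) >= s0 and diameter(c) <= d0]
```

On this toy data the first two clusters project to [1, 3] and [2, 4] on the first attribute, so the resulting intervals overlap, as in the two-dimensional example of Figure 1.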

4 Experimental Results

Our experimental environment is an IBM Netfinity 5600 server with dual PIII 866 CPUs and 512 MB of memory, running Linux. The experiment was carried out on the real bodyfat data set [10]. The attributes in the bodyfat data set are: density, age, weight, height, neck, chest, abdomen, hip, thigh, knee, ankle, biceps, forearm and wrist. All of the attributes are quantitative. The data set contains 252 records of various people. Our purpose is to find association rules over all attributes.

For our algorithm, the parameters needed from the user are the minimum support, the minimum confidence, and the number of clusters. In our experiment, we use a minimum support of 10%, a minimum confidence of 60%, and six clusters. We first use a common clustering algorithm (k-means) to find clusters, then project the clusters into the domains of the quantitative attributes, and finally use a boolean association rule mining algorithm (Apriori) to find association rules. Some of the rules we found are listed in Figure 2.

From these rules, we can see that the intervals are overlapped, which previous partitioning methods cannot produce. The equi-width method cannot divide some quantitative attributes properly (such as density), because their values range over a very small domain, while the equi-depth method may put far-apart transactions into the same interval. As shown in Figure 1, our partitioning method projects the clusters into the domains of the quantitative attributes and forms overlapped intervals. Our method considers both the distance among transactions and the relations among attributes. For previous methods, if an interval is small, it may not meet the minimum support; if an interval is large, it may not meet the minimum confidence. In our method, since the intervals can be overlapped, we can avoid the conflict between the minimum support problem and the minimum confidence problem. Moreover, since our method tends to produce fewer intervals than the previous methods, the boolean association rule mining algorithm works more efficiently.

ID  Rule
1   Age[40, 74] & Weight[178, 216] ⇒ Abdomen[88.7, 113.1]
2   Age[34, 42] & Weight[195.75, ] ⇒ Chest[99.6, 115.6]
3   Weight[219, ] ⇒ Hip[105.5, 147.7] & Chest[108.3, 136.2]
4   Weight[154, 191] & Height[65.5, 77.5] ⇒ Density[1.025, 1.09]
5   Abdomen[88.6, 111.2] & Hip[101.8, 115.5] ⇒ Weight[196, 224]
6   Biceps[24.8, 38.5] & Forearm[22, 34.9] ⇒ Wrist[15.8, 18.5]
7   Thigh[54.7, 69] & Knee[34.2, 42.2] ⇒ Ankle[21.4, 33.9]
8   Weight[118.5, ] & Height[64, 73.5] ⇒ Density[1.047, 1.11]

Fig. 2. Some of the rules discovered by our algorithm with the parameters (minimum support = 10%, minimum confidence = 60%, and the number of clusters k = 6)
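The last stage of the experiment, turning interval membership into boolean items and keeping the rules that meet the two thresholds, can be illustrated as follows. This is a simplified sketch under stated assumptions: a brute-force search over single-item antecedents and consequents stands in for Apriori, the five transactions are invented for illustration and are not taken from the bodyfat data, and the names `booleanize` and `mine_pair_rules` are hypothetical. Only the threshold values (10% and 60%) come from the experiment described above.

```python
from itertools import permutations


def booleanize(transaction, items):
    """Return the set of interval items (attribute, l, u) that the transaction falls into."""
    return {(a, l, u) for a, l, u in items if l <= transaction[a] <= u}


def mine_pair_rules(transactions, items, min_sup, min_conf):
    """Brute-force search for rules {x} => {y} over items on different attributes."""
    rows = [booleanize(t, items) for t in transactions]
    n = len(rows)
    rules = []
    for x, y in permutations(items, 2):
        if x[0] == y[0]:
            continue  # antecedent and consequent must use different attributes
        n_x = sum(x in r for r in rows)
        n_xy = sum(x in r and y in r for r in rows)
        if n_x and n_xy / n >= min_sup and n_xy / n_x >= min_conf:
            rules.append((x, y, n_xy / n, n_xy / n_x))
    return rules


# Invented transactions; the two weight intervals overlap on [178, 191].
transactions = [
    {"weight": 160, "abdomen": 85},
    {"weight": 180, "abdomen": 90},
    {"weight": 200, "abdomen": 100},
    {"weight": 210, "abdomen": 105},
    {"weight": 230, "abdomen": 120},
]
items = [("weight", 154, 191), ("weight", 178, 216), ("abdomen", 88.7, 113.1)]
rules = mine_pair_rules(transactions, items, min_sup=0.10, min_conf=0.60)
```

In this toy run only the wider weight interval yields rules that reach 60% confidence with the abdomen interval, while the overlapping narrower interval remains available for other rules. A full implementation would generate larger frequent itemsets level by level, as Apriori does, instead of enumerating pairs.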

5 Conclusions and Future Work

In this paper, we have proposed a novel approach to efficiently find quantitative association rules. The critical part of quantitative association rule mining is partitioning the domains of quantitative attributes into intervals. Previous algorithms dealt with this problem by dividing the domains of quantitative attributes into equi-depth or equi-width intervals, or by using a clustering algorithm on a single attribute (or a set of attributes) alone. They cannot avoid the conflict between the minimum support problem and the minimum confidence problem, and risk missing some important rules.

In our approach, we treat a transaction as an n-dimensional vector, apply a common clustering algorithm to the vectors, and then project the clusters into the domains of the quantitative attributes to form overlapped intervals. We finally use a classical boolean algorithm to find association rules. Our approach takes the relations among attributes and the distances among transactions into account, and can resolve the conflict between the minimum support problem and the minimum confidence problem by allowing intervals to overlap. Experimental results show that our approach can efficiently find quantitative association rules, and can find important association rules that may be missed by previous algorithms.

Since our approach adopts a common clustering algorithm and a classical boolean association rule mining algorithm rather than integrating the two, we believe that it can be further improved in future work by integrating the clustering algorithm and the association rule mining algorithm tightly.

Acknowledgement

This work is partially supported by the National Hi-Tech Development 863 Program of China under grant No. 2002AA104240, and the Informatization Project under the Tenth Five-Year Plan of the Chinese Academy of Sciences under grant No. INF105-SDB. We thank Mr. Longshe Huo and Miss Hong Pan for their helpful suggestions.

References

1. Han, J., Kamber, M.: Data Mining: Concepts and Techniques. China Machine Press and Morgan Kaufmann Publishers (2001)
2. Agrawal, R., Imielinski, T. and Swami, A.: Mining Association Rules between Sets of Items in Large Databases. In Proc. of the 1993 ACM SIGMOD International Conf. on Management of Data, Washington, D.C., May (1993)
3. Agrawal, R. and Srikant, R.: Fast Algorithms for Mining Association Rules in Large Databases. In Proc. of 20th International Conf. on Very Large Data Bases, Santiago, Chile, September (1994)
4. Han, J., Pei, J., Yin, Y. and Mao, R.: Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach. Data Mining and Knowledge Discovery (2004) 8, 53-87
5. Srikant, R. and Agrawal, R.: Mining Quantitative Association Rules in Large Relational Tables. In Proc. of the 1996 ACM SIGMOD International Conf. on Management of Data, Montreal, Canada, June (1996)
6. Miller, R. J. and Yang, Y.: Association Rules over Interval Data. In Proc. of the 1997 ACM SIGMOD International Conf. on Management of Data, Tucson, Arizona, United States, May (1997)
7. Zhang, T., Ramakrishnan, R. and Livny, M.: BIRCH: An Efficient Data Clustering Method for Very Large Databases. In Proc. of the 1996 ACM SIGMOD International Conf. on Management of Data, Montreal, Canada, June (1996)
8. Jain, A. K., Dubes, R. C.: Algorithms for Clustering Data. Prentice Hall, Englewood Cliffs, New Jersey (1988)
9. Kaufman, L. and Rousseeuw, P. J.: Finding Groups in Data: An Introduction to Cluster Analysis. New York: John Wiley and Sons (1990)
10. Bailey, C.: Smart Exercise: Burning Fat, Getting Fit. Houghton-Mifflin Co., Boston (1994)

Transforming Quantitative Transactional Databases into Binary Tables for Association Rule Mining Using the Apriori Algorithm

Transforming Quantitative Transactional Databases into Binary Tables for Association Rule Mining Using the Apriori Algorithm Transforming Quantitative Transactional Databases into Binary Tables for Association Rule Mining Using the Apriori Algorithm Expert Systems: Final (Research Paper) Project Daniel Josiah-Akintonde December

More information

An Evolutionary Algorithm for Mining Association Rules Using Boolean Approach

An Evolutionary Algorithm for Mining Association Rules Using Boolean Approach An Evolutionary Algorithm for Mining Association Rules Using Boolean Approach ABSTRACT G.Ravi Kumar 1 Dr.G.A. Ramachandra 2 G.Sunitha 3 1. Research Scholar, Department of Computer Science &Technology,

More information

Using Association Rules for Better Treatment of Missing Values

Using Association Rules for Better Treatment of Missing Values Using Association Rules for Better Treatment of Missing Values SHARIQ BASHIR, SAAD RAZZAQ, UMER MAQBOOL, SONYA TAHIR, A. RAUF BAIG Department of Computer Science (Machine Intelligence Group) National University

More information

Improved Frequent Pattern Mining Algorithm with Indexing

Improved Frequent Pattern Mining Algorithm with Indexing IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 16, Issue 6, Ver. VII (Nov Dec. 2014), PP 73-78 Improved Frequent Pattern Mining Algorithm with Indexing Prof.

More information

Discovery of Multi Dimensional Quantitative Closed Association Rules by Attributes Range Method

Discovery of Multi Dimensional Quantitative Closed Association Rules by Attributes Range Method Discovery of Multi Dimensional Quantitative Closed Association Rules by Attributes Range Method Preetham Kumar, Ananthanarayana V S Abstract In this paper we propose a novel algorithm for discovering multi

More information

Associating Terms with Text Categories

Associating Terms with Text Categories Associating Terms with Text Categories Osmar R. Zaïane Department of Computing Science University of Alberta Edmonton, AB, Canada Maria-Luiza Antonie Department of Computing Science

More information

A Conflict-Based Confidence Measure for Associative Classification

A Conflict-Based Confidence Measure for Associative Classification A Conflict-Based Confidence Measure for Associative Classification Peerapon Vateekul and Mei-Ling Shyu Department of Electrical and Computer Engineering University of Miami Coral Gables, FL 33124, USA

More information



More information

Discovering interesting rules from financial data

Discovering interesting rules from financial data Discovering interesting rules from financial data Przemysław Sołdacki Institute of Computer Science Warsaw University of Technology Ul. Andersa 13, 00-159 Warszawa Tel: +48 609129896 email:

More information

Temporal Weighted Association Rule Mining for Classification

Temporal Weighted Association Rule Mining for Classification Temporal Weighted Association Rule Mining for Classification Purushottam Sharma and Kanak Saxena Abstract There are so many important techniques towards finding the association rules. But, when we consider

More information

PTclose: A novel algorithm for generation of closed frequent itemsets from dense and sparse datasets

PTclose: A novel algorithm for generation of closed frequent itemsets from dense and sparse datasets : A novel algorithm for generation of closed frequent itemsets from dense and sparse datasets J. Tahmores Nezhad ℵ, M.H.Sadreddini Abstract In recent years, various algorithms for mining closed frequent

More information

Mining Quantitative Maximal Hyperclique Patterns: A Summary of Results

Mining Quantitative Maximal Hyperclique Patterns: A Summary of Results Mining Quantitative Maximal Hyperclique Patterns: A Summary of Results Yaochun Huang, Hui Xiong, Weili Wu, and Sam Y. Sung 3 Computer Science Department, University of Texas - Dallas, USA, {yxh03800,wxw0000}

More information

Mining High Average-Utility Itemsets

Mining High Average-Utility Itemsets Proceedings of the 2009 IEEE International Conference on Systems, Man, and Cybernetics San Antonio, TX, USA - October 2009 Mining High Itemsets Tzung-Pei Hong Dept of Computer Science and Information Engineering

More information



More information

PSON: A Parallelized SON Algorithm with MapReduce for Mining Frequent Sets

PSON: A Parallelized SON Algorithm with MapReduce for Mining Frequent Sets 2011 Fourth International Symposium on Parallel Architectures, Algorithms and Programming PSON: A Parallelized SON Algorithm with MapReduce for Mining Frequent Sets Tao Xiao Chunfeng Yuan Yihua Huang Department

More information

Discovery of Multi-level Association Rules from Primitive Level Frequent Patterns Tree

Discovery of Multi-level Association Rules from Primitive Level Frequent Patterns Tree Discovery of Multi-level Association Rules from Primitive Level Frequent Patterns Tree Virendra Kumar Shrivastava 1, Parveen Kumar 2, K. R. Pardasani 3 1 Department of Computer Science & Engineering, Singhania

More information

To Enhance Projection Scalability of Item Transactions by Parallel and Partition Projection using Dynamic Data Set

To Enhance Projection Scalability of Item Transactions by Parallel and Partition Projection using Dynamic Data Set To Enhance Scalability of Item Transactions by Parallel and Partition using Dynamic Data Set Priyanka Soni, Research Scholar (CSE), MTRI, Bhopal, Dhirendra Kumar Jha, MTRI, Bhopal,

More information

Efficient Mining of Generalized Negative Association Rules

Efficient Mining of Generalized Negative Association Rules 2010 IEEE International Conference on Granular Computing Efficient Mining of Generalized egative Association Rules Li-Min Tsai, Shu-Jing Lin, and Don-Lin Yang Dept. of Information Engineering and Computer

More information

An Efficient Reduced Pattern Count Tree Method for Discovering Most Accurate Set of Frequent itemsets

An Efficient Reduced Pattern Count Tree Method for Discovering Most Accurate Set of Frequent itemsets IJCSNS International Journal of Computer Science and Network Security, VOL.8 No.8, August 2008 121 An Efficient Reduced Pattern Count Tree Method for Discovering Most Accurate Set of Frequent itemsets

More information

Mining Temporal Association Rules in Network Traffic Data

Mining Temporal Association Rules in Network Traffic Data Mining Temporal Association Rules in Network Traffic Data Guojun Mao Abstract Mining association rules is one of the most important and popular task in data mining. Current researches focus on discovering

More information

Parallel Implementation of Apriori Algorithm Based on MapReduce

Parallel Implementation of Apriori Algorithm Based on MapReduce International Journal of Networked and Distributed Computing, Vol. 1, No. 2 (April 2013), 89-96 Parallel Implementation of Apriori Algorithm Based on MapReduce Ning Li * The Key Laboratory of Intelligent

More information



More information

Structure of Association Rule Classifiers: a Review

Structure of Association Rule Classifiers: a Review Structure of Association Rule Classifiers: a Review Koen Vanhoof Benoît Depaire Transportation Research Institute (IMOB), University Hasselt 3590 Diepenbeek, Belgium

More information

Data Structure for Association Rule Mining: T-Trees and P-Trees

Data Structure for Association Rule Mining: T-Trees and P-Trees IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 16, NO. 6, JUNE 2004 1 Data Structure for Association Rule Mining: T-Trees and P-Trees Frans Coenen, Paul Leng, and Shakil Ahmed Abstract Two new

More information

Association Rule Mining from XML Data

Association Rule Mining from XML Data 144 Conference on Data Mining DMIN'06 Association Rule Mining from XML Data Qin Ding and Gnanasekaran Sundarraj Computer Science Program The Pennsylvania State University at Harrisburg Middletown, PA 17057,

More information

Association Rule Mining. Introduction 46. Study core 46

Association Rule Mining. Introduction 46. Study core 46 Learning Unit 7 Association Rule Mining Introduction 46 Study core 46 1 Association Rule Mining: Motivation and Main Concepts 46 2 Apriori Algorithm 47 3 FP-Growth Algorithm 47 4 Assignment Bundle: Frequent

More information

FIT: A Fast Algorithm for Discovering Frequent Itemsets in Large Databases Sanguthevar Rajasekaran

FIT: A Fast Algorithm for Discovering Frequent Itemsets in Large Databases Sanguthevar Rajasekaran FIT: A Fast Algorithm for Discovering Frequent Itemsets in Large Databases Jun Luo Sanguthevar Rajasekaran Dept. of Computer Science Ohio Northern University Ada, OH 4581 Email: Dept. of

More information

A fuzzy k-modes algorithm for clustering categorical data. Citation IEEE Transactions on Fuzzy Systems, 1999, v. 7 n. 4, p.

A fuzzy k-modes algorithm for clustering categorical data. Citation IEEE Transactions on Fuzzy Systems, 1999, v. 7 n. 4, p. Title A fuzzy k-modes algorithm for clustering categorical data Author(s) Huang, Z; Ng, MKP Citation IEEE Transactions on Fuzzy Systems, 1999, v. 7 n. 4, p. 446-452 Issued Date 1999 URL

More information

Performance Analysis of Apriori Algorithm with Progressive Approach for Mining Data

Performance Analysis of Apriori Algorithm with Progressive Approach for Mining Data Performance Analysis of Apriori Algorithm with Progressive Approach for Mining Data Shilpa Department of Computer Science & Engineering Haryana College of Technology & Management, Kaithal, Haryana, India

More information


COMPARISON OF DENSITY-BASED CLUSTERING ALGORITHMS COMPARISON OF DENSITY-BASED CLUSTERING ALGORITHMS Mariam Rehman Lahore College for Women University Lahore, Pakistan Syed Atif Mehdi University of Management and Technology Lahore,

More information

2. Department of Electronic Engineering and Computer Science, Case Western Reserve University
