Efficient Mining of Generalized Negative Association Rules
2010 IEEE International Conference on Granular Computing

Li-Min Tsai, Shu-Jing Lin, and Don-Lin Yang
Dept. of Information Engineering and Computer Science, Feng Chia University, Taichung, 407 Taiwan

Abstract

Most association rule mining research focuses on finding positive relationships between items. However, many studies in intelligent data analysis indicate that negative association rules are as important as positive ones. We therefore propose a method that improves on traditional negative association rule mining. Our method mainly decreases the high computing cost of mining negative association rules and eliminates most non-interesting negative rules. By using a taxonomy tree obtained in advance, we reduce computing costs; through negative interestingness measures, we can quickly extract negative associations from the database.

Keywords: data mining; negative association rule; concept hierarchy; taxonomy; negative interestingness

I. INTRODUCTION

In data mining research, association rule mining is one of the most important topics. Most association rule algorithms focus on finding positive association rules, yet much of the literature in intelligent data analysis shows that negative association rules are as important as positive ones. In particular, negative association rule mining can be applied to a domain that has many types of factors: negative association rules help users quickly decide which factors are important instead of checking too many rules. For example, in bioinformatics we may find a negative association rule such as {if protein A appears, then protein B and protein C will not appear}. This kind of rule is useful to biologists who study diseases or develop drugs. In the traditional approach, finding negative association rules encounters a large search space and generates too many non-interesting rules.
Therefore, an efficient and useful algorithm for finding negative association rules is very valuable. Our research focuses on reducing computing time and on finding interesting negative association rules. The proposed algorithm reduces computing time and, through the domain taxonomy tree, makes interesting negative association rules easier to find.

II. RELATED WORK

The Apriori algorithm [1] is a basic algorithm for mining association rules, and many improved algorithms have been built on it. For example, the partition algorithm [2] was developed to reduce the number of database scans; the FP-growth algorithm [3] was designed to speed up computation; and the generalized association rule [4] extends the association rule to a taxonomy, requiring that, for a rule $X \Rightarrow Y$, no item in $Y$ is an ancestor of any item in $X$. Nevertheless, most improved algorithms still focus on positive association rules or on rare association rule mining [5].

Mining negative association rules is another issue that has attracted some researchers' attention [6]. For example, when we determine strategies for product placement and purchase analysis, we must weigh the pros and cons of many factors. To minimize negative impacts and increase possible benefits [7], managers must consider which side effects are unlikely to occur when the expected advantage factor is selected. In such a situation, a negative association rule like $X \Rightarrow \neg Y$ is useful, because it tells us that $Y$ (e.g., a disadvantage factor) does not occur or rarely occurs when $X$ (e.g., an advantage factor) shows up.

Algorithms for discovering negative association rules are not widely discussed [8,9]. The discovery procedure of these algorithms can be decomposed into three stages: (1) find a set of positive rules; (2) generate negative rules based on existing positive rules and domain knowledge; (3) prune the redundant rules.

Savasere et al. [8] generate negative association rules based on a complex measure of rule parts. A negative association rule in [8] is an implication of the form $X \nRightarrow Y$, where $X \cap Y = \emptyset$, $X$ is called the antecedent, and $Y$ is the consequent of the rule. Every negative association rule has a rule interest measure RI, defined as

$$RI = \frac{\varepsilon[\mathrm{support}(X \cup Y)] - \mathrm{support}(X \cup Y)}{\mathrm{support}(X)} \qquad (1)$$

where $\varepsilon[\mathrm{support}(X)]$ is the expected support of an itemset $X$. The rule interest RI is negatively related to the actual support of the itemset $X \cup Y$: it is highest when the actual support is zero, and zero when the actual support equals the expected support.

Yuan et al. [9] generate negative association rules using a concept similar to that of [8]. The three measures used in their algorithm are minimum support, minimum confidence, and SM (salience measure). A negative association rule in [9] is an implication of the form $X \Rightarrow \neg Y$ (or $\neg X \Rightarrow Y$), where $X \subseteq I$, $Y \subseteq I$, and $X \cap Y = \emptyset$. The salience measure provides clues to potentially useful negative rules and is defined as

$$SM = \left| \mathrm{conf}(r') - E(\mathrm{conf}(r')) \right| \qquad (2)$$

where $\mathrm{conf}(r')$ is the actual confidence of rule $r'$ and $E(\mathrm{conf}(r'))$ is its estimated confidence. A large value of SM is evidence for accepting the hypothesis that $X' \Rightarrow Y$ is false; that is, $X' \Rightarrow \neg Y$ may be true. In brief, to qualify as a negative rule, a rule must satisfy two conditions: first, there must be a large deviation between the estimated and actual confidence values and, second, the support and confidence must be greater than the required minimums.

In this paper, we focus on efficiently mining negative association rules. Two reasons motivate us: (1) negative association rules are as important as positive rules; (2) traditional approaches lead to a very large number of rules and to expensive computation. We therefore developed a method that addresses these problems. It reduces computing time and finds the interesting negative rules according to the user's requirements.

The rest of the paper is organized as follows. Section III gives the detailed process of the proposed algorithm. The experiment results and their discussion are presented in Section IV. Finally, Section V concludes the paper.

III. PROPOSED METHOD

Negative association rule discovery encounters such a large search space that it may take more computing time than traditional positive rule discovery with intuitive mining algorithms like Apriori. Therefore, we propose an improved approach called the Generalized Negative Association Rule (GNAR) algorithm.
For efficiency, we scan the database once and transform the transactions into a space-reduced structure, called the vertical TID table, that is stored in main memory. We assume that the taxonomy tree is available in advance. The taxonomy tree assists in creating the vertical TID table: through it, we can filter out transactions that do not belong to the domain and contribute nothing to the end result. In addition to eliminating a large number of useless transactions, the information in the taxonomy tree is used to mine negative association rules. In the mining steps of GNAR, we use negative interestingness and negative confidence to increase the accuracy of the mined results. Pruning techniques are used to remove non-interesting negative association rules.

A. The Concepts of GNAR

The concepts used in GNAR fall into two parts: concept hierarchy and negative interestingness. Since the search space of negative association rule mining is extremely large, a concise representation of negative association rules must be developed; a concept hierarchy, or taxonomy, serves this purpose. The second concept is negative interestingness. As mentioned above, negative association rules can be thought of as a complement of positive association rules, but their nature is entirely different. Traditional measures such as support and confidence, which are used for positive association rules, are therefore no longer appropriate, and suitable measures for mining negative association rules are needed. The two concepts are described in more detail below.

Concept hierarchy: A concept hierarchy allows a series of mappings from a set of low-level concepts to higher-level, more general concepts. Concept hierarchies are a useful form of background knowledge in that they allow raw data to be represented at generalized levels of abstraction.
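The mapping from low-level to higher-level concepts can be pictured as a child-to-parent table. The sketch below is only an illustration under stated assumptions: the item names loosely follow a snacks hierarchy, and `PARENT`, `ancestors`, and `roll_up` are hypothetical names that do not come from the GNAR implementation.

```python
# Hypothetical child -> parent map; the item names are illustrative only.
PARENT = {
    "Coke": "Soft Drink",
    "Pepsi": "Soft Drink",
    "Brand A cracker": "Cracker",
    "Brand B cracker": "Cracker",
    "Soft Drink": "Snacks",
    "Cracker": "Snacks",
}


def ancestors(item):
    """Return the ancestors of `item`, nearest first (empty for the root)."""
    chain = []
    while item in PARENT:
        item = PARENT[item]
        chain.append(item)
    return chain


def roll_up(transaction, level=1):
    """Replace every item by its ancestor `level` steps up the hierarchy.

    Items whose chain is shorter than `level` are mapped to their topmost
    ancestor; items without any ancestor are kept unchanged.
    """
    generalized = set()
    for item in transaction:
        chain = ancestors(item)
        if chain:
            generalized.add(chain[min(level, len(chain)) - 1])
        else:
            generalized.add(item)
    return generalized


if __name__ == "__main__":
    print(ancestors("Pepsi"))
    print(roll_up({"Pepsi", "Brand B cracker"}))
```

Storing only parent links keeps the structure small, and the ancestor chain of any item can be recovered by following those links upward.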
Generalization of the data, or rolling up, is achieved by replacing primitive-level data with higher-level concepts. By using a concept hierarchy, we can condense negative association rules into a more succinct form.

Figure 1. A concept hierarchy for the dimension snacks

Fig. 1 shows a concept hierarchy for the dimension snacks. In this paper, the terms concept hierarchy and taxonomy (tree) are used interchangeably. Taking Fig. 1 as an example, if a rule of the form $R$: Pepsi $\Rightarrow \neg$Brand B cracker is generated, then the rule $R_1$: Soft Drink $\Rightarrow \neg$Cracker is also generated and holds a larger support than $R$. This kind of concept suits negative association rule mining because the number of generated negative association rules would be greater than the number of generated positive ones. A negative association rule is therefore more easily understood if it is presented with a concept hierarchy, which also allows users to view the data at more meaningful and explicit levels of abstraction.

Fig. 1 contains three kinds of nodes with different uses. Only the items of leaf nodes are present in the database; the other two kinds (root and internal nodes) are used to present the concept hierarchy. Our method mines three types of generalized negative association rules:

$[\neg]$Coke $\Rightarrow [\neg]$Brand A (leaf $\Rightarrow$ leaf)
$[\neg]$Soft Drink $\Rightarrow [\neg]$Brand B (internal node $\Rightarrow$ leaf)
$[\neg]$Soft Drink $\Rightarrow [\neg]$Cracker (internal node $\Rightarrow$ internal node)

In our proposed method, we assume that this kind of taxonomy tree is provided in advance. Through the taxonomy tree, we can first eliminate transactions that do not belong to the domain or contain user-specified items. After the support of each item has been counted, the taxonomy tree is pruned further into a smaller one. The taxonomy information is retained for the subsequent negative association rule mining process.

Negative interestingness: Before introducing negative interestingness, we discuss the relationships between items. Here we only consider binary relationships. The possible states are shown in Table 1, where $X$ and $Y$ are different items in a database. Each item has two conditions, presence or absence, so a four-state table is created for items $X$ and $Y$. From Table 1, we can easily deduce the support and confidence used for mining traditional association rules. For instance, the support of the rule $X \Rightarrow Y$ can be expressed as $\frac{a}{a+b+c+d}$ and its confidence as $\frac{a}{a+b}$. To extract interesting negative association rules from large databases, we must define a proper measure for negative association rule mining. From Table 1, we find that attribute $a$ is the condition in which $X$ and $Y$ occur at the same time. The others have at least one negative (absence) factor.
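The four states and the two traditional measures can be made concrete with a small sketch. It assumes that a, b, c and d count the transactions that contain both X and Y, X only, Y only, and neither item, respectively, which is the reading implied by the confidence a/(a+b). The function names and the toy transactions are hypothetical.

```python
def contingency(transactions, x, y):
    """Count the four presence/absence states of items x and y.

    Assumed cell layout: a = x and y present, b = x only,
    c = y only, d = neither present.
    """
    a = b = c = d = 0
    for t in transactions:
        has_x, has_y = x in t, y in t
        if has_x and has_y:
            a += 1
        elif has_x:
            b += 1
        elif has_y:
            c += 1
        else:
            d += 1
    return a, b, c, d


def support(a, b, c, d):
    """Support of X => Y: fraction of transactions containing both items."""
    return a / (a + b + c + d)


def confidence(a, b, c, d):
    """Confidence of X => Y: fraction of X-transactions that also contain Y."""
    return a / (a + b) if (a + b) else 0.0


if __name__ == "__main__":
    toy = [{"X", "Y"}, {"X"}, {"X"}, {"Y"}, set()]
    cells = contingency(toy, "X", "Y")
    print(cells, support(*cells), confidence(*cells))
```

Only the cell a involves the joint presence of both items, which is why measures built on a alone say little about negative relationships.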
Therefore, instead of using the traditional measures, $\frac{a}{a+b+c+d}$ for support and $\frac{a}{a+b}$ for confidence, we define a measure for mining interesting negative association rules as follows:

$$\text{Negative interestingness} = w_4 \left( \frac{w_1 b + w_2 c + w_3 d}{a + b + c + d} \right)^{w_5} \qquad (3)$$

To the best of our knowledge, this measure is a general case that covers most dissimilarity measures. For example, the dissimilarity measures binary pattern difference, average squared, and binary Euclidean are special cases of negative interestingness. Users may adjust the flexible parameters ($w_1$ to $w_5$) according to their applications and specific demands. Moreover, we can easily define the confidence of negative association rules from the four states of Table 1. The three types of negative association rules and their confidences are:

$X \Rightarrow \neg Y$ : $\frac{b}{a+b}$
$\neg X \Rightarrow Y$ : $\frac{c}{c+d}$
$\neg X \Rightarrow \neg Y$ : $\frac{d}{c+d}$

TABLE 1. Binary relations

              Y present   Y absent
X present         a           b
X absent          c           d

B. The Process of GNAR

In this section, we describe the proposed GNAR algorithm in three steps.

(1) First, we scan the database into a vertical TID table in main memory. The vertical TID table is a memory space-reduced structure. It transforms transactions into bit-map strings according to the data distribution of the original database. If the original database is dense (most items occur in more than half of all transactions), the vertical TID table records, for each item, the TIDs of the transactions in which the item does not occur. If the original database is sparse, the vertical TID table records only the TIDs of the transactions in which each item occurs. Because GNAR is a memory-based algorithm, its use of memory space must be considered carefully. The vertical TID table can be applied to both dense and sparse databases.
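Step (1) and the measures defined above can be tied together in a small sketch. It makes several assumptions that are not stated in the paper: each item's bit-map is held in a Python integer, only the sparse variant (recording the transactions in which an item occurs) is shown, and Eq. (3) is read as w4 times the weighted fraction raised to the power w5. All function names are hypothetical.

```python
def build_vertical_table(transactions):
    """Single scan: map each item to a bitmap whose bit i is set
    when transaction i contains the item (sparse variant only)."""
    table = {}
    for tid, items in enumerate(transactions):
        for item in items:
            table[item] = table.get(item, 0) | (1 << tid)
    return table


def popcount(bits):
    """Number of set bits in a non-negative integer."""
    return bin(bits).count("1")


def cell_counts(table, n, x, y):
    """Derive the four cells (a, b, c, d) for items x and y with bit operations."""
    full = (1 << n) - 1  # mask covering all n transactions
    bx, by = table.get(x, 0), table.get(y, 0)
    a = popcount(bx & by)
    b = popcount(bx & ~by & full)
    c = popcount(~bx & by & full)
    d = n - a - b - c
    return a, b, c, d


def negative_interestingness(a, b, c, d, w=(1, 1, 1, 1, 1)):
    """Assumed reading of Eq. (3): w4 * ((w1*b + w2*c + w3*d) / n) ** w5."""
    w1, w2, w3, w4, w5 = w
    return w4 * ((w1 * b + w2 * c + w3 * d) / (a + b + c + d)) ** w5


def negative_confidences(a, b, c, d):
    """Confidence of the three negative rule types (0.0 when undefined)."""
    return {
        "X => not Y": b / (a + b) if (a + b) else 0.0,
        "not X => Y": c / (c + d) if (c + d) else 0.0,
        "not X => not Y": d / (c + d) if (c + d) else 0.0,
    }


if __name__ == "__main__":
    toy = [{"X", "Y"}, {"X"}, {"X"}, {"Y"}, set()]
    table = build_vertical_table(toy)
    cells = cell_counts(table, len(toy), "X", "Y")
    print(cells, negative_interestingness(*cells), negative_confidences(*cells))
```

Under this reading, setting all five weights to 1 reduces the measure to (b + c + d)/(a + b + c + d), that is, one minus the support of X and Y occurring together.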
(2) Second, we assume that the taxonomy tree is always available. According to this taxonomy tree, we can eliminate items and transactions that do not belong to the domain. Then, with a minimum support, we find L1 from the vertical TID table and calculate the support of each internal node in the taxonomy tree. In this step, the support of each internal node and of the root node can be calculated with the OR operation. In the GNAR process, we use the negative interestingness described above in place of the support measure, except when forming L1.

(3) After calculating the support of all internal nodes in the taxonomy tree, we generate the frequent taxonomy itemsets T. Then we generate C2 from L1. When counting the support for L2, we use negative interestingness as the threshold and apply a pruning technique: candidates in C2 whose items belong to the same parent node in the taxonomy tree are pruned. From L2, we generate R2 with another pruning technique: for a rule of the form $[\neg]I_1 \Rightarrow [\neg]I_2$ with $I_1 \cap I_2 = \emptyset$, no item in $I_2$ may be an ancestor of any item in $I_1$. After that, we construct an association graph based on L1, L2 and the frequent taxonomy items T. The association graph is used to join frequent taxonomy items with the original large items in the database and to keep the taxonomy information for the subsequent mining process. Based on the association graph, we can produce k-generalized negative association rules. In our GNAR algorithm, we only consider generalized negative association rules of the form $[\neg]\{ItemsetA\} \Rightarrow [\neg]\{ItemsetB\}$, where the items within each pair of braces (ItemsetA or ItemsetB) are positively associated with one another. Fig. 2 shows our GNAR algorithm. Negative confidence is used to extract the three types of rules in each rule-generation step.

Figure 2. The GNAR algorithm

IV. EXPERIMENT RESULT AND DISCUSSION

We implemented the GNAR algorithm in Visual C++. The experiments were performed on a personal computer with an Intel Pentium 4 2.4A GHz processor and 512 MB of DDR 266 MHz main memory. The test data were produced with the IBM dataset generator [11].

A. Experimental Parameters

Table 2 shows the parameter settings used to generate the three test databases. T is the average number of items per transaction, I is the average length of maximal frequent patterns, and D is the total number of transactions. In addition to comparing GNAR with the traditional negative association rule algorithm, we also ran experiments with different parameter settings of our algorithm (test1~test5 in Table 3).

TABLE 2. Test databases

TABLE 3. Test parameters
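The dataset names used in the experiments, such as T10I6D10K, encode the three generator parameters T, I and D described above. The helper below is a small sketch of that naming convention; it assumes that the suffix K means thousands and M means millions, and `parse_dataset_name` is a hypothetical name that is not part of the IBM generator.

```python
import re

# Assumed multipliers for the optional suffix of the D field.
_SUFFIX = {"": 1, "K": 1_000, "M": 1_000_000}


def parse_dataset_name(name):
    """Split a name such as 'T10I6D10K' into its T, I and D parameters."""
    match = re.fullmatch(r"T(\d+)I(\d+)D(\d+)([KM]?)", name)
    if match is None:
        raise ValueError(f"unrecognized dataset name: {name!r}")
    t, i, d, suffix = match.groups()
    return {
        "avg_items_per_transaction": int(t),   # T
        "avg_max_pattern_length": int(i),      # I
        "num_transactions": int(d) * _SUFFIX[suffix],  # D
    }


if __name__ == "__main__":
    for name in ("T10I6D10K", "T15I12D100K", "T12I8D50K"):
        print(name, parse_dataset_name(name))
```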
B. Experiment results and discussion

We ran both GNAR and the traditional algorithm on the three datasets with the parameter settings test1~test5 of Table 3. The taxonomy data have an average of 6 levels and 11 categories. We use dataset T10I6D10K for the first experiment. The weight parameters of GNAR are w1 = w2 = w3 = w4 = w5 = 1, and the negative interestingness and negative confidence thresholds are both 0.6.

Figure 3. Execution time experiments on T10I6D10K

Fig. 3 shows the result of the experiment on database T10I6D10K. The X-axis represents the initial support of GNAR and the support of the traditional negative association rule algorithm, ranging from 0.5 to 1.0. The Y-axis represents the execution time of the two algorithms at each support value. Fig. 3 shows that GNAR takes less time than the traditional algorithm in most cases. When the support is close to 0.5, GNAR performs much better than the traditional negative association rule algorithm; when the support is close to 1, the two algorithms perform similarly.

We use dataset T15I12D100K in the second experiment to analyze the performance of the algorithms when the average length of maximal frequent patterns is long. In this experiment, we set the parameters of GNAR to w1=1, w2=1, w3=1, w4=1, w5=1 and negative interestingness = 0.6, with different Ini_Sup values for GNAR and different supports for the traditional algorithm. We found the traditional algorithm to be inefficient, especially when the average size of maximal potentially large itemsets is doubled from 6 to 12. Fig. 4 shows that the traditional algorithm takes more time than GNAR to generate negative association rules when the support is close to 0.5. When the support is close to 1, the performance of GNAR and the traditional algorithm is almost the same.

Figure 4. T15I12D100K

In the third experiment, we use different parameter settings (test1~test5) from Table 3 to analyze the generated negative association rules for dataset T12I8D50K.
The taxonomy data used here have an average number of levels of 6 and an average number of categories of 11. Fig. 5 shows the result of this experiment. We found that test4 generated the largest number of negative association rules. This is because the denominator of negative interestingness decreases, so that many rules can be extracted by using negative interestingness. On the other hand, test3 and test5 generated far fewer rules than the other tests, because the tested database contains fewer negative relationships.

Figure 5. Generated negative association rules using different testing settings of test1~test5

In the last experiment, we use different taxonomy data to compare the effect of different taxonomic structures on our GNAR algorithm. In this experiment, we set the parameters of GNAR to w1=1, w2=1, w3=1, w4=1, w5=1, negative interestingness = 0.6, Ini_Sup = 0.6 and negative confidence = 0.6. Two taxonomic structures are compared. Taxonomy1 has an average number of levels of 3 and an average number of categories of 11. Taxonomy2 has an average number of levels of 9 and an average number of categories of 11. The tested dataset is T12I8D50K. Fig. 6 shows the execution time for these two taxonomies. From Fig. 6, we found that our method is more efficient when the taxonomy has more levels. The reason is that the fan-out of the taxonomy strongly affects the performance of GNAR. Since Taxonomy2 has more levels than Taxonomy1 for the same number of categories, it has a smaller fan-out. Therefore, the GNAR algorithm mines negative association rules more efficiently on a taxonomy with more levels.

Figure 6. Comparison of different taxonomies

V. CONCLUSION

A considerable body of work has been carried out on the problem of positive association rule mining, but negative association rule mining has received very little attention. Negative association rule mining can be applied to a domain that has various types of factors, and it can help users quickly decide which factors are important instead of checking too many rules. In this paper, we proposed an efficient method of mining generalized negative association rules. Instead of mining negative association rules with an intuitive method, we use negative interestingness to characterize the property of negative association rules and justify its effectiveness. With the taxonomy tree information, we reduce the search space of the mining process and propose a useful representation of generalized negative association rules. Mining sequential patterns with negative conclusions and developing scalable parallel algorithms are the two major directions of our future research.

ACKNOWLEDGMENT

This research was supported by the National Science Council, Taiwan, under grants NSC E MY2 and NSC E

REFERENCES

[1] R. Agrawal and R. Srikant, "Fast Algorithms for Mining Association Rules in Large Databases," Proc. of the 20th International Conference on Very Large Databases, Santiago, Chile.
[2] A. Savasere, E. Omiecinski, and S. B. Navathe, "An Efficient Algorithm for Mining Association Rules in Large Databases," Proceedings of the 21st International Conference on Very Large Databases, Zurich, Switzerland.
[3] J. Han, J. Pei, and Y. Yin, "Mining Frequent Patterns without Candidate Generation," Proceedings of 2000 ACM SIGMOD International Conference on Management of Data, May.
[4] R. Srikant and R. Agrawal, "Mining Generalized Association Rules," VLDB 95, Zurich, Switzerland.
[5] Y. S. Koh and N. Rountree, Rare Association Rule Mining and Knowledge Discovery: Technologies for Infrequent and Critical Event Detection, Information Science Reference, August.
[6] X. Wu, C. Zhang, and S. Zhang, "Efficient Mining of Both Positive and Negative Association Rules," ACM Transactions on Information Systems, Vol. 22, No. 3, July 2004.
[7] Chengqi Zhang, Shichao Zhang, and Berno Eugene Heymer, Association Rule Mining: Models and Algorithms (Lecture Notes in Artificial Intelligence), Springer Verlag, July.
[8] A. Savasere, E. Omiecinski, and S. Navathe, "Mining for strong negative associations in a large database of customer transactions," Proceedings of International Conference on Data Engineering, February.
[9] X. Yuan, B. Buckles, Z. Yuan, and J. Zhang, "Mining Negative Association Rules," Proceedings of the Seventh IEEE International Symposium on Computers and Communications (ISCC 02).
[10] C. Cornelis, P. Yan, X. Zhang, and G. Chen, "Mining Positive and Negative Association Rules from Large Databases," IEEE Conference on Cybernetics and Intelligent Systems.
[11] IBM Almaden Research Center, Synthetic data generation code, software/quest/
More informationResearch of Improved FP-Growth (IFP) Algorithm in Association Rules Mining
International Journal of Engineering Science Invention (IJESI) ISSN (Online): 2319 6734, ISSN (Print): 2319 6726 www.ijesi.org PP. 24-31 Research of Improved FP-Growth (IFP) Algorithm in Association Rules
More informationConcurrent Processing of Frequent Itemset Queries Using FP-Growth Algorithm
Concurrent Processing of Frequent Itemset Queries Using FP-Growth Algorithm Marek Wojciechowski, Krzysztof Galecki, Krzysztof Gawronek Poznan University of Technology Institute of Computing Science ul.
More informationUsing Association Rules for Better Treatment of Missing Values
Using Association Rules for Better Treatment of Missing Values SHARIQ BASHIR, SAAD RAZZAQ, UMER MAQBOOL, SONYA TAHIR, A. RAUF BAIG Department of Computer Science (Machine Intelligence Group) National University
More informationA Graph-Based Approach for Mining Closed Large Itemsets
A Graph-Based Approach for Mining Closed Large Itemsets Lee-Wen Huang Dept. of Computer Science and Engineering National Sun Yat-Sen University huanglw@gmail.com Ye-In Chang Dept. of Computer Science and
More informationGraph Based Approach for Finding Frequent Itemsets to Discover Association Rules
Graph Based Approach for Finding Frequent Itemsets to Discover Association Rules Manju Department of Computer Engg. CDL Govt. Polytechnic Education Society Nathusari Chopta, Sirsa Abstract The discovery
More informationAn Efficient Algorithm for finding high utility itemsets from online sell
An Efficient Algorithm for finding high utility itemsets from online sell Sarode Nutan S, Kothavle Suhas R 1 Department of Computer Engineering, ICOER, Maharashtra, India 2 Department of Computer Engineering,
More informationMaintenance of the Prelarge Trees for Record Deletion
12th WSEAS Int. Conf. on APPLIED MATHEMATICS, Cairo, Egypt, December 29-31, 2007 105 Maintenance of the Prelarge Trees for Record Deletion Chun-Wei Lin, Tzung-Pei Hong, and Wen-Hsiang Lu Department of
More information620 HUANG Liusheng, CHEN Huaping et al. Vol.15 this itemset. Itemsets that have minimum support (minsup) are called large itemsets, and all the others
Vol.15 No.6 J. Comput. Sci. & Technol. Nov. 2000 A Fast Algorithm for Mining Association Rules HUANG Liusheng (ΛΠ ), CHEN Huaping ( ±), WANG Xun (Φ Ψ) and CHEN Guoliang ( Ξ) National High Performance Computing
More informationItem Set Extraction of Mining Association Rule
Item Set Extraction of Mining Association Rule Shabana Yasmeen, Prof. P.Pradeep Kumar, A.Ranjith Kumar Department CSE, Vivekananda Institute of Technology and Science, Karimnagar, A.P, India Abstract:
More informationMining Temporal Association Rules in Network Traffic Data
Mining Temporal Association Rules in Network Traffic Data Guojun Mao Abstract Mining association rules is one of the most important and popular task in data mining. Current researches focus on discovering
More informationAn Algorithm for Mining Frequent Itemsets from Library Big Data
JOURNAL OF SOFTWARE, VOL. 9, NO. 9, SEPTEMBER 2014 2361 An Algorithm for Mining Frequent Itemsets from Library Big Data Xingjian Li lixingjianny@163.com Library, Nanyang Institute of Technology, Nanyang,
More informationComparing the Performance of Frequent Itemsets Mining Algorithms
Comparing the Performance of Frequent Itemsets Mining Algorithms Kalash Dave 1, Mayur Rathod 2, Parth Sheth 3, Avani Sakhapara 4 UG Student, Dept. of I.T., K.J.Somaiya College of Engineering, Mumbai, India
More informationMining High Average-Utility Itemsets
Proceedings of the 2009 IEEE International Conference on Systems, Man, and Cybernetics San Antonio, TX, USA - October 2009 Mining High Itemsets Tzung-Pei Hong Dept of Computer Science and Information Engineering
More informationINTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET)
INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET) International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367(Print), ISSN 0976 6367(Print) ISSN 0976 6375(Online)
More informationA NEW ASSOCIATION RULE MINING BASED ON FREQUENT ITEM SET
A NEW ASSOCIATION RULE MINING BASED ON FREQUENT ITEM SET Ms. Sanober Shaikh 1 Ms. Madhuri Rao 2 and Dr. S. S. Mantha 3 1 Department of Information Technology, TSEC, Bandra (w), Mumbai s.sanober1@gmail.com
More informationData Mining: Concepts and Techniques. Chapter 5. SS Chung. April 5, 2013 Data Mining: Concepts and Techniques 1
Data Mining: Concepts and Techniques Chapter 5 SS Chung April 5, 2013 Data Mining: Concepts and Techniques 1 Chapter 5: Mining Frequent Patterns, Association and Correlations Basic concepts and a road
More informationETP-Mine: An Efficient Method for Mining Transitional Patterns
ETP-Mine: An Efficient Method for Mining Transitional Patterns B. Kiran Kumar 1 and A. Bhaskar 2 1 Department of M.C.A., Kakatiya Institute of Technology & Science, A.P. INDIA. kirankumar.bejjanki@gmail.com
More informationAN IMPROVED GRAPH BASED METHOD FOR EXTRACTING ASSOCIATION RULES
AN IMPROVED GRAPH BASED METHOD FOR EXTRACTING ASSOCIATION RULES ABSTRACT Wael AlZoubi Ajloun University College, Balqa Applied University PO Box: Al-Salt 19117, Jordan This paper proposes an improved approach
More informationMaintenance of fast updated frequent pattern trees for record deletion
Maintenance of fast updated frequent pattern trees for record deletion Tzung-Pei Hong a,b,, Chun-Wei Lin c, Yu-Lung Wu d a Department of Computer Science and Information Engineering, National University
More informationEfficient Tree Based Structure for Mining Frequent Pattern from Transactional Databases
International Journal of Computational Engineering Research Vol, 03 Issue, 6 Efficient Tree Based Structure for Mining Frequent Pattern from Transactional Databases Hitul Patel 1, Prof. Mehul Barot 2,
More informationAn Approximate Approach for Mining Recently Frequent Itemsets from Data Streams *
An Approximate Approach for Mining Recently Frequent Itemsets from Data Streams * Jia-Ling Koh and Shu-Ning Shin Department of Computer Science and Information Engineering National Taiwan Normal University
More informationAn Improved Apriori Algorithm for Association Rules
Research article An Improved Apriori Algorithm for Association Rules Hassan M. Najadat 1, Mohammed Al-Maolegi 2, Bassam Arkok 3 Computer Science, Jordan University of Science and Technology, Irbid, Jordan
More informationMining Frequent Patterns with Counting Inference at Multiple Levels
International Journal of Computer Applications (097 7) Volume 3 No.10, July 010 Mining Frequent Patterns with Counting Inference at Multiple Levels Mittar Vishav Deptt. Of IT M.M.University, Mullana Ruchika
More informationAssociation Rules Mining using BOINC based Enterprise Desktop Grid
Association Rules Mining using BOINC based Enterprise Desktop Grid Evgeny Ivashko and Alexander Golovin Institute of Applied Mathematical Research, Karelian Research Centre of Russian Academy of Sciences,
More informationMining of Web Server Logs using Extended Apriori Algorithm
International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) International Journal of Emerging Technologies in Computational
More informationDMSA TECHNIQUE FOR FINDING SIGNIFICANT PATTERNS IN LARGE DATABASE
DMSA TECHNIQUE FOR FINDING SIGNIFICANT PATTERNS IN LARGE DATABASE Saravanan.Suba Assistant Professor of Computer Science Kamarajar Government Art & Science College Surandai, TN, India-627859 Email:saravanansuba@rediffmail.com
More informationMining for Mutually Exclusive Items in. Transaction Databases
Mining for Mutually Exclusive Items in Transaction Databases George Tzanis and Christos Berberidis Department of Informatics, Aristotle University of Thessaloniki Thessaloniki 54124, Greece {gtzanis, berber,
More informationA Data Mining Framework for Extracting Product Sales Patterns in Retail Store Transactions Using Association Rules: A Case Study
A Data Mining Framework for Extracting Product Sales Patterns in Retail Store Transactions Using Association Rules: A Case Study Mirzaei.Afshin 1, Sheikh.Reza 2 1 Department of Industrial Engineering and
More informationUSING FREQUENT PATTERN MINING ALGORITHMS IN TEXT ANALYSIS
INFORMATION SYSTEMS IN MANAGEMENT Information Systems in Management (2017) Vol. 6 (3) 213 222 USING FREQUENT PATTERN MINING ALGORITHMS IN TEXT ANALYSIS PIOTR OŻDŻYŃSKI, DANUTA ZAKRZEWSKA Institute of Information
More informationINFREQUENT WEIGHTED ITEM SET MINING USING NODE SET BASED ALGORITHM
INFREQUENT WEIGHTED ITEM SET MINING USING NODE SET BASED ALGORITHM G.Amlu #1 S.Chandralekha #2 and PraveenKumar *1 # B.Tech, Information Technology, Anand Institute of Higher Technology, Chennai, India
More informationStudy on Mining Weighted Infrequent Itemsets Using FP Growth
www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 4 Issue 6 June 2015, Page No. 12719-12723 Study on Mining Weighted Infrequent Itemsets Using FP Growth K.Hemanthakumar
More informationA Hierarchical Document Clustering Approach with Frequent Itemsets
A Hierarchical Document Clustering Approach with Frequent Itemsets Cheng-Jhe Lee, Chiun-Chieh Hsu, and Da-Ren Chen Abstract In order to effectively retrieve required information from the large amount of
More informationSurvey: Efficent tree based structure for mining frequent pattern from transactional databases
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661, p- ISSN: 2278-8727Volume 9, Issue 5 (Mar. - Apr. 2013), PP 75-81 Survey: Efficent tree based structure for mining frequent pattern from
More informationAvailable online at ScienceDirect. Procedia Computer Science 45 (2015 )
Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 45 (2015 ) 101 110 International Conference on Advanced Computing Technologies and Applications (ICACTA- 2015) An optimized
More informationUAPRIORI: AN ALGORITHM FOR FINDING SEQUENTIAL PATTERNS IN PROBABILISTIC DATA
UAPRIORI: AN ALGORITHM FOR FINDING SEQUENTIAL PATTERNS IN PROBABILISTIC DATA METANAT HOOSHSADAT, SAMANEH BAYAT, PARISA NAEIMI, MAHDIEH S. MIRIAN, OSMAR R. ZAÏANE Computing Science Department, University
More informationEFFICIENT TRANSACTION REDUCTION IN ACTIONABLE PATTERN MINING FOR HIGH VOLUMINOUS DATASETS BASED ON BITMAP AND CLASS LABELS
EFFICIENT TRANSACTION REDUCTION IN ACTIONABLE PATTERN MINING FOR HIGH VOLUMINOUS DATASETS BASED ON BITMAP AND CLASS LABELS K. Kavitha 1, Dr.E. Ramaraj 2 1 Assistant Professor, Department of Computer Science,
More informationA Modern Search Technique for Frequent Itemset using FP Tree
A Modern Search Technique for Frequent Itemset using FP Tree Megha Garg Research Scholar, Department of Computer Science & Engineering J.C.D.I.T.M, Sirsa, Haryana, India Krishan Kumar Department of Computer
More informationMining Quantitative Maximal Hyperclique Patterns: A Summary of Results
Mining Quantitative Maximal Hyperclique Patterns: A Summary of Results Yaochun Huang, Hui Xiong, Weili Wu, and Sam Y. Sung 3 Computer Science Department, University of Texas - Dallas, USA, {yxh03800,wxw0000}@utdallas.edu
More informationApplying Objective Interestingness Measures. in Data Mining Systems. Robert J. Hilderman and Howard J. Hamilton. Department of Computer Science
Applying Objective Interestingness Measures in Data Mining Systems Robert J. Hilderman and Howard J. Hamilton Department of Computer Science University of Regina Regina, Saskatchewan, Canada SS 0A fhilder,hamiltong@cs.uregina.ca
More informationMining Negative Rules using GRD
Mining Negative Rules using GRD D. R. Thiruvady and G. I. Webb School of Computer Science and Software Engineering, Monash University, Wellington Road, Clayton, Victoria 3800 Australia, Dhananjay Thiruvady@hotmail.com,
More informationAn Algorithm for Interesting Negated Itemsets for Negative Association Rules from XML Stream Data
An Algorithm for Interesting Negated Itemsets for Negative Association Rules from XML Stream Data Juryon Paik Department of Digital Information & Statistics Pyeongtaek University Pyeongtaek-si S.Korea
More informationCSCI6405 Project - Association rules mining
CSCI6405 Project - Association rules mining Xuehai Wang xwang@ca.dalc.ca B00182688 Xiaobo Chen xiaobo@ca.dal.ca B00123238 December 7, 2003 Chen Shen cshen@cs.dal.ca B00188996 Contents 1 Introduction: 2
More informationAn Approach for Privacy Preserving in Association Rule Mining Using Data Restriction
International Journal of Engineering Science Invention Volume 2 Issue 1 January. 2013 An Approach for Privacy Preserving in Association Rule Mining Using Data Restriction Janakiramaiah Bonam 1, Dr.RamaMohan
More informationSQL Based Frequent Pattern Mining with FP-growth
SQL Based Frequent Pattern Mining with FP-growth Shang Xuequn, Sattler Kai-Uwe, and Geist Ingolf Department of Computer Science University of Magdeburg P.O.BOX 4120, 39106 Magdeburg, Germany {shang, kus,
More informationTadeusz Morzy, Maciej Zakrzewicz
From: KDD-98 Proceedings. Copyright 998, AAAI (www.aaai.org). All rights reserved. Group Bitmap Index: A Structure for Association Rules Retrieval Tadeusz Morzy, Maciej Zakrzewicz Institute of Computing
More informationDiscovery of Multi Dimensional Quantitative Closed Association Rules by Attributes Range Method
Discovery of Multi Dimensional Quantitative Closed Association Rules by Attributes Range Method Preetham Kumar, Ananthanarayana V S Abstract In this paper we propose a novel algorithm for discovering multi
More informationIncremental Mining of Frequent Patterns Without Candidate Generation or Support Constraint
Incremental Mining of Frequent Patterns Without Candidate Generation or Support Constraint William Cheung and Osmar R. Zaïane University of Alberta, Edmonton, Canada {wcheung, zaiane}@cs.ualberta.ca Abstract
More informationFast Algorithm for Mining Association Rules
Fast Algorithm for Mining Association Rules M.H.Margahny and A.A.Mitwaly Dept. of Computer Science, Faculty of Computers and Information, Assuit University, Egypt, Email: marghny@acc.aun.edu.eg. Abstract
More informationA Technical Analysis of Market Basket by using Association Rule Mining and Apriori Algorithm
A Technical Analysis of Market Basket by using Association Rule Mining and Apriori Algorithm S.Pradeepkumar*, Mrs.C.Grace Padma** M.Phil Research Scholar, Department of Computer Science, RVS College of
More informationMining Frequent Itemsets Along with Rare Itemsets Based on Categorical Multiple Minimum Support
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 18, Issue 6, Ver. IV (Nov.-Dec. 2016), PP 109-114 www.iosrjournals.org Mining Frequent Itemsets Along with Rare
More informationEfficient Updating of Discovered Patterns for Text Mining: A Survey
Efficient Updating of Discovered Patterns for Text Mining: A Survey Anisha Radhakrishnan Post Graduate Student Karunya university Coimbatore, India Mathew Kurian Assistant Professor Karunya University
More informationFREQUENT PATTERN MINING IN BIG DATA USING MAVEN PLUGIN. School of Computing, SASTRA University, Thanjavur , India
Volume 115 No. 7 2017, 105-110 ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version) url: http://www.ijpam.eu ijpam.eu FREQUENT PATTERN MINING IN BIG DATA USING MAVEN PLUGIN Balaji.N 1,
More informationSTUDY ON FREQUENT PATTEREN GROWTH ALGORITHM WITHOUT CANDIDATE KEY GENERATION IN DATABASES
STUDY ON FREQUENT PATTEREN GROWTH ALGORITHM WITHOUT CANDIDATE KEY GENERATION IN DATABASES Prof. Ambarish S. Durani 1 and Mrs. Rashmi B. Sune 2 1 Assistant Professor, Datta Meghe Institute of Engineering,
More informationA Fast Algorithm for Mining Rare Itemsets
2009 Ninth International Conference on Intelligent Systems Design and Applications A Fast Algorithm for Mining Rare Itemsets Luigi Troiano University of Sannio Department of Engineering 82100 Benevento,
More informationInterestingness Measurements
Interestingness Measurements Objective measures Two popular measurements: support and confidence Subjective measures [Silberschatz & Tuzhilin, KDD95] A rule (pattern) is interesting if it is unexpected
More informationMining Generalised Emerging Patterns
Mining Generalised Emerging Patterns Xiaoyuan Qian, James Bailey, Christopher Leckie Department of Computer Science and Software Engineering University of Melbourne, Australia {jbailey, caleckie}@csse.unimelb.edu.au
More informationParallel Mining of Maximal Frequent Itemsets in PC Clusters
Proceedings of the International MultiConference of Engineers and Computer Scientists 28 Vol I IMECS 28, 19-21 March, 28, Hong Kong Parallel Mining of Maximal Frequent Itemsets in PC Clusters Vong Chan
More informationAN ENHANCED SEMI-APRIORI ALGORITHM FOR MINING ASSOCIATION RULES
AN ENHANCED SEMI-APRIORI ALGORITHM FOR MINING ASSOCIATION RULES 1 SALLAM OSMAN FAGEERI 2 ROHIZA AHMAD, 3 BAHARUM B. BAHARUDIN 1, 2, 3 Department of Computer and Information Sciences Universiti Teknologi
More informationPattern Mining. Knowledge Discovery and Data Mining 1. Roman Kern KTI, TU Graz. Roman Kern (KTI, TU Graz) Pattern Mining / 42
Pattern Mining Knowledge Discovery and Data Mining 1 Roman Kern KTI, TU Graz 2016-01-14 Roman Kern (KTI, TU Graz) Pattern Mining 2016-01-14 1 / 42 Outline 1 Introduction 2 Apriori Algorithm 3 FP-Growth
More informationAN IMPROVISED FREQUENT PATTERN TREE BASED ASSOCIATION RULE MINING TECHNIQUE WITH MINING FREQUENT ITEM SETS ALGORITHM AND A MODIFIED HEADER TABLE
AN IMPROVISED FREQUENT PATTERN TREE BASED ASSOCIATION RULE MINING TECHNIQUE WITH MINING FREQUENT ITEM SETS ALGORITHM AND A MODIFIED HEADER TABLE Vandit Agarwal 1, Mandhani Kushal 2 and Preetham Kumar 3
More information