CS 6093 Lecture 7 Spring 2011 Basic Data Mining. Cong Yu 03/21/2011

1 CS 6093 Lecture 7 Spring 2011 Basic Data Mining Cong Yu 03/21/2011

2 Announcements No regular office hour next Monday (March 28th). Office hours next week will be Tuesday through Thursday, by appointment only. I will be out of town from April 1st to April 16th with sporadic access; please plan accordingly if you need to discuss your projects with me. Midterm reports will be graded soon; we are aiming for the end of the week. The quiz next week will be based on today's lecture, and it will be closed notes. Any questions on the projects?

3 Today's Outline Overview of Data Mining (What, Why, How); Classic Studies (Association Rule Mining, Data Cube Analysis); Rule Interestingness

4 What is Data Mining? You are familiar with Data/Information Retrieval: querying the database using SQL, searching the Web via keyword queries. Data mining is NOT data retrieval. Data mining = discovering hidden and useful knowledge from large amounts of data. Hidden: you can't easily write a query to fetch what you want because you don't even know what you want. Interesting: not every piece of hidden knowledge is useful; trivial discoveries can overwhelm the user. Large amounts: simple techniques are no longer sufficient; we need efficient and scalable techniques

5 Examples of Data Mining Results Rules and Patterns: Customers who buy Harry Potter books often buy Twilight books. Users in NYC tend to search for expensive restaurants on Valentine's Day. Clusters and Classification: TV viewers who watch 2+ hours of cable news every day can be divided into three groups: CNN, MSNBC, and Fox. Given a viewer, predict which group s/he falls into (for advertising purposes)

6 Why Study Data Mining? Lots of Data Amazon, Walmart, Citibank, etc. Opportunities for: Purchase recommendation Credit card fraud detection Challenging for: Hidden information detection beyond human eyes Efficiency and scalability

7 How Data Mining Became a Field Started within the Database Systems community: OLAP instead of OLTP. OLTP: online transactional processing (ATM transactions, shopping transactions, etc.). OLAP: online analytical processing (business intelligence, business reporting). Heavily influenced by Machine Learning and Statistics. More recently: Information Retrieval, Recommendation Systems

8 How is Data Mining Done? Descriptive (closer to database): Classic topics: association rule mining, frequent pattern mining, data cube analysis. Clustering: group similar data points and separate dissimilar data points. Anomaly detection: detect data points that significantly deviate from others. Predictive (closer to statistics and machine learning): Classification: predict which label to assign to a data point based on its features. Regression analysis: predict the value of a dependent variable (e.g., sales) based on the underlying variables (e.g., time and location)

9 Today's Outline Overview of Data Mining (What, Why, How); Classic Studies (Association Rule Mining, Data Cube Analysis); Rule Interestingness

10 Association Rule Mining Classic Example: {diaper, milk} → beer. Intuitive definition: the RHS often appears if the LHS appears, i.e., LHS implies RHS, but NOT causal! Applications: Inventory management: determine which items to put together on the shelf to increase sales. Shopping recommendation: you bought this book, you might want to check out these other books. Many others

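11 Formal Definition I = set of items and D = database of transactions, where each transaction is a subset of I. A rule $X \to Y$ (with $X, Y \subseteq I$ and $X \cap Y = \emptyset$) is significant if both of the following hold. Support: $\mathrm{supp}(X \cup Y) = |\{t \in D : X \cup Y \subseteq t\}| / |D| \ge \mathit{minsup}$. Confidence: $\mathrm{conf}(X \to Y) = \mathrm{supp}(X \cup Y) / \mathrm{supp}(X) \ge \mathit{minconf}$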

12 Example
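
The table for this example did not survive transcription. As a stand-in, here is a minimal sketch of the two definitions. It assumes a five-transaction market-basket database (Bread, Milk, Diaper, Beer, Eggs, Coke) that is consistent with the item names used on the later Apriori slide. The data and the function names `support` and `confidence` are illustrative assumptions, not taken from the lecture.

```python
# Hypothetical toy database: each transaction is a set of items.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def support(itemset, db):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in db) / len(db)

def confidence(lhs, rhs, db):
    """support(lhs U rhs) / support(lhs); assumes lhs occurs at least once."""
    return support(set(lhs) | set(rhs), db) / support(lhs, db)

# Rule {Diaper, Milk} -> {Beer}
rule_support = support({"Diaper", "Milk", "Beer"}, transactions)
rule_confidence = confidence({"Diaper", "Milk"}, {"Beer"}, transactions)
print(rule_support, rule_confidence)
```

On this toy data the rule {Diaper, Milk} → {Beer} is contained in 2 of 5 transactions, while its left-hand side appears in 3 of 5, so the confidence is 2/3.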

13 Brute Force Algorithm Compute the support and confidence of all possible rules. The number of rules is exponential in the number of unique items in the database. Computationally infeasible
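
To make "exponential" concrete, the number of candidate rules can be counted directly. The derivation below is a standard counting argument and not part of the slides: choose a non-empty left-hand side of size $k$, then a non-empty right-hand side from the remaining $d - k$ items.

```latex
R(d) \;=\; \sum_{k=1}^{d-1} \binom{d}{k} \sum_{j=1}^{d-k} \binom{d-k}{j}
     \;=\; \sum_{k=1}^{d-1} \binom{d}{k} \left( 2^{\,d-k} - 1 \right)
     \;=\; 3^{d} - 2^{\,d+1} + 1
```

For only $d = 6$ items this already gives $3^6 - 2^7 + 1 = 602$ possible rules.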

14 The Space of All Itemsets ($2^d$)

15 Apriori Algorithm First reading assignment: Fast Algorithms for Mining Association Rules, by Rakesh Agrawal and Ramakrishnan Srikant, VLDB 1994, Santiago, Chile. Simple idea, but the impact is very high. One of the classic papers on the topic, cited numerous times (~8000 according to the latest Google Scholar results)

16 Observation For a rule $X \to Y$ to be significant, $(X \cup Y)$ must be frequent. Once a frequent itemset is discovered, rules can be easily generated from the itemset. Therefore, we have two subproblems: frequent itemset generation from the database, and association rule generation from frequent itemsets

17 Apriori Principle If an itemset S is not frequent, then no superset T of S is frequent, because every transaction that contains T also contains S, so $\mathrm{supp}(T) \le \mathrm{supp}(S)$. Therefore, an itemset can be pruned away from consideration if any of its subsets is found to be infrequent

18 Applying Apriori Principle [Figure: itemset lattice traversed in processing order; once an itemset is found infrequent, all of its supersets can be pruned]

19 Example Minimum Support = 3. Items (1-itemsets). Pairs (2-itemsets): combinations among Bread, Milk, Beer, Diaper only. Triplets (3-itemsets): ignore any 3-itemsets containing {Coke}, {Eggs}, {Bread, Beer}, or {Milk, Beer}

20 Candidate Itemsets Generation Start with the frequent 1-itemsets. At iteration k, generate candidate (k+1)-itemsets: items in all candidate itemsets are maintained in the same order (i.e., by item_id). Merge two itemsets $s_1$ and $s_2$ if and only if they share all items except the last item and $s_1$.last_item_id < $s_2$.last_item_id. Correctness Proof
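
The merge rule above can be sketched as follows. This is an illustrative sketch only: the function name `generate_candidates` and the representation of itemsets as sorted tuples are assumptions made for the example, not the lecture's notation.

```python
def generate_candidates(frequent_k):
    """Merge frequent k-itemsets that agree on everything but the last item.

    Each itemset is normalized to a sorted tuple so that "last item" is
    well defined; the result is a list of candidate (k+1)-itemsets.
    """
    frequent = sorted({tuple(sorted(s)) for s in frequent_k})
    candidates = []
    for i, s1 in enumerate(frequent):
        for s2 in frequent[i + 1:]:
            if s1[:-1] != s2[:-1]:
                # Itemsets with a common prefix are contiguous in sorted
                # order, so no later itemset can share s1's prefix.
                break
            # s1 precedes s2 with the same prefix, hence s1[-1] < s2[-1].
            candidates.append(s1 + (s2[-1],))
    return candidates

pairs = generate_candidates([("a",), ("b",), ("c",)])
triples = generate_candidates([("a", "b"), ("a", "c"), ("b", "c")])
print(pairs, triples)
```

Because each candidate is produced from exactly one ordered pair of parents, no candidate is generated twice.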

21 Two Technical Challenges Efficiently eliminating candidates containing at least one infrequent subset Efficiently determining the support of the candidates Both leverage the hash tree structure

22 Hash Tree Construction For k-itemsets, construct a hash tree of depth k. At the i-th level, hash on the i-th item and follow the branch. If there are few itemsets remaining, stop and store the itemsets at the node. [Figure: hash function with three branches: 1,4,7 / 2,5,8 / 3,6,]

23 Match a 5-itemset Against a 3-itemset Hash Tree [Figure: hash tree of 15 candidate 3-itemsets, using the same three-branch hash function] Matched transaction against 3 out of 15 candidates. Compared against 9 out of 15 candidates

24 Using the Hash Tree Eliminating candidate k-itemsets containing at least one infrequent subset Construct hash tree H of frequent (k-1)-itemsets Check each candidate k-itemset against H Prune away a candidate if at least one branch leads to an empty match Determining the support of the candidate k-itemsets Construct hash tree H of candidate k-itemsets Check each transaction against H Increment count for each matching candidate

25 The Algorithm Apriori
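
The pseudocode on this slide did not survive transcription. The following is a minimal sketch of the level-wise loop described on the preceding slides (join, prune, count). It is not the paper's pseudocode: it counts support with a plain linear scan instead of a hash tree, and the toy database is an assumption chosen to match the item names on the earlier example slide.

```python
from itertools import combinations

def apriori(db, min_count):
    """Return {itemset as sorted tuple: support count} for frequent itemsets."""
    db = [frozenset(t) for t in db]
    # Level 1: count single items.
    counts = {}
    for t in db:
        for item in t:
            counts[(item,)] = counts.get((item,), 0) + 1
    level = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(level)
    while level:
        prev = sorted(level)
        # Join step: merge itemsets that share all items but the last.
        candidates = [s1 + (s2[-1],)
                      for i, s1 in enumerate(prev)
                      for s2 in prev[i + 1:]
                      if s1[:-1] == s2[:-1]]
        # Prune step: every k-subset of a candidate must itself be frequent.
        candidates = [c for c in candidates
                      if all(sub in level
                             for sub in combinations(c, len(c) - 1))]
        # Count step: linear scan (the slides use a hash tree here).
        level = {}
        for c in candidates:
            n = sum(1 for t in db if t.issuperset(c))
            if n >= min_count:
                level[c] = n
        frequent.update(level)
    return frequent

# Hypothetical toy database, minimum support count = 3.
db = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]
frequent_itemsets = apriori(db, 3)
for itemset, count in sorted(frequent_itemsets.items()):
    print(itemset, count)
```

On this data the loop stops after the third level: the only candidate triple, (Bread, Diaper, Milk), appears in just two transactions.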

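26 Next: Association Rules Generation Naïve algorithm: for each frequent itemset $S = \{i_1, i_2, \ldots, i_m\}$ and each non-empty proper subset $T$ of $S$: compute the confidence as $\mathrm{supp}(S) / \mathrm{supp}(S - T)$ and output $(S - T) \to T$ if the confidence is high enough. The same Apriori Principle can be applied! The confidence of $(S - T_1) \to T_1$ is never lower than that of $(S - T_2) \to T_2$ if $T_1$ is a subset of $T_2$, because $S - T_2 \subseteq S - T_1$ implies $\mathrm{supp}(S - T_1) \le \mathrm{supp}(S - T_2)$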
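
A minimal sketch of the naïve rule generation step, assuming frequent itemsets are already available as a dictionary from sorted tuples to support counts. The function name `generate_rules` and the small input dictionary are hypothetical; the confidence-based pruning mentioned above is deliberately left out to keep the sketch short.

```python
from itertools import combinations

def generate_rules(frequent, min_conf):
    """Return (lhs, rhs, confidence) triples for all confident rules.

    `frequent` maps sorted tuples to support counts and must contain every
    subset of every itemset it contains (true for any Apriori output).
    """
    rules = []
    for itemset, count in frequent.items():
        # rhs ranges over all non-empty proper subsets T of S.
        for r in range(1, len(itemset)):
            for rhs in combinations(itemset, r):
                lhs = tuple(i for i in itemset if i not in rhs)
                conf = count / frequent[lhs]  # supp(S) / supp(S - T)
                if conf >= min_conf:
                    rules.append((lhs, rhs, conf))
    return rules

# Hypothetical support counts.
frequent = {("Beer",): 3, ("Diaper",): 4, ("Beer", "Diaper"): 3}
rules = generate_rules(frequent, 0.8)
print(rules)
```

Here {Beer} → {Diaper} has confidence 3/3 and is kept, while {Diaper} → {Beer} has confidence 3/4 and falls below the 0.8 threshold.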

27 Property: Closed-ness An itemset S is closed if none of its proper supersets has the same support. {Beer} is not closed because {Beer, Diaper} has the same support. {Beer, Diaper} is closed

28 Property: Maximality An itemset S is maximally frequent if none of its supersets is frequent

29 A Comparison An itemset S can be closed, but not maximally frequent The superset T of S may have lower support, but still above the minimum support threshold
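
The two properties can be checked mechanically. The sketch below assumes the frequent itemsets and their support counts are given as a dictionary; the counts shown are hypothetical but consistent with the {Beer} / {Beer, Diaper} remark on the closed-ness slide. Only frequent supersets need to be inspected, since a superset with the same support as a frequent itemset is itself frequent.

```python
def closed_and_maximal(frequent):
    """Split frequent itemsets into closed ones and maximally frequent ones.

    `frequent` maps itemsets (tuples) to support counts.
    """
    closed, maximal = [], []
    for s, count in frequent.items():
        supersets = [t for t in frequent if set(s) < set(t)]
        # Closed: no proper superset has the same support.
        if all(frequent[t] < count for t in supersets):
            closed.append(s)
        # Maximal: no proper superset is frequent at all.
        if not supersets:
            maximal.append(s)
    return closed, maximal

# Hypothetical support counts with minimum support 3.
frequent = {
    ("Beer",): 3, ("Bread",): 4, ("Diaper",): 4, ("Milk",): 4,
    ("Beer", "Diaper"): 3, ("Bread", "Diaper"): 3,
    ("Bread", "Milk"): 3, ("Diaper", "Milk"): 3,
}
closed, maximal = closed_and_maximal(frequent)
print("closed:", closed)
print("maximal:", maximal)
```

In this example {Bread} is closed but not maximal: its supersets {Bread, Milk} and {Bread, Diaper} have lower support (3 versus 4) but are still frequent.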

30 Why Study Those Properties Closed Frequent Itemsets are interesting Their support is a proper aggregation of all their supersets instead of coming from just one Maximally Frequent Itemsets are the most interesting They are long/large enough to convey non-trivial knowledge

31 Mining Maximally Frequent Itemsets Second Reading Assignment Efficiently Mining Long Patterns from Databases, by Roberto Bayardo, SIGMOD 1998, Seattle, Washington

32 Inefficiency in Apriori Algorithm Processing order

33 Goal Discover maximally frequent itemsets before examining their subsets. How? Given a candidate itemset, check not only its own support, but also the support of the largest itemset that can be derived from it. If that largest itemset is frequent, we can ignore all of its subsets

34 Max-Miner: The Key Concept Candidate group (g) Head: H(g), the current candidate frequent itemset being checked Tail: T(g), all subsequent items that can be added to the group H(g) + T(g): the largest itemset that can be derived from H(g) Prune based not only on the support of H(g), as in Apriori, but also on the support of H(g) + T(g)

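35 Max-Miner Processing Tree (Breadth First) [Figure: processing tree over items a, b, c, d, with candidate groups written as head | tail. Level 1: a | bcd, b | cd, c | d, d. Level 2: ab | cd, ac | d, ad, bc | d, bd, cd. Level 3: abc | d, abd, acd, bcd. Level 4: abcd] Frequent itemsets: {a}, {b}, {c}, {d}, {bc}, {bd}, {cd}, {abd}, {bcd}. Annotations: {bcd} is frequent; {cd} is a subset of {bcd}; {abc} is infrequent; {ac} is infrequent. Pruned candidates: Apriori: 4, Max-Miner: 8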

36 Key Optimization Goal: encounter maximally frequent itemsets as often as possible so the pruning can achieve maximum potential Trick: reorder items according to their support Place most frequent items last because they appear in more candidate groups E.g., {d} in the previous example Often results in orders of magnitude increase in performance! (Section 6.3)

37 The Algorithm Max-Miner
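
The pseudocode on this slide did not survive transcription. Below is a simplified sketch of the head/tail idea from the preceding slides, not the algorithm from the paper: it keeps the breadth-first candidate groups, the H(g) + T(g) look-ahead, and the support-based item ordering, but omits the paper's further optimizations and counts support by rescanning the database. The function name `max_miner` and the toy data are assumptions.

```python
def max_miner(db, min_count):
    """Return the maximal frequent itemsets of `db` as a list of frozensets."""
    db = [frozenset(t) for t in db]

    def count(s):
        return sum(1 for t in db if s <= t)

    maximal = []

    def add_maximal(s):
        # Keep `maximal` free of itemsets that are subsets of other entries.
        if not any(s <= m for m in maximal):
            maximal[:] = [m for m in maximal if not m <= s]
            maximal.append(s)

    items = sorted(set().union(*db))
    item_count = {i: count(frozenset([i])) for i in items}
    # Order frequent items by increasing support (most frequent last).
    ordered = sorted((i for i in items if item_count[i] >= min_count),
                     key=lambda i: item_count[i])
    groups = [(frozenset([i]), ordered[k + 1:]) for k, i in enumerate(ordered)]
    while groups:
        next_groups = []
        for head, tail in groups:
            full = head | frozenset(tail)  # H(g) + T(g)
            if any(full <= m for m in maximal):
                continue  # everything derivable from g is already covered
            if count(full) >= min_count:
                add_maximal(full)  # look-ahead succeeded: skip all subsets
                continue
            # Keep only tail items that extend the head to a frequent set.
            tail = [i for i in tail if count(head | {i}) >= min_count]
            if not tail:
                add_maximal(head)
            for k, i in enumerate(tail):
                next_groups.append((head | {i}, tail[k + 1:]))
        groups = next_groups
    return maximal

# Hypothetical toy database, minimum support count = 3.
db = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]
maximal_itemsets = max_miner(db, 3)
print(maximal_itemsets)
```

On this data the group with head {Diaper} and tail [Milk] succeeds on its look-ahead immediately, so the later group with head {Milk} is skipped without any further counting.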

38 Today's Outline Overview of Data Mining (What, Why, How); Classic Studies (Association Rule Mining, Data Cube Analysis); Rule Interestingness

39 Data Warehouse / OLAP A decision support system that enables users to perform aggregate analysis of historical data to discover hidden patterns and trends. Separated from the operational system (OLTP): OLTP records the day-to-day transactions and activities. Gartner Dataquest estimates the size of the data warehouse market at $30B and growing. That's even before businesses start analyzing Web data

40 Data Cube Analysis [Figure: 3-D data cube with dimensions Product (TV, PC, DVD), Date (1Qtr, 2Qtr, 3Qtr, 4Qtr), and Country (U.S.A, Canada, Mexico), together with its aggregations: Group By product, date; Group By country, date; Group By country, product; Group By country; Group By date; Group By product; Group By none (ALL)]

41 OLAP Operators Roll-up Go up the hierarchy or remove dimension from the group by Drill-down Move down the hierarchy or add dimension to the group by Pivot Pick k dimensions, group by all other dimensions Slice and Dice Point/Range selection over k dimensions

42 Cube Operator An operator designed specifically to compute all those aggregations at once. Not a fundamental operator like select or project, but a very useful one. Challenge: how to cube efficiently? [Figure: Cube By product, date, country expands into Group By product, date; Group By country, date; Group By country, product; Group By country; Group By date; Group By product; Group By none]
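
What the cube operator computes can be sketched naively: one aggregation per subset of the dimensions, $2^n$ group-bys for $n$ dimensions. The sketch below is an illustration of the result, not an efficient cubing algorithm; the function name `cube`, the row layout, and the sales figures are all made up for the example.

```python
from collections import defaultdict
from itertools import combinations

def cube(rows, dims, measure):
    """Sum `measure` for every subset of `dims`.

    Returns {tuple of grouped dimensions: {tuple of values: total}}.
    """
    result = {}
    for k in range(len(dims) + 1):
        for group_dims in combinations(dims, k):
            totals = defaultdict(float)
            for row in rows:
                key = tuple(row[d] for d in group_dims)
                totals[key] += row[measure]
            result[group_dims] = dict(totals)
    return result

# Hypothetical sales rows.
rows = [
    {"product": "TV", "country": "U.S.A", "date": "1Qtr", "sales": 10.0},
    {"product": "TV", "country": "Canada", "date": "1Qtr", "sales": 5.0},
    {"product": "PC", "country": "U.S.A", "date": "2Qtr", "sales": 7.0},
]
result = cube(rows, ["product", "country", "date"], "sales")
print(result[()])            # Group By none (ALL)
print(result[("product",)])  # Group By product
```

Each group-by rescans all rows here, which is exactly the inefficiency that dedicated cube algorithms try to avoid.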

43 10 min Break

44 Efficient Cube Analysis Bottom-Up Computation of Sparse and Iceberg CUBEs, by Kevin Beyer and Raghu Ramakrishnan, SIGMOD 1999, Philadelphia, PA

45 Observation Not all aggregations are interesting. Some aggregations may be computed over zero or very few actual tuples. E.g., if there are few or no sales of DVDs in the 1Qtr, then computing the aggregate over country for (DVD, 1Qtr) does not provide much information. Concept: Iceberg Cube: Cube by $d_1, d_2, \ldots$ Having count(*) > s. Familiar? Yes, it is basically support!

46 Key Concept Measure: the aggregate value computed from tuples within the group-by, e.g., total sales, number of transactions. Two very important properties: let $S_1, S_2, \ldots, S_m$ be a complete set of disjoint subsets (a partition) of $T$. A measure $F$ is algebraic if $F(T) = H(G(S_1), G(S_2), \ldots, G(S_m))$, where function $G$ returns an $M$-tuple and $M$ is constant regardless of $S_i$ and $T$. A measure is monotonic (increasing or decreasing) if it is always true that $F(T) \ge F(S_i)$ or always true that $F(T) \le F(S_i)$
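
As an illustration of the algebraic property, average can be computed from fixed-size partial results: $G$ returns the 2-tuple (sum, count) for each subset and $H$ combines them. The names `partial` and `combine` are hypothetical stand-ins for $G$ and $H$.

```python
def partial(values):
    """G: summarize one subset S_i as a fixed-size tuple (sum, count)."""
    return (sum(values), len(values))

def combine(partials):
    """H: compute the average of T from the partial results of its subsets."""
    total = sum(p[0] for p in partials)
    n = sum(p[1] for p in partials)
    return total / n

# T = [1, 2, 3, 4, 5] split into two disjoint subsets.
average = combine([partial([1, 2, 3]), partial([4, 5])])
print(average)  # 3.0
```

The same example shows why average is not monotonic: the subset [4, 5] has average 4.5 and the subset [1, 2, 3] has average 2.0, so the average of T is neither always above nor always below the averages of its subsets.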

47 Example Measures (Algebraic? Monotonic?) Summation: Yes and Yes. Max and Min: Yes and Yes. Average: Yes and No. Count Unique: No and Yes. Measures that are being handled

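48 Processing Tree (Depth First) [Figure: BUC processing tree over dimensions a, b, c, d, from "all" down through a, b, c, d; ab, ac, ad, bc, bd, cd; abc, abd, acd, bcd; abcd. Dimension a is partitioned into a1, a2, ..., ak, and partition a1 is further split into a1b1, a1b2, ..., a1bk] Annotations: {b_k} has low support; {a_i c_j} has low support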

49 Sorting and Partitioning BUC stops here early when support is below threshold

50 More Optimizations Three sorting algorithms are leveraged at different times. Counting Sort: when there are more tuples than the dimension cardinality. Quick Sort: when the number of tuples is much smaller than the dimension cardinality. Insertion Sort: when there are very few tuples. Dimension reordering: Decreasing cardinality: higher cardinality → smaller partitions → BUC can stop early. Increasing skew: less skew → higher effective cardinality

51 High Level Algorithm
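
The pseudocode on this slide did not survive transcription. The following is a simplified sketch of the bottom-up recursion described on the preceding slides, restricted to the count(*) measure. It partitions with a dictionary rather than the sorting strategies listed above, and it uses count >= min_count as the iceberg condition; the function name `buc`, the row layout, and the data are assumptions.

```python
def buc(rows, dims, min_count, prefix=(), out=None):
    """Compute an iceberg cube of counts.

    rows:   list of tuples
    dims:   column indices still available for expansion
    prefix: tuple of (column index, value) pairs fixed so far
    Returns {prefix: count} for every group that was expanded.
    """
    if out is None:
        out = {}
    out[prefix] = len(rows)  # count(*) for the current group
    for pos, d in enumerate(dims):
        # Partition the current rows on dimension d.
        partitions = {}
        for row in rows:
            partitions.setdefault(row[d], []).append(row)
        for value, part in sorted(partitions.items()):
            # Iceberg pruning: a partition below the threshold cannot
            # produce any qualifying finer group, so stop early.
            if len(part) >= min_count:
                buc(part, dims[pos + 1:], min_count,
                    prefix + ((d, value),), out)
    return out

# Hypothetical (product, country) rows, minimum count = 2.
rows = [("TV", "U.S.A"), ("TV", "U.S.A"), ("TV", "Canada"), ("PC", "U.S.A")]
iceberg = buc(rows, [0, 1], 2)
for group, n in iceberg.items():
    print(group, n)
```

The partition for PC contains a single row, so the recursion never looks at (PC, U.S.A); this early stop is the source of BUC's savings on sparse cubes.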

52 Drawbacks of BUC If the Cube is dense, BUC is not efficient Most of the bottom partitions will satisfy the minimum support BUC also does not leverage the algebraic property Supersets are always processed before the subsets BUC also does not deal with non-monotonic measures as the pruning condition

53 So Far: Basic Concepts Association rules and frequent itemsets in a transaction database Data warehouse and data cube analysis Algorithms Apriori / Hash tree Max-Miner / Candidate reordering BUC / Dynamic selection of sorting algorithms Common theme: Good idea supported by good engineering

54 Today's Outline Overview of Data Mining (What, Why, How); Classic Studies (Association Rule Mining, Data Cube Analysis); Rule Interestingness

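55 Is Confidence a Good Measure? [Table: contingency table of Tea / No_Tea against Coffee / No_Coffee] Association Rule: Tea → Coffee. Confidence = #(Tea, Coffee) / #(Tea) = 0.75. But with P(Tea, Coffee) = 0.15, P(Tea) = 0.2, and P(Coffee) = 0.9, the confidence of No_Tea → Coffee = 0.75 / 0.80 ≈ 0.94 !! Although the confidence is high, the rule is misleading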

56 Alternative Measures: Lift [Table: contingency table of Tea / No_Tea against Coffee / No_Coffee] Association Rule: Tea → Coffee. Lift = P(Tea, Coffee) / (P(Tea) P(Coffee)) = 0.15 / 0.18 ≈ 0.83 < 1: negative correlation!

57 Alternative Measures: Leverage (P-S) [Table: contingency table of Tea / No_Tea against Coffee / No_Coffee] Association Rule: Tea → Coffee. Leverage = P(Tea, Coffee) − P(Tea) P(Coffee) = 0.15 − 0.18 = −0.03 < 0: less support than expected!
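
The three measures for Tea → Coffee can be computed side by side. The probabilities below are assumptions reconstructed from the figures that survive on the slides (P(Tea, Coffee) = 0.15, a confidence of 0.75, and P(Tea) P(Coffee) = 0.18, which together imply P(Tea) = 0.2 and P(Coffee) = 0.9); the function names are illustrative.

```python
def confidence(p_ab, p_a):
    """P(B | A) for the rule A -> B."""
    return p_ab / p_a

def lift(p_ab, p_a, p_b):
    """Ratio of observed joint probability to that expected under independence."""
    return p_ab / (p_a * p_b)

def leverage(p_ab, p_a, p_b):
    """Difference between observed and expected joint probability."""
    return p_ab - p_a * p_b

# Assumed probabilities for A = Tea, B = Coffee.
p_tea, p_coffee, p_both = 0.2, 0.9, 0.15

conf_tea = confidence(p_both, p_tea)
conf_no_tea = confidence(p_coffee - p_both, 1 - p_tea)
print(conf_tea, conf_no_tea)
print(lift(p_both, p_tea, p_coffee), leverage(p_both, p_tea, p_coffee))
```

Not drinking tea is a better predictor of coffee than drinking tea, which is why a lift below 1 and a negative leverage both flag the rule despite its high confidence.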

58 Lift/Leverage is not Perfect Either Lift is too sensitive to support. Leverage is insensitive to support. [Figure: two 2×2 contingency tables over X and Y] First table: Lift = 0.1 / ((0.1)(0.1)) = 10. Second table: Lift = 0.9 / ((0.9)(0.9)) ≈ 1.11

59 More Measures

60 Property of Rules Three properties of good measures: M(A → B) = 0 or 1 if A and B are statistically independent. M(A → B) increases with P(AB) if P(A) and P(B) remain constant. M(A → B) decreases with P(A) if P(AB) and P(B) remain constant. Symmetric and asymmetric: Symmetric: Lift, Leverage. Asymmetric: Confidence. Total/partial ordering of rules

61 Type-1 Error: False Positives In exploratory analysis: when the amount of data is large enough, some significant rules will appear purely by chance! Similar to the birthday paradox In a room with 23 people, the probability of two people with the same birthday is greater than 50% Statistics to the rescue!
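
The birthday figure quoted above is easy to verify. The sketch assumes 365 equally likely birthdays and ignores leap years; the function name is illustrative.

```python
def birthday_collision_probability(n, days=365):
    """Probability that at least two of n people share a birthday."""
    p_all_distinct = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct

p22 = birthday_collision_probability(22)
p23 = birthday_collision_probability(23)
print(round(p22, 4), round(p23, 4))
```

The analogy to rule mining: with 23 people there are already 253 pairs to compare, just as a large rule space offers many chances for a coincidental "significant" rule.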

62 Holdout Evaluation

63 Subjective Interestingness Applicable especially to cube analysis. Sales of (ipod, NYC) is $200M. Is it interesting? Unexpectedness: interesting if the average sales of (ipod, City) is $20M. Actionable: interesting if knowing the sales can trigger some actions, such as adjusting logistics and advertising

64 Identifying Best Rules Mining the Most Interesting Rules, Roberto Bayardo and Rakesh Agrawal, SIGKDD Optimized rule mining problem in the presence of partial ordering

65 Optimized Rule Mining If there is a total order $\le_t$ over all the rules, identify the optimal rule A such that: A satisfies the conditions, and there is no rule B s.t. B satisfies the conditions and $A <_t B$. E.g., rules can be ordered according to their confidence, and we pick the rule with the highest confidence. However, often only a partial order is available! Selecting one optimal rule is then not reasonable

66 Optimized Rule Mining with Partial Order Given a partial order $\le_p$ over the rules, identify a set of rules S, s.t. each rule $s \in S$ is optimal, and there is a rule for each equivalence class in the partial order. If two rules can be compared according to the partial order, they are in the same equivalence class. Only one rule from each equivalence class can be selected. E.g., if $r_1 >_p r_2$ and $r_1 >_p r_3$, while $r_2$ and $r_3$ are not comparable, then only $r_1$ can be selected

67 SC-Optimal Rule Mining Partial order $\le_{sc}$: defined based on support and confidence. A rule $r_2$ is greater than another rule $r_1$ if and only if $r_2$ has greater or equal support and greater or equal confidence. Partial order $\le_{s\neg c}$: defined based on support and confidence. A rule $r_2$ is greater than another rule $r_1$ if and only if $r_2$ has greater or equal support and lesser or equal confidence
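
One way to picture optimality under the first partial order is as a dominance filter: a rule survives if no other rule is at least as good on both support and confidence and strictly better on one of them. This is an illustrative sketch of that reading, not the algorithm from the paper; the function name `sc_optimal` and the rule values are hypothetical.

```python
def sc_optimal(rules):
    """Return the rules that no other rule dominates on (support, confidence).

    rules: list of (name, support, confidence) triples.
    """
    def dominated(r):
        return any(o[1] >= r[1] and o[2] >= r[2]
                   and (o[1] > r[1] or o[2] > r[2])
                   for o in rules)
    return [r for r in rules if not dominated(r)]

# Hypothetical rules.
rules = [
    ("r1", 0.5, 0.9),
    ("r2", 0.4, 0.8),  # dominated by r1 on both support and confidence
    ("r3", 0.6, 0.7),  # more support than r1 but less confidence
]
optimal = sc_optimal(rules)
print(optimal)
```

Here r1 and r3 are not comparable, so both are kept. Swapping the confidence comparison to "lesser or equal" would give the corresponding filter for the second partial order.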

68 Theoretical Implication Many total orders are implied by the two SC partial orders: $r_1 <_p r_2 \Rightarrow r_1 <_t r_2$ and $r_1 =_p r_2 \Rightarrow r_1 =_t r_2$. Total orders implied by $\le_{sc}$: monotonically increasing with support when confidence remains constant; monotonically increasing with confidence when support remains constant. Exactly two of the properties of good measures!

69 PC-Optimal Rule Mining Partial order $\le_{pc}$: defined based on population, i.e., the set of records characterized by the rule. A rule $r_2$ is greater than another rule $r_1$ if and only if the population of $r_2$ is a superset of that of $r_1$ and $r_2$ has greater or equal confidence. The PC partial order contains many more equivalence classes. Two rules that are PC comparable must be SC comparable. Two rules may be SC comparable, but not PC comparable

70 Summary Association Rule Mining: Apriori & Max-Miner. Data Cube Analysis: BUC. Rule Interestingness: finding optimal rules according to a partial ordering. Next Week: Advanced data mining topics

71 Questions?

Frequent Pattern Mining. Based on: Introduction to Data Mining by Tan, Steinbach, Kumar

Frequent Pattern Mining. Based on: Introduction to Data Mining by Tan, Steinbach, Kumar Frequent Pattern Mining Based on: Introduction to Data Mining by Tan, Steinbach, Kumar Item sets A New Type of Data Some notation: All possible items: Database: T is a bag of transactions Transaction transaction

More information

Apriori Algorithm. 1 Bread, Milk 2 Bread, Diaper, Beer, Eggs 3 Milk, Diaper, Beer, Coke 4 Bread, Milk, Diaper, Beer 5 Bread, Milk, Diaper, Coke

Apriori Algorithm. 1 Bread, Milk 2 Bread, Diaper, Beer, Eggs 3 Milk, Diaper, Beer, Coke 4 Bread, Milk, Diaper, Beer 5 Bread, Milk, Diaper, Coke Apriori Algorithm For a given set of transactions, the main aim of Association Rule Mining is to find rules that will predict the occurrence of an item based on the occurrences of the other items in the

More information

Lecture Topic Projects 1 Intro, schedule, and logistics 2 Data Science components and tasks 3 Data types Project #1 out 4 Introduction to R,

Lecture Topic Projects 1 Intro, schedule, and logistics 2 Data Science components and tasks 3 Data types Project #1 out 4 Introduction to R, Lecture Topic Projects 1 Intro, schedule, and logistics 2 Data Science components and tasks 3 Data types Project #1 out 4 Introduction to R, statistics foundations 5 Introduction to D3, visual analytics

More information

Chapter 4: Association analysis:

Chapter 4: Association analysis: Chapter 4: Association analysis: 4.1 Introduction: Many business enterprises accumulate large quantities of data from their day-to-day operations, huge amounts of customer purchase data are collected daily

More information

ANU MLSS 2010: Data Mining. Part 2: Association rule mining

ANU MLSS 2010: Data Mining. Part 2: Association rule mining ANU MLSS 2010: Data Mining Part 2: Association rule mining Lecture outline What is association mining? Market basket analysis and association rule examples Basic concepts and formalism Basic rule measurements

More information

Mining Association Rules in Large Databases

Mining Association Rules in Large Databases Mining Association Rules in Large Databases Association rules Given a set of transactions D, find rules that will predict the occurrence of an item (or a set of items) based on the occurrences of other

More information

Data Warehousing & Mining. Data integration. OLTP versus OLAP. CPS 116 Introduction to Database Systems

Data Warehousing & Mining. Data integration. OLTP versus OLAP. CPS 116 Introduction to Database Systems Data Warehousing & Mining CPS 116 Introduction to Database Systems Data integration 2 Data resides in many distributed, heterogeneous OLTP (On-Line Transaction Processing) sources Sales, inventory, customer,

More information

Data Warehousing and Data Mining. Announcements (December 1) Data integration. CPS 116 Introduction to Database Systems

Data Warehousing and Data Mining. Announcements (December 1) Data integration. CPS 116 Introduction to Database Systems Data Warehousing and Data Mining CPS 116 Introduction to Database Systems Announcements (December 1) 2 Homework #4 due today Sample solution available Thursday Course project demo period has begun! Check

More information

Association Pattern Mining. Lijun Zhang

Association Pattern Mining. Lijun Zhang Association Pattern Mining Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction The Frequent Pattern Mining Model Association Rule Generation Framework Frequent Itemset Mining Algorithms

More information

CS570 Introduction to Data Mining

CS570 Introduction to Data Mining CS570 Introduction to Data Mining Frequent Pattern Mining and Association Analysis Cengiz Gunay Partial slide credits: Li Xiong, Jiawei Han and Micheline Kamber George Kollios 1 Mining Frequent Patterns,

More information

Chapter 4: Mining Frequent Patterns, Associations and Correlations

Chapter 4: Mining Frequent Patterns, Associations and Correlations Chapter 4: Mining Frequent Patterns, Associations and Correlations 4.1 Basic Concepts 4.2 Frequent Itemset Mining Methods 4.3 Which Patterns Are Interesting? Pattern Evaluation Methods 4.4 Summary Frequent

More information

Chapter 6: Basic Concepts: Association Rules. Basic Concepts: Frequent Patterns. (absolute) support, or, support. (relative) support, s, is the

Chapter 6: Basic Concepts: Association Rules. Basic Concepts: Frequent Patterns. (absolute) support, or, support. (relative) support, s, is the Chapter 6: What Is Frequent ent Pattern Analysis? Frequent pattern: a pattern (a set of items, subsequences, substructures, etc) that occurs frequently in a data set frequent itemsets and association rule

More information

Data Warehousing and Data Mining

Data Warehousing and Data Mining Data Warehousing and Data Mining Lecture 3 Efficient Cube Computation CITS3401 CITS5504 Wei Liu School of Computer Science and Software Engineering Faculty of Engineering, Computing and Mathematics Acknowledgement:

More information

BCB 713 Module Spring 2011

BCB 713 Module Spring 2011 Association Rule Mining COMP 790-90 Seminar BCB 713 Module Spring 2011 The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL Outline What is association rule mining? Methods for association rule mining Extensions

More information

Frequent Pattern Mining

Frequent Pattern Mining Frequent Pattern Mining How Many Words Is a Picture Worth? E. Aiden and J-B Michel: Uncharted. Reverhead Books, 2013 Jian Pei: CMPT 741/459 Frequent Pattern Mining (1) 2 Burnt or Burned? E. Aiden and J-B

More information

Data Mining: Concepts and Techniques. Chapter 5. SS Chung. April 5, 2013 Data Mining: Concepts and Techniques 1

Data Mining: Concepts and Techniques. Chapter 5. SS Chung. April 5, 2013 Data Mining: Concepts and Techniques 1 Data Mining: Concepts and Techniques Chapter 5 SS Chung April 5, 2013 Data Mining: Concepts and Techniques 1 Chapter 5: Mining Frequent Patterns, Association and Correlations Basic concepts and a road

More information

Association Rule Mining. Entscheidungsunterstützungssysteme

Association Rule Mining. Entscheidungsunterstützungssysteme Association Rule Mining Entscheidungsunterstützungssysteme Frequent Pattern Analysis Frequent pattern: a pattern (a set of items, subsequences, substructures, etc.) that occurs frequently in a data set

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu 1/8/2014 Jure Leskovec, Stanford CS246: Mining Massive Datasets, http://cs246.stanford.edu 2 Supermarket shelf

More information

Pattern Mining. Knowledge Discovery and Data Mining 1. Roman Kern KTI, TU Graz. Roman Kern (KTI, TU Graz) Pattern Mining / 42

Pattern Mining. Knowledge Discovery and Data Mining 1. Roman Kern KTI, TU Graz. Roman Kern (KTI, TU Graz) Pattern Mining / 42 Pattern Mining Knowledge Discovery and Data Mining 1 Roman Kern KTI, TU Graz 2016-01-14 Roman Kern (KTI, TU Graz) Pattern Mining 2016-01-14 1 / 42 Outline 1 Introduction 2 Apriori Algorithm 3 FP-Growth

More information

CompSci 516 Data Intensive Computing Systems

CompSci 516 Data Intensive Computing Systems CompSci 516 Data Intensive Computing Systems Lecture 20 Data Mining and Mining Association Rules Instructor: Sudeepa Roy CompSci 516: Data Intensive Computing Systems 1 Reading Material Optional Reading:

More information

signicantly higher than it would be if items were placed at random into baskets. For example, we

signicantly higher than it would be if items were placed at random into baskets. For example, we 2 Association Rules and Frequent Itemsets The market-basket problem assumes we have some large number of items, e.g., \bread," \milk." Customers ll their market baskets with some subset of the items, and

More information

Efficient Computation of Data Cubes. Network Database Lab

Efficient Computation of Data Cubes. Network Database Lab Efficient Computation of Data Cubes Network Database Lab Outlines Introduction Some CUBE Algorithms ArrayCube PartitionedCube and MemoryCube Bottom-Up Cube (BUC) Conclusions References Network Database

More information

Market baskets Frequent itemsets FP growth. Data mining. Frequent itemset Association&decision rule mining. University of Szeged.

Market baskets Frequent itemsets FP growth. Data mining. Frequent itemset Association&decision rule mining. University of Szeged. Frequent itemset Association&decision rule mining University of Szeged What frequent itemsets could be used for? Features/observations frequently co-occurring in some database can gain us useful insights

More information

Supervised and Unsupervised Learning (II)

Supervised and Unsupervised Learning (II) Supervised and Unsupervised Learning (II) Yong Zheng Center for Web Intelligence DePaul University, Chicago IPD 346 - Data Science for Business Program DePaul University, Chicago, USA Intro: Supervised

More information

Chapter 7: Frequent Itemsets and Association Rules

Chapter 7: Frequent Itemsets and Association Rules Chapter 7: Frequent Itemsets and Association Rules Information Retrieval & Data Mining Universität des Saarlandes, Saarbrücken Winter Semester 2013/14 VII.1&2 1 Motivational Example Assume you run an on-line

More information

Association Rules. A. Bellaachia Page: 1

Association Rules. A. Bellaachia Page: 1 Association Rules 1. Objectives... 2 2. Definitions... 2 3. Type of Association Rules... 7 4. Frequent Itemset generation... 9 5. Apriori Algorithm: Mining Single-Dimension Boolean AR 13 5.1. Join Step:...

More information

Data warehouses Decision support The multidimensional model OLAP queries

Data warehouses Decision support The multidimensional model OLAP queries Data warehouses Decision support The multidimensional model OLAP queries Traditional DBMSs are used by organizations for maintaining data to record day to day operations On-line Transaction Processing

More information

High dim. data. Graph data. Infinite data. Machine learning. Apps. Locality sensitive hashing. Filtering data streams.

High dim. data. Graph data. Infinite data. Machine learning. Apps. Locality sensitive hashing. Filtering data streams. http://www.mmds.org High dim. data Graph data Infinite data Machine learning Apps Locality sensitive hashing PageRank, SimRank Filtering data streams SVM Recommen der systems Clustering Network Analysis

More information

Machine Learning: Symbolische Ansätze

Machine Learning: Symbolische Ansätze Machine Learning: Symbolische Ansätze Unsupervised Learning Clustering Association Rules V2.0 WS 10/11 J. Fürnkranz Different Learning Scenarios Supervised Learning A teacher provides the value for the

More information

Tutorial on Assignment 3 in Data Mining 2009 Frequent Itemset and Association Rule Mining. Gyozo Gidofalvi Uppsala Database Laboratory

Tutorial on Assignment 3 in Data Mining 2009 Frequent Itemset and Association Rule Mining. Gyozo Gidofalvi Uppsala Database Laboratory Tutorial on Assignment 3 in Data Mining 2009 Frequent Itemset and Association Rule Mining Gyozo Gidofalvi Uppsala Database Laboratory Announcements Updated material for assignment 3 on the lab course home

More information

Frequent Pattern Mining S L I D E S B Y : S H R E E J A S W A L

Frequent Pattern Mining S L I D E S B Y : S H R E E J A S W A L Frequent Pattern Mining S L I D E S B Y : S H R E E J A S W A L Topics to be covered Market Basket Analysis, Frequent Itemsets, Closed Itemsets, and Association Rules; Frequent Pattern Mining, Efficient

More information

Data Mining: Concepts and Techniques. (3 rd ed.) Chapter 6

Data Mining: Concepts and Techniques. (3 rd ed.) Chapter 6 Data Mining: Concepts and Techniques (3 rd ed.) Chapter 6 Jiawei Han, Micheline Kamber, and Jian Pei University of Illinois at Urbana-Champaign & Simon Fraser University 2013-2017 Han, Kamber & Pei. All

More information

Chapter 4 Data Mining A Short Introduction

Chapter 4 Data Mining A Short Introduction Chapter 4 Data Mining A Short Introduction Data Mining - 1 1 Today's Question 1. Data Mining Overview 2. Association Rule Mining 3. Clustering 4. Classification Data Mining - 2 2 1. Data Mining Overview

More information

Association Rule Discovery

Association Rule Discovery Association Rule Discovery Association Rules describe frequent co-occurences in sets an item set is a subset A of all possible items I Example Problems: Which products are frequently bought together by

More information

Chapter 5, Data Cube Computation

Chapter 5, Data Cube Computation CSI 4352, Introduction to Data Mining Chapter 5, Data Cube Computation Young-Rae Cho Associate Professor Department of Computer Science Baylor University A Roadmap for Data Cube Computation Full Cube Full

More information
