Quotient Cube: How to Summarize the Semantics of a Data Cube


1 Quotient Cube: How to Summarize the Semantics of a Data Cube
Laks V.S. Lakshmanan (Univ. of British Columbia) *
Jian Pei (State Univ. of New York at Buffalo) *
Jiawei Han (Univ. of Illinois at Urbana-Champaign) +
* The work is partially supported by NSERC and NCE/IRIS
+ The work is partially supported by NSF, UI, and Microsoft Research

2 Outline
- Introduction and motivation
- Cube lattice partitions
- Semantics preserving partitions
- Algorithms
- Experimental results
- Discussion and summary
Lakshmanan, Pei & Han. Quotient Cube: How to Summarize the Semantics of a Data Cube

3 Data Cube
Base table (dimensions Store, Product, Season; measure Sales):
Store  Product  Season  Sales
S1     P1       Spring  6
S1     P2       Spring  12
S2     P1       Fall    9
Aggregation (dimensions Store, Product, Season; measure AVG(Sales)):
Store  Product  Season  AVG(Sales)
S1     P1       Spring  6
S1     P2       Spring  12
S2     P1       Fall    9
S1     *        Spring  9
...
*      *        *       9
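The aggregation step on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `BASE`, `generalizations` and `avg_cube` are hypothetical, and the brute-force enumeration is not one of the cube algorithms cited later in the deck.

```python
from itertools import product

# Base table from the slide: (Store, Product, Season) -> Sales
BASE = [
    (("S1", "P1", "Spring"), 6),
    (("S1", "P2", "Spring"), 12),
    (("S2", "P1", "Fall"), 9),
]

def generalizations(dims):
    """Yield every cell a base tuple rolls up to (each value kept or replaced by '*')."""
    for mask in product((True, False), repeat=len(dims)):
        yield tuple(v if keep else "*" for v, keep in zip(dims, mask))

def avg_cube(base):
    """Return {cell: AVG(Sales)} for every non-empty cell of the cube."""
    sums = {}
    for dims, sales in base:
        for cell in generalizations(dims):
            total, count = sums.get(cell, (0, 0))
            sums[cell] = (total + sales, count + 1)
    return {cell: total / count for cell, (total, count) in sums.items()}

cube = avg_cube(BASE)
print(len(cube))                     # number of non-empty cells
print(cube[("S1", "*", "Spring")])   # average over the two S1/Spring tuples
print(cube[("*", "*", "*")])         # average over the whole base table
```

Each base tuple contributes to the 2^3 cells it rolls up to, so three tuples already produce 18 distinct non-empty cells.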

4 Previous Work: Efficient Cube Computation
- Compute a cube from a base table: e.g. (Agarwal et al. 98), (Zhao et al. 97)
- View materialization with space constraint: e.g. (Harinarayan et al. 96)
- Handling sparsity: (Ross & Srivastava 97)
- Cube compression: e.g. (Sismanis et al. 02), (Shanmugasundaram et al. 99), (Wang et al. 02)
- Approximation: e.g. (Barbara & Sullivan 97), (Barbara & Wu 00), (Vitter et al. 98)
- Constrained cube construction: e.g. (Beyer & Ramakrishnan 99)

5 Previous Work: Extracting Semantics From Cubes
- General contexts of patterns (Sathe & Sarawagi 01)
- Generalize association rules (Imielinski et al. 00)
- Cube gradient analysis (Dong et al. 01)

6 Cube (Cell) Lattice
- Many cells have the same aggregate value
- Can we summarize the semantics of the cube by grouping cells by aggregate values?
Cube lattice with AVG(Sales) per cell, one lattice level per row, from the base cells to the most general cell (s = Spring, f = Fall):
(S1,P1,s):6  (S1,P2,s):12  (S2,P1,f):9
(S1,*,s):9  (S1,P1,*):6  (*,P1,s):6  (S1,P2,*):12  (*,P2,s):12  (S2,*,f):9  (S2,P1,*):9  (*,P1,f):9
(S1,*,*):9  (*,*,s):9  (*,P1,*):7.5  (*,P2,*):12  (*,*,f):9  (S2,*,*):9
(*,*,*):9

7 A Naïve Attempt
- Put all cells having the same aggregate value in a class
[Figure: the cube lattice of slide 6 with its cells grouped by aggregate value into four classes, C1 to C4]

8 Problems w/ the Naïve Attempt
- The result is not a lattice anymore!
- Anomaly: C3 rolls up to C4, which rolls up to C3 again
- The rollup/drilldown semantics is lost
[Figure: the cube lattice of slide 6 with the naïve classes C1 to C4]
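The anomaly can be reproduced on the running example. The sketch below is an assumption-laden illustration (the helper names `avg_cube`, `rolls_up` and `find_anomalies` are hypothetical): it groups cells by AVG value implicitly and searches for rollup chains that leave a value class and then re-enter it.

```python
from itertools import product

# Base table of the running example: (Store, Product, Season) -> Sales
BASE = [
    (("S1", "P1", "s"), 6),
    (("S1", "P2", "s"), 12),
    (("S2", "P1", "f"), 9),
]

def avg_cube(base):
    """Return {cell: AVG(Sales)} for every non-empty cell of the cube."""
    sums = {}
    for dims, sales in base:
        for mask in product((True, False), repeat=len(dims)):
            cell = tuple(v if keep else "*" for v, keep in zip(dims, mask))
            total, count = sums.get(cell, (0, 0))
            sums[cell] = (total + sales, count + 1)
    return {cell: total / count for cell, (total, count) in sums.items()}

def rolls_up(c, d):
    """True if cell c rolls up to cell d: d keeps each value of c or replaces it by '*'."""
    return all(dv == "*" or dv == cv for cv, dv in zip(c, d))

def find_anomalies(cube):
    """Chains c -> d -> e of strict rollups where c and e share a value but d does not."""
    chains = []
    for c, d, e in product(cube, repeat=3):
        if c != d and d != e and rolls_up(c, d) and rolls_up(d, e):
            if cube[c] == cube[e] and cube[d] != cube[c]:
                chains.append((c, d, e))
    return chains

cube = avg_cube(BASE)
anomalies = find_anomalies(cube)
for c, d, e in anomalies:
    print(c, cube[c], "->", d, cube[d], "->", e, cube[e])
```

On this data every chain found passes through (*,P1,*) with value 7.5 and ends in (*,*,*) with value 9, which matches the C3, C4, C3 pattern named on the slide.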

9 A Better Partitioning
- Quotient cube: a partitioning that preserves the rollup/drilldown semantics
[Figure: the cube lattice of slide 6 partitioned into five classes, C1 to C5]

10 Problem Statement
- Given a cube, characterize a good way (quotient cube) of partitioning its cells into classes such that
  - The partition generates a reduced lattice preserving the rollup/drilldown semantics
  - The partition is optimal: the number of classes is as small as possible
- Compute quotient cubes efficiently

11 Why Is a Quotient Cube Useful?
- Semantic compression
- Semantic OLAP browsing
[Figure: the cube lattice of slide 6 partitioned into the classes C1 to C5]

12 Why Is a Quotient Cube Useful?
- Semantic compression
- Semantic OLAP browsing
[Figure: the partitioned lattice of slide 11 with one class opened up to show its cells: (S2,P1,f):9, (S2,*,f):9, (S2,P1,*):9, (*,P1,f):9, (*,*,f):9, (S2,*,*):9]

13 Outline
- Introduction and motivation
- Cube lattice partitions
- Semantics preserving partitions
- Algorithms
- Experimental results
- Discussion and summary

14 Convex Partitions
- A convex partition retains semantics
- $c_1 \xrightarrow{\text{rollup}} c_2 \xrightarrow{\text{rollup}} c_3,\ c_1, c_3 \in \mathit{CLS} \Rightarrow c_2 \in \mathit{CLS}$
[Figure: the cube lattice of slide 6 partitioned into the classes C1 to C5]
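A convexity test follows directly from the condition above. The sketch below is a brute-force illustration, not the authors' procedure; `is_convex`, `by_value` and `top_split_off` are hypothetical names. The second partition is only a toy variant in which (*,*,*) is moved into a class of its own; it is not the five-class partition shown in the figure.

```python
from itertools import product

# Base table of the running example: (Store, Product, Season) -> Sales
BASE = [
    (("S1", "P1", "s"), 6),
    (("S1", "P2", "s"), 12),
    (("S2", "P1", "f"), 9),
]

def avg_cube(base):
    """Return {cell: AVG(Sales)} for every non-empty cell of the cube."""
    sums = {}
    for dims, sales in base:
        for mask in product((True, False), repeat=len(dims)):
            cell = tuple(v if keep else "*" for v, keep in zip(dims, mask))
            total, count = sums.get(cell, (0, 0))
            sums[cell] = (total + sales, count + 1)
    return {cell: total / count for cell, (total, count) in sums.items()}

def rolls_up(c, d):
    """True if cell c rolls up to cell d."""
    return all(dv == "*" or dv == cv for cv, dv in zip(c, d))

def is_convex(partition):
    """partition maps each cell to a class label; True if no class has a hole."""
    for c, e in product(partition, repeat=2):
        if partition[c] != partition[e] or not rolls_up(c, e):
            continue
        for d in partition:
            if rolls_up(c, d) and rolls_up(d, e) and partition[d] != partition[c]:
                return False  # d lies between c and e but belongs to another class
    return True

cube = avg_cube(BASE)
TOP = ("*", "*", "*")
by_value = dict(cube)  # class label = aggregate value
top_split_off = {cell: ("top" if cell == TOP else value) for cell, value in cube.items()}

print(is_convex(by_value))       # grouping by value alone leaves a hole
print(is_convex(top_split_off))  # the hole disappears once (*,*,*) is separated
```

The toy variant is convex but, as a later slide points out, convexity alone is not the whole story: its value-9 class is no longer connected.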

15 A Non-convex Partition
- Anomaly: C3 rolls up to C4, which rolls up to C3 again
- The rollup/drilldown semantics is lost
[Figure: the cube lattice of slide 6 with the naïve classes C1 to C4]

16 Connected Partitions
- Cells c1 and c2 are connected if a series of rollup/drilldown operations starting from c1 can reach c2
- Intuitively, (each class of) a partition should be connected
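Connectivity of a class can be checked with a breadth-first search. The sketch below makes two assumptions that the slide does not state: a single operation changes exactly one dimension, and the path has to stay inside the class. The names `one_step` and `is_connected` are hypothetical.

```python
from collections import deque

def one_step(c, d):
    """True if d is obtained from c by replacing exactly one value with '*'."""
    diff = [(cv, dv) for cv, dv in zip(c, d) if cv != dv]
    return len(diff) == 1 and diff[0][1] == "*"

def is_connected(cells):
    """True if every cell is reachable from any other by rollup/drilldown steps inside `cells`."""
    cells = set(cells)
    if not cells:
        return True
    start = next(iter(cells))
    seen = {start}
    queue = deque([start])
    while queue:
        c = queue.popleft()
        for d in cells - seen:
            if one_step(c, d) or one_step(d, c):
                seen.add(d)
                queue.append(d)
    return seen == cells

# The ten cells of the running example whose AVG(Sales) is 9
NINES = [
    ("S2", "P1", "f"), ("S1", "*", "s"), ("S2", "*", "f"), ("S2", "P1", "*"),
    ("*", "P1", "f"), ("S1", "*", "*"), ("*", "*", "s"), ("*", "*", "f"),
    ("S2", "*", "*"), ("*", "*", "*"),
]
without_top = [c for c in NINES if c != ("*", "*", "*")]

print(is_connected(NINES))        # (*,*,*) links the S1 side to the S2 side
print(is_connected(without_top))  # without it the two sides fall apart
```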

17 Cover Partition
- For a cell c, a tuple t in the base table is in c's cover if t can be rolled up to c
- E.g., Cov(S1,*,spring) = {(S1,P1,spring), (S1,P2,spring)}
[Table: the base table of slide 3, with dimensions Store, Product, Season and measure Sales]
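The cover of a cell can be computed by a direct scan of the base table. This is a minimal sketch; `rolls_up_to` and `cover` are hypothetical helper names and the measure column is left out because the cover depends only on the dimensions.

```python
# Dimension values of the three base tuples on the slide
BASE = [
    ("S1", "P1", "spring"),
    ("S1", "P2", "spring"),
    ("S2", "P1", "fall"),
]

def rolls_up_to(t, cell):
    """True if base tuple t can be rolled up to the cell ('*' matches any value)."""
    return all(c == "*" or c == v for v, c in zip(t, cell))

def cover(cell, base=BASE):
    """Set of base tuples that can be rolled up to the cell."""
    return {t for t in base if rolls_up_to(t, cell)}

print(cover(("S1", "*", "spring")))
```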

18 Cover Partitions Are Convex
- All cells having the same cover are in a class
- (S1,P2,s) and (*,P2,*) cover the same tuples in the base table, so (S1,P2,*) and (*,P2,s), which lie between them, are in the same class
[Figure: the cube lattice of slide 6]
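Grouping cells by cover is a one-pass dictionary build. The sketch below enumerates the cells by brute force, which is an assumption made for brevity; `cube_cells` and `cover_partition` are hypothetical names.

```python
from itertools import product
from collections import defaultdict

# Dimension values of the base tuples (s = Spring, f = Fall)
BASE = [("S1", "P1", "s"), ("S1", "P2", "s"), ("S2", "P1", "f")]

def rolls_up(c, d):
    """True if c (a base tuple or a cell) rolls up to cell d."""
    return all(dv == "*" or dv == cv for cv, dv in zip(c, d))

def cube_cells(base):
    """All non-empty cells: every generalization of every base tuple."""
    cells = set()
    for t in base:
        for mask in product((True, False), repeat=len(t)):
            cells.add(tuple(v if keep else "*" for v, keep in zip(t, mask)))
    return cells

def cover_partition(base):
    """Map each distinct cover (a frozenset of base tuples) to the cells having it."""
    classes = defaultdict(set)
    for cell in cube_cells(base):
        cov = frozenset(t for t in base if rolls_up(t, cell))
        classes[cov].add(cell)
    return dict(classes)

classes = cover_partition(BASE)
for cov, cells in classes.items():
    print(sorted(cov), "<-", sorted(cells))
```

On the running example this yields six classes; the class whose cover is {(S1,P2,s)} contains exactly the four cells named on the slide.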

19 Cover Partitions Are Connected
- If cells c1 and c2 have the same cover, there must be some common ancestor c3 of c1 and c2 such that c3 has the same cover
- Hence cells c1 and c2 are in the same class and connected
[Figure: the cube lattice of slide 6]

20 Cover Partitions & Aggregates
- All cells in a class of a cover partition carry the same aggregate value w.r.t. any aggregate function
- But cells in a class of MIN() may have different covers
- For COUNT() and SUM() (positive), cover equivalence coincides with aggregate equivalence
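The contrast between MIN() and COUNT()/SUM() shows up on the running example. The sketch below compares only cells related by rollup, which is an assumption about how aggregate equivalence is built up (the transcript does not spell it out); `cover`, `aggregate` and `same_value_different_cover` are hypothetical names.

```python
from itertools import product

# Base table of the running example: (Store, Product, Season) -> Sales
BASE = [
    (("S1", "P1", "s"), 6),
    (("S1", "P2", "s"), 12),
    (("S2", "P1", "f"), 9),
]

def rolls_up(c, d):
    """True if c (a base tuple or a cell) rolls up to cell d."""
    return all(dv == "*" or dv == cv for cv, dv in zip(c, d))

CELLS = sorted({
    tuple(v if keep else "*" for v, keep in zip(dims, mask))
    for dims, _ in BASE
    for mask in product((True, False), repeat=3)
})

def cover(cell):
    """Indices of the base tuples that roll up to the cell."""
    return frozenset(i for i, (dims, _) in enumerate(BASE) if rolls_up(dims, cell))

def aggregate(cell, func):
    """Apply func to the Sales values of the tuples in the cell's cover."""
    return func([BASE[i][1] for i in cover(cell)])

def same_value_different_cover(func):
    """Pairs (c, d), c rolling up to d, with equal aggregate value but different covers."""
    return [
        (c, d)
        for c, d in product(CELLS, repeat=2)
        if c != d and rolls_up(c, d)
        and aggregate(c, func) == aggregate(d, func)
        and cover(c) != cover(d)
    ]

print(same_value_different_cover(min)[:3])  # MIN: such pairs exist
print(same_value_different_cover(len))      # COUNT: none
print(same_value_different_cover(sum))      # SUM over positive values: none
```

For example, (S1,P1,s) and (S1,*,s) both have MIN 6 although the second covers one more tuple, whereas rolling up to a strictly larger cover always changes COUNT and a positive SUM.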

21 Outline
- Introduction and motivation
- Cube lattice partitions
- Semantics preserving partitions
- Algorithms
- Experimental results
- Discussion and summary

22 Weak Congruence
- Weak congruence preserves semantics
[Figure: cells c, c' in Class 1 and cells d, d' in Class 2; if c rolls up to d and d' rolls up to c', then Class 1 = Class 2]

23 Weak Congruence = Convex
- Convex $\Leftrightarrow$ no hole in the class $\Leftrightarrow$ weak congruence
- They preserve the rollup/drilldown semantics
- Quotient cube lattice is the lattice of convex classes
- How to derive the coarsest quotient cube?

24 Monotone Aggregate Functions
- Monotone functions: $S \subseteq T \Rightarrow f(S) \le f(T)$, or $S \subseteq T \Rightarrow f(S) \ge f(T)$
- MIN(), MAX(), COUNT(), PSUM(), ...
- The aggregate function f is monotone $\Rightarrow$ $\equiv_f$ is the unique coarsest partition
- MIN(): put all cells having the same MIN() value into a class
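The MIN() rule can be tried out on the running example. The following sketch is an illustration under assumptions, not the authors' code: it groups cells by MIN(Sales), checks the monotonicity condition by brute force, and confirms that no resulting class has a hole. `min_value` and `has_hole` are hypothetical names.

```python
from itertools import product
from collections import defaultdict

# Base table of the running example: (Store, Product, Season) -> Sales
BASE = [
    (("S1", "P1", "s"), 6),
    (("S1", "P2", "s"), 12),
    (("S2", "P1", "f"), 9),
]

def rolls_up(c, d):
    """True if c (a base tuple or a cell) rolls up to cell d."""
    return all(dv == "*" or dv == cv for cv, dv in zip(c, d))

CELLS = {
    tuple(v if keep else "*" for v, keep in zip(dims, mask))
    for dims, _ in BASE
    for mask in product((True, False), repeat=3)
}

def min_value(cell):
    """MIN(Sales) over the base tuples covered by the cell."""
    return min(sales for dims, sales in BASE if rolls_up(dims, cell))

# Rolling up enlarges the cover, so MIN can only stay the same or decrease.
monotone = all(
    min_value(d) <= min_value(c)
    for c, d in product(CELLS, repeat=2)
    if rolls_up(c, d)
)

# One class per MIN value, as the slide suggests for MIN().
classes = defaultdict(set)
for cell in CELLS:
    classes[min_value(cell)].add(cell)

def has_hole(cls):
    """True if some cell outside cls lies between two cells of cls."""
    return any(
        rolls_up(c, d) and rolls_up(d, e)
        for c, e in product(cls, repeat=2)
        for d in CELLS - cls
    )

print(monotone)
print({value: len(cells) for value, cells in classes.items()})
print(any(has_hole(cls) for cls in classes.values()))
```

On this data the 18 cells collapse into three classes, one per distinct MIN value, and each of them is convex.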

25 Non-monotone Functions
- Bad news: $\equiv_f$ may or may not be convex (a weak congruence).
- Good news: the cover partition is convex (i.e., a weak congruence) and always yields a quotient cube w.r.t. any aggregate function!

26 Outline
- Introduction and motivation
- Cube lattice partitions
- Semantics preserving partitions
- Algorithms
- Experimental results
- Discussion and summary

27 How to Compute a QC
- Aggregate functions
  - Monotone functions
  - Non-monotone functions
- Settings
  - The cube is available
  - Only the base table is available

28 Monotone Functions
- The cube is available: grab all cells with the same aggregate value and put them into a class
- Only the base table is available: bottom-up, depth-first search
  - For a cell, compute its cover and find the upper bound having the same aggregate value
  - Group lower bounds by upper bounds
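The "upper bound" step can be illustrated for the cover equivalence, where the upper bound of a cell is the most specific cell with the same cover. The sketch below is a simplification under that assumption: it enumerates all cells instead of running the bottom-up, depth-first search named on the slide, and an aggregate such as MIN() would need a different search for its upper bounds. `upper_bound` and `group_by_upper_bound` are hypothetical names.

```python
from itertools import product
from collections import defaultdict

# Dimension values of the base tuples (s = Spring, f = Fall)
BASE = [("S1", "P1", "s"), ("S1", "P2", "s"), ("S2", "P1", "f")]

def rolls_up(c, d):
    """True if c (a base tuple or a cell) rolls up to cell d."""
    return all(dv == "*" or dv == cv for cv, dv in zip(c, d))

def upper_bound(cell, base=BASE):
    """Most specific cell that has the same cover as `cell`."""
    cov = [t for t in base if rolls_up(t, cell)]
    if not cov:
        raise ValueError("cell covers no base tuple")
    # Keep a dimension value only when every covered tuple agrees on it.
    return tuple(col[0] if len(set(col)) == 1 else "*" for col in zip(*cov))

def group_by_upper_bound(base=BASE):
    """Map each upper bound to the set of cells that share it."""
    cells = {
        tuple(v if keep else "*" for v, keep in zip(t, mask))
        for t in base
        for mask in product((True, False), repeat=len(t))
    }
    classes = defaultdict(set)
    for cell in cells:
        classes[upper_bound(cell, base)].add(cell)
    return dict(classes)

classes = group_by_upper_bound()
print(upper_bound(("*", "P2", "*")))
for ub, members in classes.items():
    print(ub, "<-", sorted(members))
```

Storing one upper bound per class, together with the most general cells of the class as lower bounds, is enough to tell which class any cell belongs to.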

29 Example: Cover QC
[Figure: the cube lattice of slide 6, partitioned by cover]

30 Non-monotone Functions
- Class merging
  - Find cover partition classes
  - Merge classes as long as convexity is retained
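The two merging steps can be sketched for AVG() on the running example. This is a greedy illustration, not the authors' algorithm. Beyond what the slide states, it assumes that two classes are merged only when they have the same AVG value and contain cells related by rollup, so that merged classes stay connected. The names `is_convex`, `adjacent` and `merge_classes` are hypothetical.

```python
from itertools import product
from collections import defaultdict

# Base table of the running example: (Store, Product, Season) -> Sales
BASE = [
    (("S1", "P1", "s"), 6),
    (("S1", "P2", "s"), 12),
    (("S2", "P1", "f"), 9),
]

def rolls_up(c, d):
    """True if c (a base tuple or a cell) rolls up to cell d."""
    return all(dv == "*" or dv == cv for cv, dv in zip(c, d))

CELLS = {
    tuple(v if keep else "*" for v, keep in zip(dims, mask))
    for dims, _ in BASE
    for mask in product((True, False), repeat=3)
}

def cover(cell):
    """Indices of the base tuples that roll up to the cell."""
    return frozenset(i for i, (dims, _) in enumerate(BASE) if rolls_up(dims, cell))

def avg(cell):
    """AVG(Sales) over the cell's cover."""
    sales = [BASE[i][1] for i in cover(cell)]
    return sum(sales) / len(sales)

def is_convex(cls):
    """True if no cell outside cls lies between two cells of cls."""
    return not any(
        rolls_up(c, d) and rolls_up(d, e)
        for c, e in product(cls, repeat=2)
        for d in CELLS - cls
    )

def adjacent(a, b):
    """True if some cell of a and some cell of b are related by rollup."""
    return any(rolls_up(x, y) or rolls_up(y, x) for x in a for y in b)

def merge_classes():
    """Start from the cover partition and merge equal-AVG classes while convexity holds."""
    by_cover = defaultdict(set)
    for cell in CELLS:
        by_cover[cover(cell)].add(cell)
    classes = list(by_cover.values())
    changed = True
    while changed:
        changed = False
        for i, j in product(range(len(classes)), repeat=2):
            a, b = classes[i], classes[j]
            if (i < j and avg(next(iter(a))) == avg(next(iter(b)))
                    and adjacent(a, b) and is_convex(a | b)):
                classes[i] = a | b
                del classes[j]
                changed = True
                break  # restart the scan after every merge
    return classes

classes = merge_classes()
for cls in classes:
    print(avg(next(iter(cls))), sorted(cls))
```

On the running example the six cover classes shrink to five: the class of (*,*,*) merges with the class of (S1,*,s), while merging it with the class of (S2,P1,f) would leave (*,P1,*) as a hole.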

31 Example: AVG QC
[Figure: the cube lattice of slide 6, partitioned for AVG()]

32 Outline
- Introduction and motivation
- Cube lattice partitions
- Semantics preserving partitions
- Algorithms
- Experimental results
- Discussion and summary

33 Reduction Ratio vs. Dimensionality
[Plot: reduction ratio (%) against dimensionality for MinCube, QC_Cov and QC_MIN; # base tuples = 200k, Zipf factor = 2.0]

34 Reduction Ratio vs. Zipf Factor
[Plot: reduction ratio (%) against Zipf factor for MinCube, QC_Cov and QC_MIN; # base tuples = 200k, # dimensions = 6]

35 Reduction Ratio vs. Base Table Size
[Plot: reduction ratio (%) against number of tuples (k) for MinCube, QC_Cov and QC_MIN; Zipf factor = 2.0, # dimensions = 6]

36 Runtime
[Plot: runtime (seconds) against number of tuples (k) for MinCube, QC_Cov, QC_MIN and BUC; Zipf factor = 2.0, # dimensions = 6]

37 Compression Ratio on Weather Data Set
[Plots: two panels of reduction ratio (%) against number of dimensions, one comparing MinCube and QC_Cov, the other comparing QC_Cov and QC_AVG]

38 Outline
- Introduction and motivation
- Cube lattice partitions
- Semantics preserving partitions
- Algorithms
- Experimental results
- Discussion and summary

39 Semantic Cube Exploration
- Theoretical foundation for semantic summarization in data cubes: concept and properties of quotient cubes
- Efficient algorithms for quotient cube construction
- Quotient cubes can be computed directly from base tables

40 Ongoing Research
- Efficient implementation of a quotient cube-based OLAP system
- Data warehouse built using quotient cubes
- Hierarchies and constraints
- Incremental maintenance
- Semantics based OLAP and mining
- Efficient query answering

41 References (1)
- R. Agrawal and R. Srikant. Fast Algorithms for Mining Association Rules in Large Databases. VLDB 1994.
- S. Agarwal, R. Agrawal, P.M. Deshpande, A. Gupta, J.F. Naughton, R. Ramakrishnan, and S. Sarawagi. On the computation of multidimensional aggregates. VLDB.
- D. Barbara and M. Sullivan. Quasi-cubes: Exploiting approximation in multidimensional databases. SIGMOD Record, 26:12-17.
- D. Barbara and X. Wu. Using loglinear models to compress datacube. In WAIM'2000.
- K. Beyer and R. Ramakrishnan. Bottom-up computation of sparse and iceberg cubes. In SIGMOD'99.

42 References (2)
- G. Birkhoff. Lattice Theory, 2nd edition. New York, American Mathematical Society (Colloquium Publications, vol. 25).
- S. Geffner, D. Agrawal, A. El Abbadi, and T. R. Smith. Relative prefix sums: An efficient approach for querying dynamic OLAP data cubes. In ICDE'99.
- J. Gray, A. Bosworth, A. Layman, and H. Pirahesh. Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total. ICDE'96.
- C.-T. Ho, J. Bruck, and R. Agrawal. Partial-sum queries in data cubes using covering codes. In PODS'97.
- J. Han, J. Pei, G. Dong, and K. Wang. Efficient Computation of Iceberg Cubes with Complex Measures. In SIGMOD'01.

43 References (3)
- V. Harinarayan, A. Rajaraman, and J. D. Ullman. Implementing data cubes efficiently. In SIGMOD'96.
- T. Imielinski, L. Khachiyan, and A. Abdulghani. Cubegrades: Generalizing Association Rules. Technical Report, Rutgers University, August.
- H. V. Jagadish, J. Madar, and R.T. Ng. Semantic Compression and Pattern Extraction with Fascicles. VLDB'99.
- K. Ross and D. Srivastava. Fast computation of sparse datacubes. In VLDB'97.
- G. Sathe and S. Sarawagi. Intelligent Rollups in Multidimensional OLAP Data. VLDB'01.

44 References (4)
- J. Shanmugasundaram, U.M. Fayyad, and P. S. Bradley. Compressed Data Cubes for OLAP Aggregate Query Approximation on Continuous Dimensions. SIGKDD'99.
- J. S. Vitter, M. Wang, and B. R. Iyer. Data cube approximation and histograms via wavelets. In CIKM'98.
- W. Wang, H. Lu, J. Feng, and J. X. Yu. Condensed cube: An effective approach to reducing data cube size. In ICDE'02.
- Y. Zhao, P. M. Deshpande, and J. F. Naughton. An array-based algorithm for simultaneous multidimensional aggregates. In SIGMOD'97.
- G.K. Zipf. Human Behavior and The Principle of Least Effort. Addison-Wesley.


More information

Analyzing a Greedy Approximation of an MDL Summarization

Analyzing a Greedy Approximation of an MDL Summarization Analyzing a Greedy Approximation of an MDL Summarization Peter Fontana fontanap@seas.upenn.edu Faculty Advisor: Dr. Sudipto Guha April 10, 2007 Abstract Many OLAP (On-line Analytical Processing) applications

More information

Graph Cube: On Warehousing and OLAP Multidimensional Networks

Graph Cube: On Warehousing and OLAP Multidimensional Networks Graph Cube: On Warehousing and OLAP Multidimensional Networks Peixiang Zhao, Xiaolei Li, Dong Xin, Jiawei Han Department of Computer Science, UIUC Groupon Inc. Google Cooperation pzhao4@illinois.edu, hanj@cs.illinois.edu

More information

b1 b3 Anchor cell a Border cells b1 and b3 Border cells b2 and b4 Cumulative cube for PS

b1 b3 Anchor cell a Border cells b1 and b3 Border cells b2 and b4 Cumulative cube for PS Space-Ecient Data Cubes for Dynamic Environments? Mirek Riedewald, Divyakant Agrawal, Amr El Abbadi, and Renato Pajarola Dept. of Computer Science, Univ. of California, Santa Barbara CA 9, USA fmirek,

More information

Data Warehousing & On-line Analytical Processing

Data Warehousing & On-line Analytical Processing Data Warehousing & On-line Analytical Processing Erwin M. Bakker & Stefan Manegold https://homepages.cwi.nl/~manegold/dbdm/ http://liacs.leidenuniv.nl/~bakkerem2/dbdm/ Chapter 4: Data Warehousing and On-line

More information

AC-Close: Efficiently Mining Approximate Closed Itemsets by Core Pattern Recovery

AC-Close: Efficiently Mining Approximate Closed Itemsets by Core Pattern Recovery : Efficiently Mining Approximate Closed Itemsets by Core Pattern Recovery Hong Cheng Philip S. Yu Jiawei Han University of Illinois at Urbana-Champaign IBM T. J. Watson Research Center {hcheng3, hanj}@cs.uiuc.edu,

More information

The Multi-Relational Skyline Operator

The Multi-Relational Skyline Operator The Multi-Relational Skyline Operator Wen Jin 1 Martin Ester 1 Zengjian Hu 1 Jiawei Han 2 1 School of Computing Science Simon Fraser University, Canada {wjin,ester,zhu}@cs.sfu.ca 2 Department of Computer

More information

An Empirical Comparison of Methods for Iceberg-CUBE Construction

An Empirical Comparison of Methods for Iceberg-CUBE Construction From: FLAIRS-01 Proceedings. Copyright 2001, AAAI (www.aaai.org). All rights reserved. An Empirical Comparison of Methods for Iceberg-CUBE Construction Leah Findlater and Howard J. Hamilton Department

More information

An Overview of various methodologies used in Data set Preparation for Data mining Analysis

An Overview of various methodologies used in Data set Preparation for Data mining Analysis An Overview of various methodologies used in Data set Preparation for Data mining Analysis Arun P Kuttappan 1, P Saranya 2 1 M. E Student, Dept. of Computer Science and Engineering, Gnanamani College of

More information

Web page recommendation using a stochastic process model

Web page recommendation using a stochastic process model Data Mining VII: Data, Text and Web Mining and their Business Applications 233 Web page recommendation using a stochastic process model B. J. Park 1, W. Choi 1 & S. H. Noh 2 1 Computer Science Department,

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 07 : 06/11/2012 Data Mining: Concepts and Techniques (3 rd ed.) Chapter

More information

Evaluation of Top-k OLAP Queries Using Aggregate R trees

Evaluation of Top-k OLAP Queries Using Aggregate R trees Evaluation of Top-k OLAP Queries Using Aggregate R trees Nikos Mamoulis 1, Spiridon Bakiras 2, and Panos Kalnis 3 1 Department of Computer Science, University of Hong Kong, Pokfulam Road, Hong Kong, nikos@cs.hku.hk

More information

Hybrid Query and Data Ordering for Fast and Progressive Range-Aggregate Query Answering

Hybrid Query and Data Ordering for Fast and Progressive Range-Aggregate Query Answering International Journal of Data Warehousing & Mining, 1(2), 23-43, April-June 2005 23 Hybrid Query and Data Ordering for Fast and Progressive Range-Aggregate Query Answering Cyrus Shahabi, University of

More information

Improving the Performance of OLAP Queries Using Families of Statistics Trees

Improving the Performance of OLAP Queries Using Families of Statistics Trees Improving the Performance of OLAP Queries Using Families of Statistics Trees Joachim Hammer Dept. of Computer and Information Science University of Florida Lixin Fu Dept. of Mathematical Sciences University

More information

DATA STREAMS: MODELS AND ALGORITHMS

DATA STREAMS: MODELS AND ALGORITHMS DATA STREAMS: MODELS AND ALGORITHMS DATA STREAMS: MODELS AND ALGORITHMS Edited by CHARU C. AGGARWAL IBM T. J. Watson Research Center, Yorktown Heights, NY 10598 Kluwer Academic Publishers Boston/Dordrecht/London

More information

Association Rule Mining. Entscheidungsunterstützungssysteme

Association Rule Mining. Entscheidungsunterstützungssysteme Association Rule Mining Entscheidungsunterstützungssysteme Frequent Pattern Analysis Frequent pattern: a pattern (a set of items, subsequences, substructures, etc.) that occurs frequently in a data set

More information

StreamOLAP. Salman Ahmed SHAIKH. Cost-based Optimization of Stream OLAP. DBSJ Japanese Journal Vol. 14-J, Article No.

StreamOLAP. Salman Ahmed SHAIKH. Cost-based Optimization of Stream OLAP. DBSJ Japanese Journal Vol. 14-J, Article No. StreamOLAP Cost-based Optimization of Stream OLAP Salman Ahmed SHAIKH Kosuke NAKABASAMI Hiroyuki KITAGAWA Salman Ahmed SHAIKH Toshiyuki AMAGASA (SPE) OLAP OLAP SPE SPE OLAP OLAP OLAP Due to the increase

More information

Item Set Extraction of Mining Association Rule

Item Set Extraction of Mining Association Rule Item Set Extraction of Mining Association Rule Shabana Yasmeen, Prof. P.Pradeep Kumar, A.Ranjith Kumar Department CSE, Vivekananda Institute of Technology and Science, Karimnagar, A.P, India Abstract:

More information

DATA CUBE : A RELATIONAL AGGREGATION OPERATOR GENERALIZING GROUP-BY, CROSS-TAB AND SUB-TOTALS SNEHA REDDY BEZAWADA CMPT 843

DATA CUBE : A RELATIONAL AGGREGATION OPERATOR GENERALIZING GROUP-BY, CROSS-TAB AND SUB-TOTALS SNEHA REDDY BEZAWADA CMPT 843 DATA CUBE : A RELATIONAL AGGREGATION OPERATOR GENERALIZING GROUP-BY, CROSS-TAB AND SUB-TOTALS SNEHA REDDY BEZAWADA CMPT 843 WHAT IS A DATA CUBE? The Data Cube or Cube operator produces N-dimensional answers

More information

A Better Approach for Horizontal Aggregations in SQL Using Data Sets for Data Mining Analysis

A Better Approach for Horizontal Aggregations in SQL Using Data Sets for Data Mining Analysis Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 2, Issue. 8, August 2013,

More information

An Improved Apriori Algorithm for Association Rules

An Improved Apriori Algorithm for Association Rules Research article An Improved Apriori Algorithm for Association Rules Hassan M. Najadat 1, Mohammed Al-Maolegi 2, Bassam Arkok 3 Computer Science, Jordan University of Science and Technology, Irbid, Jordan

More information

M. P. Ravikanth et al, / (IJCSIT) International Journal of Computer Science and Information Technologies, Vol. 3 (3), 2012,

M. P. Ravikanth et al, / (IJCSIT) International Journal of Computer Science and Information Technologies, Vol. 3 (3), 2012, An Adaptive Representation of RFID Data Sets Based on Movement Graph Model M. P. Ravikanth, A. K. Rout CSE Department, GMR Institute of Technology, JNTU Kakinada, Rajam Abstract Radio Frequency Identification

More information

Materialized View Selection for Multidimensional Datasets*

Materialized View Selection for Multidimensional Datasets* Materialized View Selection for Multidimensional Datasets* Amit Shukla amit@cs.wisc.edu Prasad M. Deshpande pmd@cs.wisc.edu Computer Sciences Department University of Wisconsin - Madison Madison, WI 53706

More information

ETL and OLAP Systems

ETL and OLAP Systems ETL and OLAP Systems Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Software Development Technologies Master studies, first semester

More information

Indexing OLAP data. Sunita Sarawagi. IBM Almaden Research Center. Abstract

Indexing OLAP data. Sunita Sarawagi. IBM Almaden Research Center. Abstract Indexing OLAP data Sunita Sarawagi IBM Almaden Research Center sunita@almaden.ibm.com Abstract In this paper we discuss indexing methods for On-Line Analytical Processing (OLAP) databases. We start with

More information

Computing Data Cubes Using Massively Parallel Processors

Computing Data Cubes Using Massively Parallel Processors Computing Data Cubes Using Massively Parallel Processors Hongjun Lu Xiaohui Huang Zhixian Li {luhj,huangxia,lizhixia}@iscs.nus.edu.sg Department of Information Systems and Computer Science National University

More information

Project Participants

Project Participants Annual Report for Period:10/2004-10/2005 Submitted on: 06/21/2005 Principal Investigator: Yang, Li. Award ID: 0414857 Organization: Western Michigan Univ Title: Projection and Interactive Exploration of

More information

Data Mining: Concepts and Techniques. Chapter 4

Data Mining: Concepts and Techniques. Chapter 4 Data Mining: Concepts and Techniques Chapter 4 Jiawei Han Department of Computer Science University of Illinois at Urbana-Champaign www.cs.uiuc.edu/~hanj 2006 Jiawei Han and Micheline Kamber, All rights

More information

A Data-Warehouse / OLAP Framework for Scalable Telecommunication Tandem Traffic Analysis

A Data-Warehouse / OLAP Framework for Scalable Telecommunication Tandem Traffic Analysis A Data-Warehouse / OLAP Framework for Scalable Telecommunication Tandem Traffic Analysis Qiming Chen, Meichun Hsu, Umesh Dayal Software Technology Laboratory HP Labs 1501 Page Mill Road, MS 1U4 Palo Alto,

More information

Data Mining: Concepts and Techniques. (3 rd ed.) Chapter 3

Data Mining: Concepts and Techniques. (3 rd ed.) Chapter 3 Data Mining: Concepts and Techniques (3 rd ed.) Chapter 3 Jiawei Han, Micheline Kamber, and Jian Pei University of Illinois at Urbana-Champaign & Simon Fraser University 2011 Han, Kamber & Pei. All rights

More information

A Data-Warehouse / OLAP Framework for Scalable Telecommunication Tandem Traffic Analysis

A Data-Warehouse / OLAP Framework for Scalable Telecommunication Tandem Traffic Analysis A Data-Warehouse / OLAP Framework for Scalable Telecommunication Tandem Traffic Analysis Qiming Chen, Meichun Hsu, Umesh Dayal HP Labs 1501 Page Mill Road, MS 1U4 Palo Alto, California, CA 94303, USA {qchen,mhsu,dayal}@hpl.hp.com

More information

Data cube and OLAP. Selecting Views to Materialize. Aggregation view lattice. Selecting views to materialize. Limitations of static approach.

Data cube and OLAP. Selecting Views to Materialize. Aggregation view lattice. Selecting views to materialize. Limitations of static approach. Data cube and OLAP Selecting Views to Materialize CPS 296.1 Topics in Database Systems Example data cube schema: Sale(store, product, customer, quantity) Store, product, customer are dimension attributes

More information

Loglinear-Based Quasi Cubes

Loglinear-Based Quasi Cubes Journal of Intelligent Information Systems, 16, 255 276, 2001 c 2001 Kluwer Academic Publishers. Manufactured in The Netherlands. Loglinear-Based Quasi Cubes DANIEL BARBARÁ XINTAO WU ISE Department, George

More information

A REVIEW DATA CUBE ANALYSIS METHOD IN BIG DATA ENVIRONMENT

A REVIEW DATA CUBE ANALYSIS METHOD IN BIG DATA ENVIRONMENT A REVIEW DATA CUBE ANALYSIS METHOD IN BIG DATA ENVIRONMENT Dewi Puspa Suhana Ghazali 1, Rohaya Latip 1, 2, Masnida Hussin 1 and Mohd Helmy Abd Wahab 3 1 Department of Communication Technology and Network,

More information

PathStack : A Holistic Path Join Algorithm for Path Query with Not-predicates on XML Data

PathStack : A Holistic Path Join Algorithm for Path Query with Not-predicates on XML Data PathStack : A Holistic Path Join Algorithm for Path Query with Not-predicates on XML Data Enhua Jiao, Tok Wang Ling, Chee-Yong Chan School of Computing, National University of Singapore {jiaoenhu,lingtw,chancy}@comp.nus.edu.sg

More information

Fast Discovery of Sequential Patterns Using Materialized Data Mining Views

Fast Discovery of Sequential Patterns Using Materialized Data Mining Views Fast Discovery of Sequential Patterns Using Materialized Data Mining Views Tadeusz Morzy, Marek Wojciechowski, Maciej Zakrzewicz Poznan University of Technology Institute of Computing Science ul. Piotrowo

More information

FlowCube: Constructing RFID FlowCubes for Multi-Dimensional Analysis of Commodity Flows

FlowCube: Constructing RFID FlowCubes for Multi-Dimensional Analysis of Commodity Flows FlowCube: Constructing RFID FlowCubes for Multi-Dimensional Analysis of Commodity Flows Hector Gonzalez Jiawei Han Xiaolei Li University of Illinois at Urbana-Champaign, IL, USA {hagonzal, hanj, xli10}@uiuc.edu

More information

Discovering the Association Rules in OLAP Data Cube with Daily Downloads of Folklore Materials *

Discovering the Association Rules in OLAP Data Cube with Daily Downloads of Folklore Materials * Discovering the Association Rules in OLAP Data Cube with Daily Downloads of Folklore Materials * Galina Bogdanova, Tsvetanka Georgieva Abstract: Association rules mining is one kind of data mining techniques

More information

MAIDS: Mining Alarming Incidents from Data Streams

MAIDS: Mining Alarming Incidents from Data Streams MAIDS: Mining Alarming Incidents from Data Streams (Demonstration Proposal) Y. Dora Cai David Clutter Greg Pape Jiawei Han Michael Welge Loretta Auvil Automated Learning Group, NCSA, University of Illinois

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 05(b) : 23/10/2012 Data Mining: Concepts and Techniques (3 rd ed.) Chapter

More information

Mining Multi-dimensional Frequent Patterns Without Data Cube Construction*

Mining Multi-dimensional Frequent Patterns Without Data Cube Construction* Mining Multi-dimensional Frequent Patterns Without Data Cube Construction* Chuan Li 1, Changjie Tang 1, Zhonghua Yu 1, Yintian Liu 1, Tianqing Zhang 1, Qihong Liu 1, Mingfang Zhu 1, and Yongguang Jiang

More information

Acknowledgment. MTAT Data Mining. Week 7: Online Analytical Processing and Data Warehouses. Typical Data Analysis Process.

Acknowledgment. MTAT Data Mining. Week 7: Online Analytical Processing and Data Warehouses. Typical Data Analysis Process. MTAT.03.183 Data Mining Week 7: Online Analytical Processing and Data Warehouses Marlon Dumas marlon.dumas ät ut. ee Acknowledgment This slide deck is a mashup of the following publicly available slide

More information

Which Null-Invariant Measure Is Better? Which Null-Invariant Measure Is Better?

Which Null-Invariant Measure Is Better? Which Null-Invariant Measure Is Better? Which Null-Invariant Measure Is Better? D 1 is m,c positively correlated, many null transactions D 2 is m,c positively correlated, little null transactions D 3 is m,c negatively correlated, many null transactions

More information

Binomial Multifractal Curve Fitting for View Size Estimation in OLAP

Binomial Multifractal Curve Fitting for View Size Estimation in OLAP Binomial Multifractal Curve Fitting for View Size Estimation in OLAP Thomas P. NADEAU EECS, University of Michigan Ann Arbor MI, USA nadeau@engin.umich.edu Kanda RUNAPONGSA EECS, University of Michigan

More information

Compression of the Stream Array Data Structure

Compression of the Stream Array Data Structure Compression of the Stream Array Data Structure Radim Bača and Martin Pawlas Department of Computer Science, Technical University of Ostrava Czech Republic {radim.baca,martin.pawlas}@vsb.cz Abstract. In

More information

COMP 465: Data Mining Still More on Clustering

COMP 465: Data Mining Still More on Clustering 3/4/015 Exercise COMP 465: Data Mining Still More on Clustering Slides Adapted From : Jiawei Han, Micheline Kamber & Jian Pei Data Mining: Concepts and Techniques, 3 rd ed. Describe each of the following

More information

Effectiveness of Freq Pat Mining

Effectiveness of Freq Pat Mining Effectiveness of Freq Pat Mining Too many patterns! A pattern a 1 a 2 a n contains 2 n -1 subpatterns Understanding many patterns is difficult or even impossible for human users Non-focused mining A manager

More information

CS 1655 / Spring 2013! Secure Data Management and Web Applications

CS 1655 / Spring 2013! Secure Data Management and Web Applications CS 1655 / Spring 2013 Secure Data Management and Web Applications 03 Data Warehousing Alexandros Labrinidis University of Pittsburgh What is a Data Warehouse A data warehouse: archives information gathered

More information