StreamOLAP. Salman Ahmed SHAIKH. Cost-based Optimization of Stream OLAP. DBSJ Japanese Journal Vol. 14-J, Article No.
StreamOLAP: Cost-based Optimization of Stream OLAP

Kosuke NAKABASAMI (nakabasami@kde.cs.tsukuba.ac.jp), Hiroyuki KITAGAWA (kitagawa@cs.tsukuba.ac.jp), Salman Ahmed SHAIKH (salman@kde.cs.tsukuba.ac.jp), Toshiyuki AMAGASA (amagasa@cs.tsukuba.ac.jp)

Vol. 14-J, Article No. 3

Abstract

Due to the increase in stream data sources, many Stream Processing Engines (SPEs) have been developed and deployed. Beyond simple data filtering, demand is growing for more sophisticated analysis of streams, such as multi-level aggregation and summarization. OLAP is one of the common methods for systematic analysis of static data and has been studied intensively. Some existing works have proposed applying OLAP to stream data. However, they lack some required features and do not consider the use of SPEs for stream OLAP. Moreover, cost-based analysis to improve the efficiency of stream OLAP has not been studied before. In this paper we present a stream OLAP architecture consisting of an SPE and an OLAP engine as a general framework for stream OLAP. We then propose a cost-based optimization scheme. To the best of our knowledge, this is the first proposal for cost-based optimization of stream OLAP. We have also implemented a runnable prototype system and evaluated the proposed optimization scheme. The experimental results demonstrate the effectiveness of the proposal.

1. Introduction

OLAP [1] is a common method for analyzing data from multiple perspectives. The aggregations that OLAP can compute over a set of dimension hierarchies form a lattice. For stream data, stream processing engines (SPEs) such as STREAM [2], Storm [3], S4 [4], and Borealis [5] have been developed. Queries are registered with an SPE in advance, for example in CQL [6], an SQL-like continuous query language. For OLAP over streams, Jiawei Han et al. proposed Stream Cube [7], but it does not consider the use of an SPE.

This paper proposes StreamOLAP, which combines an SPE with an OLAP engine. In StreamOLAP, some vertices of the lattice are computed continuously by CQL queries registered with the SPE. The remaining vertices are computed by the OLAP engine when the user asks for them (On-demand Query), using the results received from the SPE.
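[Figure 1]
[Figure 2: Drill-down and roll-up]

As an example, take the TPC-H tables customer, part, supplier, and lineitem, with customer, part, and supplier as dimensions and the sales recorded in lineitem as the measure. A roll-up on customer moves from individual customers to the customer's nation. A drill-down on supplier moves in the opposite direction, toward finer-grained supplier values.

2.2 Stream Processing Engine (SPE)

A stream processing engine (SPE) evaluates continuous queries over stream data. Examples are STREAM [2], Storm [3], S4 [4], and Borealis [5]. Queries for an SPE can be written in a continuous query language such as CQL [6]. StreamOLAP uses an SPE for the stream-side part of OLAP processing. For example, the following query, registered with an SPE, computes the total sales for each part and customer nation over the most recent hour:

    SELECT part_, customer_nation, SUM(sales)
    FROM lineitem [RANGE 1 hour]
    GROUP BY part_, customer_nation

3. Related Work

Many studies address the efficiency of OLAP over static data. Aouiche et al. proposed a method for selecting materialized views [8]. Talebi et al. studied the selection of views and indexes for OLAP [9]. Santos et al. proposed an optimization method for OLAP based on partitioning [10]. Harinarayan et al. studied the OLAP lattice [11], and Joslyn et al. studied view discovery over the lattice [12].

For OLAP over streams, Jiawei Han et al. proposed Stream Cube [7]. Stream Cube, however, does not consider the use of an SPE. Rui Zhang et al. studied multiple aggregations over IP streams on the SPE Gigascope [13]. None of these works optimizes the processing of an OLAP lattice shared between an SPE and an OLAP engine.

4. StreamOLAP

4.1 OLAP

StreamOLAP organizes the aggregations it can compute as a lattice. Consider the dimensions part, supplier, and time, whose hierarchies are shown in Figure 3(a), 3(b), and 3(c). The finest aggregation, (p, s, minute), is the root of the lattice, and coarser aggregations such as (p, s, hour) can be derived from it.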
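As a rough illustration of how the lattice described above arises from the dimension hierarchies, the following Python sketch picks one level per dimension and checks which vertices can be derived from which. The `ALL` level name, the `HIERARCHIES` dictionary, and the helper functions are assumptions made for this example and are not part of StreamOLAP.

```python
from itertools import product

# Levels of each dimension, finest first. "ALL" is a hypothetical label
# meaning that the dimension is aggregated away entirely.
HIERARCHIES = {
    "part": ["p", "ALL"],
    "supplier": ["s", "ALL"],
    "time": ["minute", "hour"],
}


def build_lattice(hierarchies):
    """Return every combination of one level per dimension."""
    dims = list(hierarchies)
    combos = product(*(hierarchies[d] for d in dims))
    return [dict(zip(dims, levels)) for levels in combos]


def label(vertex):
    """Format a vertex in the paper's notation, e.g. (p, s, minute)."""
    return "(" + ", ".join(v for v in vertex.values() if v != "ALL") + ")"


def derivable_from(coarse, fine, hierarchies):
    """True if `coarse` can be computed by aggregating the result of `fine`."""
    return all(
        hierarchies[d].index(coarse[d]) >= hierarchies[d].index(fine[d])
        for d in hierarchies
    )


lattice = build_lattice(HIERARCHIES)
by_label = {label(v): v for v in lattice}
root = by_label["(p, s, minute)"]

for name, vertex in by_label.items():
    parents = [n for n, v in by_label.items()
               if n != name and derivable_from(vertex, v, HIERARCHIES)]
    print(name, "can be derived from", parents)
```

The sketch yields the eight vertices of the part, supplier, and time lattice, and every vertex is derivable from the root (p, s, minute).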
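[Figure 3: Dimension hierarchies of (a) part, (b) supplier, and (c) time (minute, hour), and (d) their lattice, whose vertices are (p_, s_, minute), (p_, minute), (s_, minute), (minute), (p_, s_, hour), (p_, hour), (s_, hour), and (hour)]

In the same way, (p, hour) can be derived from (p, minute). Figure 3(d) shows the resulting lattice. Each vertex of the lattice corresponds to one OLAP aggregation, called an OLAP query below. In the lattice of Figure 3(d), for example, the result of (p, minute) can be computed from the result of (p, s, minute).

Because stream data is unbounded, each OLAP query has an Interval of Interest (IoI): the most recent span of time over which its results are analyzed. Results older than the IoI are not kept.

Figure 4 shows the architecture of StreamOLAP. It consists of an SPE and an OLAP engine. The SPE continuously aggregates the input stream, and the OLAP engine stores the SPE's results for the duration of the IoI and answers OLAP queries from them. Queries are registered with the SPE in CQL [6]. In Figure 4 the input stream has the attributes (part, supplier, time, sales), and the lattice is that of Figure 3. The CQL query in Figure 4 computes the aggregation (s, minute), that is, the sum of sales for each supplier and minute. The SPE evaluates the query over a window. In Figure 4 the window is 1 minute wide, so the result of (s, minute) is passed to the OLAP engine once per minute. RSTREAM [6] outputs the aggregated contents of the window as a stream.

StreamOLAP processes OLAP queries in two ways:

a) Registered Query
b) On-demand Query

Registered Query: A Registered Query is an OLAP query registered with the SPE as a CQL query. The SPE evaluates it continuously and sends the results to the OLAP engine, which keeps them for the IoI and deletes results that fall outside it. In Figure 4, (s, minute) is a Registered Query. With an IoI of 1 hour, for example, the OLAP engine holds the results for 9:00 through 9:59; when the result for 10:00 arrives, the result for 9:00 falls outside the IoI and is deleted.

On-demand Query: The OLAP queries in the lattice that are not Registered Queries have no CQL query in the SPE and are processed by the OLAP engine.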
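The two query types can be mimicked in a few lines of Python. This is only a sketch under stated assumptions: the three input tuples are the ones shown in Figure 4, while the function names, the dictionary used as a result buffer, and the string-based handling of times are hypothetical choices for this example and do not reflect how the SPE or the OLAP engine is implemented.

```python
from collections import defaultdict

# Stream tuples (part, supplier, time, sales), as shown in Figure 4.
STREAM = [
    ("a", "A", "10:00:08", 100),
    ("c", "B", "10:00:25", 500),
    ("d", "A", "10:00:47", 300),
]


def registered_s_minute(tuples):
    """Sum sales per (supplier, minute), like the registered (s, minute) query."""
    totals = defaultdict(int)
    for _part, supplier, time, sales in tuples:
        minute = time[:5]  # "10:00:08" -> "10:00"
        totals[(supplier, minute)] += sales
    return dict(totals)


def evict_outside_ioi(results, oldest_minute):
    """Keep only results at or after `oldest_minute`, i.e. inside the IoI."""
    return {key: v for key, v in results.items() if key[1] >= oldest_minute}


def on_demand_s_hour(minute_results):
    """Roll stored (supplier, minute) results up to (supplier, hour)."""
    totals = defaultdict(int)
    for (supplier, minute), sales in minute_results.items():
        totals[(supplier, minute[:2])] += sales  # "10:00" -> "10"
    return dict(totals)


buffer = registered_s_minute(STREAM)
print(buffer)                    # per-minute results kept by the OLAP engine
print(on_demand_s_hour(buffer))  # (s, hour) computed only when requested
```

For the three tuples above, the per-minute result is 400 for supplier A and 500 for supplier B at 10:00, matching the aggregation output shown in Figure 4.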
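[Figure 4: Stream OLAP. Stream data with the attributes (part, supplier, time, sales), for example (a, A, 10:00:08, 100), (c, B, 10:00:25, 500), and (d, A, 10:00:47, 300), enters the Stream Processing Engine. The SPE evaluates the registered query over a 1-minute window and outputs (B, 10:00, 500) and (A, 10:00, 400). The OLAP engine keeps these results in a data buffer covering 1 hour, preserving new results and deleting old ones, and aggregates them to answer the user's on-demand query. Of the lattice vertices shown, (s_, minute) is marked as the registered query.]

The registered query shown in Figure 4 is:

    RSTREAM [1 minute] (
      SELECT s_, minute, SUM(sales)
      FROM stream [RANGE 1 minute]
      GROUP BY s_, minute
    )

When the user issues an On-demand Query, the OLAP engine computes it from the stored result of a Registered Query that is its ancestor in the lattice, that is, a finer aggregation. In Figure 4, (s, hour) is an On-demand Query and is computed from the Registered Query (s, minute). The IoIs in the lattice must be compatible: if the IoI at the minute level is 30 minutes and the IoI at the hour level is 1 hour, (s, hour) cannot be computed from (s, minute). Which OLAP queries become Registered Queries is a choice; in the lattice of Figure 4, for example, (p, s, minute) and (p, s, hour) could both be Registered Queries. This choice determines the processing cost.

Let $q_i$ be an OLAP query in the lattice and $p_i$ the probability that $q_i$ is requested. The cost of $q_i$ depends on whether it is a Registered Query or an On-demand Query.

Registered Query: Let $C_R$ be the cost for the OLAP engine to process one tuple of a Registered Query result. Let $w_i$ be the window size of the CQL query for $q_i$ and $k_i$ the number of tuples it outputs per window. The cost of a Registered Query $q_i$ is

$$C_R \frac{k_i}{w_i} \tag{1}$$

On-demand Query: An On-demand Query $q_i$ is computed from the result of a Registered Query $q_{r_i}$. Let $C_O$ be the cost to process one tuple of that result, and let $I_{r_i}$ be the IoI of $q_{r_i}$. The cost of computing $q_i$ once is

$$C_O \frac{I_{r_i} k_{r_i}}{w_{r_i}}$$

where $w_{r_i}$ and $k_{r_i}$ are the window size and the number of tuples per window of the CQL query for $q_{r_i}$. The On-demand Query $q_i$ is requested with probability $p_i$.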
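Taking the probability into account, the cost of the On-demand Query $q_i$ is

$$C_O \frac{I_{r_i} k_{r_i} p_i}{w_{r_i}} \tag{2}$$

Here $q_{r_i}$ is the Registered Query from which $q_i$ is computed. With $n$ Registered Queries and $m$ On-demand Queries, the total cost $C$ of StreamOLAP is

$$C = C_R \sum_{i=0}^{n} \frac{k_i}{w_i} + C_O \sum_{i=0}^{m} \frac{I_{r_i} k_{r_i} p_i}{w_{r_i}} \tag{3}$$

The OLAP engine must also store the results it receives from the SPE. Let $s_i$ be the size of one result tuple of Registered Query $q_i$. The total storage $S$ is

$$S = \sum_{i=0}^{n} \frac{I_i k_i s_i}{w_i} \tag{4}$$

5. Optimization

Each OLAP query in the lattice must be assigned to be either a Registered Query or an On-demand Query. Let $Q$ be the set of vertices of the lattice and $r$ its root. An On-demand Query can only be computed from a Registered Query, so the root $r$ is always a Registered Query. The algorithm takes the lattice as input and outputs the set $Q_{out}$ of OLAP queries to register. It maintains the set $Q_r$ of Registered Queries and the set $Q_o$ of On-demand Queries. Initially $Q_r$ contains only the root $r$, and $Q_o$ contains all other vertices.

In each iteration the algorithm first computes the current cost $C_{pre}$ = processcost($Q_r$, $Q_o$) by Equation (3). For each vertex $q$ in $Q_o$ it then computes the cost $C$ that results when $q$ is moved to the Registered Queries, and the benefit $C_{pre} - C$. The vertex of $Q_o$ with the largest benefit is added to $Q_r$, and the vertex and its cost are appended to the sequence minSeqC. The storage $S$ = storagecost($Q_r$) is then computed by Equation (4). The iteration continues as long as the storage of the Registered Queries stays within the limit $S_{max}$. Finally, the vertices of minSeqC up to the one with the minimum cost $C$, together with the root $r$, are output as $Q_{out}$. Algorithm 1 shows the procedure.

Algorithm 1: Optimization Algorithm
Input: a set Q of vertices of the lattice, the root vertex r of the lattice (r ∈ Q)
Output: a set Q_out of vertices (Q_out ⊆ Q)
begin
    minSeqC ← newarray()
    S ← 0, Q_r ← {r}, Q_o ← Q \ Q_r
    while S ≤ S_max do
        C_pre ← processcost(Q_r, Q_o)
        Benefit_max ← −∞, C_min ← +∞, q_min ← null
        for q ∈ Q_o do
            Q'_r ← Q_r ∪ {q}, Q'_o ← Q \ Q'_r
            C ← processcost(Q'_r, Q'_o)
            Benefit ← C_pre − C
            if Benefit > Benefit_max then
                Benefit_max ← Benefit
                C_min ← C, q_min ← q
        minSeqC.add((q_min, C_min))
        Q_r ← Q_r ∪ {q_min}, S ← storagecost(Q_r)
    mincost ← minimum cost in minSeqC, Q_out ← {r}
    for i = 0 to minSeqC.length() do
        if minSeqC[i].c = mincost then
            Q_out ← Q_out ∪ {minSeqC[i].d}
            break
        Q_out ← Q_out ∪ {minSeqC[i].d}

Example: Consider the stream of Figure 4 with the four attributes part, supplier, time, and sales, where part, supplier, and time are the dimensions and sales is the measure, and the lattice of Figure 3. Table 1 lists the parameters of each OLAP query.

[Table 1: Parameters of each OLAP query. Columns: Aggregate Dimension, $w_i$, $k_i$, $p_i$, $s_i$. Rows: (p, s, minute), (p, minute), (s, minute), (minute), (p, s, hour), (p, hour), (s, hour), (hour)]

The IoI $I$ of every OLAP query is 3600, and $S_{max}$ is 1,000,000,000. Initially only the root (p, s, minute) of the lattice is a Registered Query, and all other OLAP queries are in $Q_o$.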
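Consider first the cost $C$ when (p, minute) is made a Registered Query, assuming $C_R = 2C_O$. The On-demand Queries (s, minute), (p, s, hour), and (s, hour) are computed from the Registered Query (p, s, minute), and the remaining On-demand Queries are computed from the Registered Query (p, minute). Substituting the values of Table 1 into Equation (3) gives the cost $C$. Computing $C$ in the same way for each candidate On-demand Query gives Table 2.

[Table 2: Cost $C$ in the first iteration when each OLAP query is added as a Registered Query. Rows: (p, minute), (s, minute), (minute), (p, s, hour), (p, hour), (s, hour), (hour). Column: $C$]

The cost $C$ is smallest for (p, s, hour), so (p, s, hour) becomes a Registered Query in addition to the root. Repeating the procedure selects (s, minute) and then (p, minute) as Registered Queries. When (p, minute) is made a Registered Query, however, $S$ becomes 1,235,280,000 and exceeds $S_{max}$, so the selection stops. As Registered Queries are added in the order (p, s, minute), (p, s, hour), (s, minute), the cost is 35,646,000, 15,396,083, and 6,401,083, respectively. These three OLAP queries are therefore chosen as Registered Queries.

6. Experiments

We implemented a prototype of StreamOLAP with the architecture described in Section 4.2, consisting of an SPE and an OLAP engine. The SPE is uCosminexus Stream Data Platform [14], and the OLAP engine is implemented in Java. The OLAP engine stores the results of Registered Queries and processes On-demand Queries.

[Figure 5: Schema of the data set.
CUSTOMER (custkey, nationkey, nation, regionkey, region)
TIME (time, minute, hour)
LINEORDER (orderkey, custkey, partkey, suppkey, linenumber, datetime, extendedprice)
PART (partkey, mfgr, brand, type)
SUPPLIER (suppkey, nationkey, nation, regionkey, region)]

[Figure 6: Dimension hierarchies. (a) customer: nation, region. (b) part: mfgr, brand, type. (c) supplier: nation, region. (d) time: minute, hour]

The data set is based on TPC-H [15]. Following Patrick O'Neil et al. [16], the TPC-H lineitem and order tables are combined into a single lineorder table. The lineorder data (the extracted text gives the figure 6018 here) is fed to the SPE as a stream. Figure 5 shows the schema and Figure 6 the dimension hierarchies.

The proposed method is compared with the following two methods for choosing Registered Queries and On-demand Queries:

Frequency-based: Registered Queries are chosen according to the request probability $p_i$ of each OLAP query $q_i$.
Random: Registered Queries are chosen at random, regardless of the $p_i$ of each OLAP query $q_i$.

The following six patterns of request probabilities are used:

Rand: the $p_i$ of each OLAP query $q_i$ is chosen at random from [0, 1].
AllHigh: the $p_i$ of each OLAP query is chosen from [0.8, 1].
AllLow: the $p_i$ of each OLAP query is chosen from [0, 0.2].
CoarseHigh: $p_i = 1$ for the coarsest OLAP query, and lower for OLAP queries farther from it in the lattice.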
FineHigh: $p_i = 1$ for the root, and lower for OLAP queries farther from the root.
OneDimHigh: $p_i$ depends on the levels of all dimensions except one. For the three-dimensional lattice of Figure 3, for example, $p_i = 1.0$ for (p, s, minute) and (p, s, hour), $p_i = 0.8$ for (p, minute), (s, minute), (p, hour), and (s, hour), and $p_i = 0.6$ for (minute) and (hour). In the experiments this pattern is defined with respect to the customer dimension.

Two settings are used for the IoI $I$:

I (1): $I = 60$ for every OLAP query.
I (2): $I = 60$ for some OLAP queries and $I = 240$ for the others.

The per-tuple costs of Registered Queries and On-demand Queries, $C_R$ and $C_O$, are set so that $C_R = 5C_O$.

The experimental environment is as follows:

OS: CentOS 6.5 (Linux el6.x86_64)
Memory: 31.1GB
CPU: Intel (R) Core (TM) i : 3.40GHz

The lineorder tuples are input to StreamOLAP as a stream, and $S_{max}$ is 1GB.

[Figure 7: Processing time (s) of Proposed, Frequency_based, and Random for I (1)]
[Figure 8: Processing time (s) of Proposed, Frequency_based, and Random for I (2)]

Figure 7 shows the results for I (1) and Figure 8 the results for I (2). The x-axis shows the six probability patterns, and the y-axis shows the processing time of Registered Queries and On-demand Queries. Overall, Proposed requires less processing time than Frequency_based and Random.

7. Conclusion

This paper presented StreamOLAP, an architecture that combines an SPE and an OLAP engine for OLAP over stream data, together with a cost-based optimization scheme that decides which OLAP queries to register with the SPE. Experiments with a prototype system showed the effectiveness of the scheme.

References

[1] S. Chaudhuri et al., An Overview of Data Warehousing and OLAP Technology, ACM SIGMOD Record, 26 (1), pages 65-74, March.
[2] A. Arasu et al., STREAM: The Stanford Data Stream Management System, Technical Report, Department of Computer Science, Stanford University.
[3] Storm.
[4] L. Neumeyer et al., S4: Distributed Stream Computing Platform, In Data Mining Workshops (ICDMW), 2010 IEEE conference on, 13 Dec.
[5] D. J. Abadi et al., The Design of the Borealis Stream Processing Engine, In Second Biennial Conference on Innovative Data Systems Research (CIDR 2005), Asilomar, CA, Jan.
[6] A. Arasu et al., The CQL continuous query language: semantic foundations and query execution, The VLDB Journal, 15 (2).
[7] J. Han et al., Stream Cube: An Architecture for Multi-Dimensional Analysis of Data Streams, Distributed and Parallel Databases, 18 (2), Sep.
[8] K. Aouiche et al., Data Mining-based Materialized View and Index Selection in Data Warehouses, Journal of Intelligent Information Systems, Vol. 33, Issue 1, pages 65-93, August.
[9] Z. A. Talebi et al., Exact and Inexact Methods for Selecting Views and Indexes for OLAP Performance Improvement, Proc. 11th International Conference on Extending Database Technology (EDBT 2008), Nantes, France, March.
[10] R. J. Santos et al., PIN: A Partitioning & Indexing Optimization Method for OLAP, Proc. 9th International Conference on Enterprise Information Systems (ICEIS 2007), Funchal, Madeira, Portugal, June.
[11] V. Harinarayan et al., Implementing Data Cubes Efficiently, Proc. ACM-SIGMOD Int. Conf. Management of Data (SIGMOD 1996), ACM, Montreal, Canada, June.
[12] C. Joslyn et al., View Discovery in OLAP Databases through Statistical Combinatorial Optimization, Scientific and Statistical Database Management, Lecture Notes in Computer Science, Vol. 5566, 2009.
[13] R. Zhang et al., Multiple Aggregations Over Data Streams, Proc. ACM-SIGMOD Int. Conf. Management of Data (SIGMOD 2005), ACM, Baltimore, Maryland, USA, June 14-16.
[14] uCosminexus Stream Data Platform, sdp/.
[15] TPC-H.
[16] P. O'Neil et al., The Star Schema Benchmark and Augmented Fact Table Indexing, First TPC Technology Conference (TPCTC 2009), Lyon, France, Aug.
[17] J. Han et al., Efficient Computation of Iceberg Cubes with Complex Measures, In Proc. ACM-SIGMOD Int. Conf. Management of Data (SIGMOD 01), pages 1-12, Santa Barbara, CA, May.

Authors

Kosuke NAKABASAMI: research on OLAP.
Hiroyuki KITAGAWA: member of ACM and IEEE.
Salman Ahmed SHAIKH: Mehran University of Engineering and Technology (2005); Communication Systems and Networks (2008); 2014. Member of DBSJ, the International Association of Computer Science and Information Technology (IACSIT), and the Pakistan Engineering Council (PEC).
Toshiyuki AMAGASA: research related to the Web. Member of ACM and IEEE.
Trajectory Data Warehouses: Proposal of Design and Application to Exploit Data Fernando J. Braz 1 1 Department of Computer Science Ca Foscari University - Venice - Italy fbraz@dsi.unive.it Abstract. In
More informationCoarse Grained Parallel On-Line Analytical Processing (OLAP) for Data Mining
Coarse Grained Parallel On-Line Analytical Processing (OLAP) for Data Mining Frank Dehne 1,ToddEavis 2, and Andrew Rau-Chaplin 2 1 Carleton University, Ottawa, Canada, frank@dehne.net, WWW home page: http://www.dehne.net
More informationInformation Integration
Chapter 11 Information Integration While there are many directions in which modern database systems are evolving, a large family of new applications fall undei the general heading of information integration.
More informationPerformance Analysis of Apriori Algorithm with Progressive Approach for Mining Data
Performance Analysis of Apriori Algorithm with Progressive Approach for Mining Data Shilpa Department of Computer Science & Engineering Haryana College of Technology & Management, Kaithal, Haryana, India
More informationData mining - detailed outline. Carnegie Mellon Univ. Dept. of Computer Science /615 DB Applications. Problem.
Faloutsos & Pavlo 15415/615 Carnegie Mellon Univ. Dept. of Computer Science 15415/615 DB Applications Data Warehousing / Data Mining (R&G, ch 25 and 26) C. Faloutsos and A. Pavlo Data mining detailed outline
More informationOn-Line Application Processing
On-Line Application Processing WAREHOUSING DATA CUBES DATA MINING 1 Overview Traditional database systems are tuned to many, small, simple queries. Some new applications use fewer, more time-consuming,
More informationABSTRACT. GUPTA, SHALU View Selection for Query-Evaluation Efficiency using Materialized
ABSTRACT GUPTA, SHALU View Selection for Query-Evaluation Efficiency using Materialized Views (Under the direction of Dr. Rada Chirkova) The purpose of this research is to show the use of derived data
More informationAdvanced Data Management Technologies
ADMT 2017/18 Unit 10 J. Gamper 1/37 Advanced Data Management Technologies Unit 10 SQL GROUP BY Extensions J. Gamper Free University of Bozen-Bolzano Faculty of Computer Science IDSE Acknowledgements: I
More informationExam Datawarehousing INFOH419 July 2013
Exam Datawarehousing INFOH419 July 2013 Lecturer: Toon Calders Student name:... The exam is open book, so all books and notes can be used. The use of a basic calculator is allowed. The use of a laptop
More informationThe Near Greedy Algorithm for Views Selection in Data Warehouses and Its Performance Guarantees
The Near Greedy Algorithm for Views Selection in Data Warehouses and Its Performance Guarantees Omar H. Karam Faculty of Informatics and Computer Science, The British University in Egypt and Faculty of
More informationDATA CUBE : A RELATIONAL AGGREGATION OPERATOR GENERALIZING GROUP-BY, CROSS-TAB AND SUB-TOTALS SNEHA REDDY BEZAWADA CMPT 843
DATA CUBE : A RELATIONAL AGGREGATION OPERATOR GENERALIZING GROUP-BY, CROSS-TAB AND SUB-TOTALS SNEHA REDDY BEZAWADA CMPT 843 WHAT IS A DATA CUBE? The Data Cube or Cube operator produces N-dimensional answers
More informationAdvanced Data Management Technologies
ADMT 2017/18 Unit 13 J. Gamper 1/42 Advanced Data Management Technologies Unit 13 DW Pre-aggregation and View Maintenance J. Gamper Free University of Bozen-Bolzano Faculty of Computer Science IDSE Acknowledgements:
More informationEvolving To The Big Data Warehouse
Evolving To The Big Data Warehouse Kevin Lancaster 1 Copyright Director, 2012, Oracle and/or its Engineered affiliates. All rights Insert Systems, Information Protection Policy Oracle Classification from
More informationA Simple and Efficient Method for Computing Data Cubes
A Simple and Efficient Method for Computing Data Cubes Viet Phan-Luong Université Aix-Marseille LIF - UMR CNRS 6166 Marseille, France Email: viet.phanluong@lif.univ-mrs.fr Abstract Based on a construction
More informationEfficient Keyword Search over Relational Data Streams
DEIM Forum 2016 A3-4 Abstract Efficient Keyword Search over Relational Data Streams Savong BOU, Toshiyuki AMAGASA, and Hiroyuki KITAGAWA Graduate School of Systems and Information Engineering, University
More informationFMC: An Approach for Privacy Preserving OLAP
FMC: An Approach for Privacy Preserving OLAP Ming Hua, Shouzhi Zhang, Wei Wang, Haofeng Zhou, Baile Shi Fudan University, China {minghua, shouzhi_zhang, weiwang, haofzhou, bshi}@fudan.edu.cn Abstract.
More informationImproving the Performance of OLAP Queries Using Families of Statistics Trees
To appear in Proceedings of the 3rd International Conference on Data Warehousing and Knowledge Discovery (DaWaK 2001), Technical University of München, München, Germany, September 2001. Improving the Performance
More informationFast Computation on Processing Data Warehousing Queries on GPU Devices
University of South Florida Scholar Commons Graduate Theses and Dissertations Graduate School 6-29-2016 Fast Computation on Processing Data Warehousing Queries on GPU Devices Sam Cyrus University of South
More informationTeradata Aggregate Designer
Data Warehousing Teradata Aggregate Designer By: Sam Tawfik Product Marketing Manager Teradata Corporation Table of Contents Executive Summary 2 Introduction 3 Problem Statement 3 Implications of MOLAP
More informationManagement Information Systems MANAGING THE DIGITAL FIRM, 12 TH EDITION FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT
MANAGING THE DIGITAL FIRM, 12 TH EDITION Chapter 6 FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT VIDEO CASES Case 1: Maruti Suzuki Business Intelligence and Enterprise Databases
More informationHow Achaeans Would Construct Columns in Troy. Alekh Jindal, Felix Martin Schuhknecht, Jens Dittrich, Karen Khachatryan, Alexander Bunte
How Achaeans Would Construct Columns in Troy Alekh Jindal, Felix Martin Schuhknecht, Jens Dittrich, Karen Khachatryan, Alexander Bunte Number of Visas Received 1 0,75 0,5 0,25 0 Alekh Jens Health Level
More informationData Warehousing and Data Mining SQL OLAP Operations
Data Warehousing and Data Mining SQL OLAP Operations SQL table expression query specification query expression SQL OLAP GROUP BY extensions: rollup, cube, grouping sets Acknowledgements: I am indebted
More informationSemantic Event Correlation Using Ontologies
Semantic Event Correlation Using Ontologies Thomas Moser 1, Heinz Roth 2, Szabolcs Rozsnyai 3, Richard Mordinyi 1, and Stefan Biffl 1 1 Complex Systems Design & Engineering Lab, Vienna University of Technology
More informationData Warehouse and Data Mining
Data Warehouse and Data Mining Lecture No. 02 Lifecycle of Data warehouse Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology Jamshoro
More informationChapter 4, Data Warehouse and OLAP Operations
CSI 4352, Introduction to Data Mining Chapter 4, Data Warehouse and OLAP Operations Young-Rae Cho Associate Professor Department of Computer Science Baylor University CSI 4352, Introduction to Data Mining
More information