Data Warehouses for Decision Support
- Kristopher Toby McCormick
1 Data Warehouses for Decision Support
Vera Goebel
Department of Informatics, University of Oslo
2 What and Why of Data Warehousing
[Figure: source databases, datastores and data systems feed a Data Extraction and Load step into the Data Warehouse System (datacubes); DSS application workstations send queries to the warehouse.]
What: A very large database containing materialized views of multiple, independent source databases. The views generally contain aggregated data (aka datacubes).
Why: The data warehouse (DW) supports read-only queries for new applications, e.g., DSS, OLAP and data mining.
3 Data Warehouse (DW) Life Cycle
The life cycle:
- Global schema definition
- Data extraction and load
- Query processing
- Data update
General problems:
- Heavy user demand
- Problems with source data: ownership, format, heterogeneity
- Underestimating complexity and resources for all phases
Example: Boeing Computing Services, DW for DSS in airplane repair
- DW size: 2-3 terabytes
- Online query services: 24x7 service
- Data life cycle: retain data for 70+ years (until the airplane is retired)
- Data update: no "nighttime"; concurrent refresh is required
- Access paths: support new and old methods for 70+ years
4 Global Schema Design
Base tables:
Fact table
- Stores basic facts from the source databases (often denormalized)
- Data about past events (e.g., sales, deliveries, factory outputs, ...)
- Facts have a time (or time period) associated with them
- Data is very unlikely to change at a data source; no updates
- Very large tables (up to 1 TB)
Dimension table
- Attributes of one dimension of a fact table (typically denormalized)
- A chain of dimension tables can describe attributes of other dimension tables (normalized or denormalized)
- Data can change at a data source; updates are executed occasionally
[Figure: example schema. Fact table attributes as listed: ProductID, SupplierID, PurchaseDate, DeliveryDate, CustYrs. Dimension tables: (ProductID, ProdName, ProdDesc, ProdStyle, ManufSite) and (SupplierID, SuppName, SuppAddr, SuppPhone, Date1stOrder). Further attribute labels in the figure: TimeID, Quarter, Year, AuditName, AuditID, AuditComp, Addr, AcctName, Phone, ContractYr.]
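The fact/dimension split can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names (purchase_fact, product_dim, supplier_dim), the Amount measure and all rows are assumptions, while the column names follow the slide's example schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_dim (ProductID TEXT PRIMARY KEY, ProdName TEXT, ProdStyle TEXT);
CREATE TABLE supplier_dim (SupplierID TEXT PRIMARY KEY, SuppName TEXT);
-- Fact table: one row per past purchase event, keyed by the dimensions.
CREATE TABLE purchase_fact (
    ProductID    TEXT REFERENCES product_dim(ProductID),
    SupplierID   TEXT REFERENCES supplier_dim(SupplierID),
    PurchaseDate TEXT,
    Amount       REAL  -- hypothetical measure, not named on the slide
);
""")
conn.executemany("INSERT INTO product_dim VALUES (?, ?, ?)",
                 [("P11", "ATMCard", "Net"), ("P14", "MPEGCard", "Video")])
conn.executemany("INSERT INTO supplier_dim VALUES (?, ?)",
                 [("S1", "Acme"), ("S2", "Bolt")])
conn.executemany("INSERT INTO purchase_fact VALUES (?, ?, ?, ?)", [
    ("P11", "S1", "2001-01-15", 100.0),
    ("P11", "S2", "2001-02-01", 50.0),
    ("P14", "S1", "2001-02-10", 25.0),
    ("P11", "S1", "2001-03-05", 10.0),
])

# A typical decision-support query: join the fact table to its dimensions
# and aggregate the measure by descriptive attributes.
rows = conn.execute("""
    SELECT p.ProdStyle, s.SuppName, SUM(f.Amount)
    FROM purchase_fact f
    JOIN product_dim  p ON p.ProductID  = f.ProductID
    JOIN supplier_dim s ON s.SupplierID = f.SupplierID
    GROUP BY p.ProdStyle, s.SuppName
    ORDER BY p.ProdStyle, s.SuppName
""").fetchall()
print(rows)
```

Facts are only ever inserted here, while dimension rows could be updated in place, which mirrors the update behaviour described for the two table kinds.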
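5 Schema Design Patterns
(F, F1, F2 are fact tables; D1, D2, D3 are dimension tables.)
- Star schema: D1, D2, D3 are denormalized
- Starflake schema: dimension tables may be normalized or denormalized (in the figure, D3 is split into D3.1 and D3.2)
- Snowflake schema: D1, D2, D3 are normalized (in the figure: D1.1, D1.2, D2.1, D2.2, D3.1, D3.2)
- Constellation schema: D1 stores attributes about a relationship between F1 and F2
[Figure: one diagram per schema pattern.]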
6 Summary Tables
- aka datacubes or multidimensional tables
- Store precomputed query results for likely queries
  - Reduce on-the-fly join operations
  - Reduce on-the-fly aggregation functions, e.g., sum, avg
- Store denormalized data
- Aggregate data from one or more fact tables and/or one or more dimension tables
- Discard and compute new summaries as the set of likely queries changes
[Figure: fact tables and dimension tables (Dim Table#1, Dim Table#2) feeding a summary table.]
7 Summary Tables = Datacubes
Total Expenses for Parts by Product, Supplier and Quarter
[Figure: datacube with axes Supplier (All-S, S1, S2), Fiscal Quarter (Q1, Q2, Q3, Q4, All-Q) and Product (P11, P14, P19, P27, P33, All-P). Cell values visible in the figure: 1.2M, 1.0M, 0.2M, 0.4M, 0.6M, 1.0M, 2.2M, 2.7M, 4.6M.]
Example cells and the grouping that produces them:
- Average price paid to all suppliers of parts for Product P11 in the 1st quarter: GROUP BY product, quarter
- Total expenses paid to all suppliers of parts for Product P19 in the 1st quarter: GROUP BY product, quarter
- Total expenses paid to Supplier S1 for parts for Product P33 in the 2nd quarter: GROUP BY supplier, product, quarter
- Total expenses paid to Supplier S2 for parts for all products in all quarters: GROUP BY supplier
Typical pre-computed measures are: sum, percentage, average, standard deviation, count, min-value, max-value, percentile
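As a rough sketch of how such a cube could be filled, the following computes a sum measure for every GROUP BY combination of the three dimensions, using "ALL" for a rolled-up dimension. The fact rows and the function name build_cube are made up for illustration; they are not the figure's data.

```python
from collections import defaultdict
from itertools import combinations

DIMS = ("product", "supplier", "quarter")

# Hypothetical fact rows (expenses in arbitrary units).
facts = [
    {"product": "P11", "supplier": "S1", "quarter": "Q1", "expense": 3},
    {"product": "P11", "supplier": "S2", "quarter": "Q1", "expense": 2},
    {"product": "P19", "supplier": "S1", "quarter": "Q1", "expense": 4},
    {"product": "P33", "supplier": "S1", "quarter": "Q2", "expense": 5},
    {"product": "P33", "supplier": "S2", "quarter": "Q3", "expense": 1},
]

def build_cube(rows, dims, measure):
    """Sum `measure` for every subset of `dims` used as the GROUP BY list."""
    cube = defaultdict(int)
    for r in range(len(dims) + 1):
        for group_by in combinations(dims, r):
            for row in rows:
                # Dimensions outside the GROUP BY list are rolled up to "ALL".
                key = tuple(row[d] if d in group_by else "ALL" for d in dims)
                cube[key] += row[measure]
    return dict(cube)

cube = build_cube(facts, DIMS, "expense")
print(cube[("P19", "ALL", "Q1")])   # GROUP BY product, quarter
print(cube[("P33", "S1", "Q2")])    # GROUP BY supplier, product, quarter
print(cube[("ALL", "S2", "ALL")])   # GROUP BY supplier
```

Each key corresponds to one cell of the cube, so a lookup replaces the join and aggregation that would otherwise run at query time.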
8 Too Many Summary Tables
The schema design problem: given a finite amount of disk storage, which views (summaries) will you pre-compute in the data warehouse?
Factors to be considered:
- What queries must the DW support?
- What source data are available?
- What is the time-space trade-off to store versus re-compute joins and measures?
- What is the cost to acquire and update the data?
This is an NP-complete optimization problem, so use heuristics and approximation algorithms:
- Benefit Per Unit Space (BPUS)
- Pick By Size (PBS)
- Pick By Size-Use (PBS-U)
All three use a derivation lattice to analyze the possible materialized views.
[Figure: derivation lattice of materialized views with nodes A, B, C, D, E, F, G, H and ALL/None.]
9 A Lattice of Summary Tables
Derivation lattice:
- Nodes: the set of attributes that would appear in the GROUP BY clause to construct this view
- Edges: connect view V2 to view V1 if V1 can be used to answer queries over V2
- Metadata: estimated number of records in each view
Use the lattice to determine the cost and benefit of each view, then select a subset of the possible views.
Typical simplifying assumptions:
- Query cost ≈ number of records scanned to answer the query
- I/O costs are much larger than the CPU cost to compute measures
- Ignore cost reductions due to using indexes to access records
- All queries are equally likely to occur
[Figure: derivation lattice for parts (P), suppliers (S) and customers (C), with nodes PSC, PC, PS, SC, P, S, C and ALL/None; size labels visible in the figure: 6M, 0.1M.]
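A minimal sketch of such a lattice for parts, suppliers and customers follows. Views are modelled as sets of GROUP BY attributes, and V1 can answer V2 when V2's attributes are a subset of V1's. The record counts are the ones quoted on the Pick By Size slide; the single record for ALL/None and the helper name query_cost are assumptions.

```python
from itertools import combinations

ATTRS = ("P", "S", "C")

# Nodes: every subset of the GROUP BY attributes, from PSC down to ALL/None.
nodes = [frozenset(c) for r in range(len(ATTRS), -1, -1)
         for c in combinations(ATTRS, r)]

# Direct edges (child, parent): the parent has exactly one more attribute,
# so it can be used to answer queries over the child.
edges = [(child, parent) for parent in nodes for child in nodes
         if child < parent and len(parent) - len(child) == 1]

# Metadata: estimated number of records per view.
SIZE = {
    frozenset("PSC"): 6_000_000, frozenset("PC"): 6_000_000,
    frozenset("PS"): 800_000,    frozenset("SC"): 6_000_000,
    frozenset("P"): 200_000,     frozenset("S"): 10_000,
    frozenset("C"): 100_000,     frozenset(): 1,  # ALL/None: assumed 1 record
}

def query_cost(view, materialized):
    """Linear cost model: records scanned in the smallest usable view."""
    return min(SIZE[m] for m in materialized if view <= m)

materialized = [frozenset("PSC"), frozenset("PS")]
print(len(nodes), len(edges))
print(query_cost(frozenset("P"), materialized))   # answered from PS
print(query_cost(frozenset("PC"), materialized))  # only PSC can answer it
```

Under the simplifying assumptions listed above, the cost of a query is just the size of the smallest materialized ancestor, which is what the selection heuristics compare.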
10 Benefit Per Unit Space (BPUS)
S is the set of views we will materialize.
  bf(u, v, S) = min(#w - #v | w ∈ S and w covers u)
  Benefit(v, S) = SUM(bf(u, v, S) | u = v or v covers u)
View sizes (millions of records):
  View  #MRecs    View  #MRecs
  A     100       E     30
  B     50        F     40
  C     75        G     1
  D     20        H     10
Selection sequence: S = {A}; S = S U {B}; S = S U {F}; S = S U {D}
Benefit per round (a view is left blank once it has been selected):
  View  Round 1        Round 2        Round 3
  B     50 * 5 = 250
  C     25 * 5 = 125   25 * 2 = 50    25 * 1 = 25
  D     80 * 2 = 160   30 * 2 = 60    30 * 2 = 60
  E     70 * 3 = 210   20 * 3 = 60    20 * 2 + 10 = 50
  F     60 * 2 = 120   60 + 10 = 70
  G     99 * 1 = 99    49 * 1 = 49    49 * 1 = 49
  H     90 * 1 = 90    40 * 1 = 40    30 * 1 = 30
Savings: read 420M records, not 800M.
[Figure: derivation lattice of materialized views with nodes A, B, C, D, E, F, G, H and ALL/None.]
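The greedy rounds can be replayed with a short sketch. The view sizes come from the slide; the lattice edges in CHILDREN are an assumption (the figure did not survive extraction), chosen because they reproduce the round 1 multipliers. Like the slide's table, the sketch ranks views by raw benefit without dividing by view size, and it treats a negative difference as zero benefit.

```python
# View sizes in millions of records, as listed on the slide.
SIZE = {"A": 100, "B": 50, "C": 75, "D": 20, "E": 30, "F": 40, "G": 1, "H": 10}

# Assumed lattice edges: parent -> views directly derivable from it.
CHILDREN = {"A": ["B", "C"], "B": ["D", "E"], "C": ["E", "F"], "D": ["G"],
            "E": ["G", "H"], "F": ["H"], "G": [], "H": []}

def descendants(v):
    """v itself plus every view that v covers."""
    seen, stack = {v}, [v]
    while stack:
        for child in CHILDREN[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def cost(u, selected):
    """Records read to answer u from the smallest selected view covering it."""
    return min(SIZE[w] for w in selected if u in descendants(w))

def benefit(v, selected):
    """Total records saved over all views v covers if v is also materialized."""
    return sum(max(0, cost(u, selected) - SIZE[v]) for u in descendants(v))

def greedy(rounds):
    selected = ["A"]  # the top view is always materialized
    for _ in range(rounds):
        candidates = [v for v in SIZE if v not in selected]
        selected.append(max(candidates, key=lambda v: benefit(v, selected)))
    return selected

selected = greedy(3)
total = sum(cost(u, selected) for u in SIZE)
baseline = sum(cost(u, ["A"]) for u in SIZE)
print(selected, total, baseline)
```

With these assumed edges the sketch selects B, F and D in that order and arrives at 420M records read versus 800M, matching the savings quoted on the slide.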
11 Pick By Size (PBS)
Table sizes (in millions of records):
  Parts+Suppls+Custs  6M
  Parts+Custs         6M
  Parts+Suppls        0.8M
  Suppls+Custs        6M
  Parts               0.2M
  Suppls              0.01M
  Custs               0.1M
Algorithm (|v| is the number of records in view v):
  S = {PSC}
  While (space > 0) Do
    v = smallest view in views
    If (space - |v| > 0) Then
      space = space - |v|
      S = S U {v}
      views = views - {v}
    Else
      space = 0
[Figure: derivation lattice with nodes PSC, PC, PS, SC, P, S, C and ALL/None.]
Storage savings: reduced from 19.2M records to 7.2M records.
Query savings: read 19.11M records, not 42M.
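A minimal sketch of Pick By Size on the slide's table sizes follows. Sizes are in thousands of records to keep the arithmetic exact. The space budget of 1.2M records beyond the top view is an assumption, since the slide does not state one, and the function names are illustrative.

```python
# Table sizes from the slide, in thousands of records.
SIZE_K = {"PSC": 6000, "PC": 6000, "PS": 800, "SC": 6000,
          "P": 200, "S": 10, "C": 100}

def pick_by_size(sizes, top, space):
    """Add views in increasing size order while the budget stays positive."""
    selected = [top]
    for v in sorted((v for v in sizes if v != top), key=lambda v: sizes[v]):
        if space - sizes[v] <= 0:
            break  # corresponds to the "Else space = 0" branch
        space -= sizes[v]
        selected.append(v)
    return selected

def query_cost(view, selected, sizes):
    """Records read from the smallest selected view that contains `view`'s attributes."""
    return min(sizes[m] for m in selected if set(view) <= set(m))

selected = pick_by_size(SIZE_K, top="PSC", space=1200)  # assumed budget
stored = sum(SIZE_K[v] for v in selected)
with_views = sum(query_cost(v, selected, SIZE_K) for v in SIZE_K)
top_only = sum(query_cost(v, ["PSC"], SIZE_K) for v in SIZE_K)
print(selected, stored, with_views, top_only)
```

Under that budget the sketch keeps PSC, S, C, P and PS, and one query per view reads 19.11M records in total instead of 42M when only PSC is stored.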
12 Pick By Size-Use (PBS-U)
Extends the Pick By Size algorithm to consider the frequency of queries on each possible view.
Table sizes (#MRecs) and query frequencies (probabilities):
  Parts+Suppls+Custs  6M     0.05
  Parts+Custs         6M     0.3
  Parts+Suppls        0.8M   0.3
  Suppls+Custs        6M     0.05
  Parts               0.2M   0.1
  Suppls              0.01M  0.1
  Custs               0.1M   0.1
Algorithm (|v| is the number of records in view v):
  While (space > 0) Do
    v = the view in views with the smallest |v| / prob(v)
    If (space - |v| > 0) Then
      space = space - |v|
      S = S U {v}
      views = views - {v}
    Else
      space = 0
[Figure: derivation lattice with nodes PSC, PC, PS, SC, P, S, C and ALL/None.]
With these query frequencies the selected views do not change, so the savings are the same.
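The only change from Pick By Size is the sort key, which can be sketched as follows. Sizes (in thousands of records) and probabilities are the slide's; the 1.2M-record budget and the function name are assumptions for illustration.

```python
# (size in thousands of records, query probability) per view, from the slide.
VIEWS = {"PSC": (6000, 0.05), "PC": (6000, 0.3), "PS": (800, 0.3),
         "SC": (6000, 0.05), "P": (200, 0.1), "S": (10, 0.1), "C": (100, 0.1)}

def pick_by_size_use(views, top, space):
    """Like Pick By Size, but order candidates by size / query probability."""
    order = sorted((v for v in views if v != top),
                   key=lambda v: views[v][0] / views[v][1])
    selected = [top]
    for v in order:
        size = views[v][0]
        if space - size <= 0:
            break
        space -= size
        selected.append(v)
    return selected, order

selected, order = pick_by_size_use(VIEWS, top="PSC", space=1200)  # assumed budget
print(order)
print(selected)
```

Frequently queried views move forward in the order (PC now precedes SC), but the small views still come first, so under this budget the selection is unchanged.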
13 Comparing Schema Design Algorithms
All proposed algorithms:
- Produce only a near-optimal solution; the best known is within (0.63 - f) of optimal, where f is the fraction of space consumed by the largest table
- Make (unrealistic) assumptions, e.g., ignore indexed data access
- Rely heavily on having good metadata, e.g., table size and query frequency
Algorithmic performance:
- Pick By Size (PBS): O(n log n) runtime
- Benefit Per Unit Space (BPUS): O(n^3) runtime
Limited applicability for PBS?
- Finds the near-optimal solution only for SR-Hypercube lattices
- A lattice forms an SR-Hypercube when, for each v in the lattice except v = DBT: |v| ≥ (number of direct children of v) * (number of records in the child of v)
14 Data Extraction and Load
Step 1: Extract and clean data from all sources
- Select source, remove data inconsistencies, add default values
Step 2: Materialize the views and measures
- Reformat data, recalculate data, merge data from multiple sources, add time elements to the data, compute measures
Step 3: Store data in the DW
- Create metadata and access path data, such as indexes
Major issue: failure during extraction and load
Approaches:
- UNDO/REDO logging: too expensive in time and space
- Incremental checkpointing: when to checkpoint? Modularize and divide the long-running tasks. Must use UNDO/REDO logs also; needs high-performance logging
15 Materializing Summary Tables
Scenario: CompsiQ has factories in 7 cities. Each factory manufactures several of CompsiQ's 30 hardware products. Each factory has 3 types of manufacturing lines: robotic, hand-assembly, and mixed-line.
Schema for source data from Factory-A:
  YieldInfo(ProductCode, RoboticYield, Hand-AssemYield, MixedLineYield, Week, Year)
  ProductInfo(ProductCode, ProductName, ProductType, FCS-Date, EstProductLife)
Target summary query: What is last year's yield from Factory-A by product type?
16 Materialization using SchemaSQL
What is last year's yield from Factory-A by product type?
  select p.producttype, sum(y.lt)
  from Factory-A::YieldInfo lt,
       Factory-A::YieldInfo y,
       Factory-A::ProductInfo p
  where lt <> ProductCode and lt <> Week and lt <> Year
    and y.productcode = p.productcode
    and y.year = 01
  group by p.producttype
At execution time, lt ranges over the attribute names in relation YieldInfo.
Schema:
  YieldInfo(ProductCode, RoboticYield, Hand-AssemYield, MixedLineYield, Week, Year)
  ProductInfo(ProductCode, ProductName, ProductType, FCS-Date, EstProductLife)
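The effect of letting a variable range over attribute names can be imitated in plain Python. This is not SchemaSQL, only a sketch of the same computation: every YieldInfo attribute other than ProductCode, Week and Year is treated as a yield column and summed per product type. All rows and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical YieldInfo rows; attribute names follow the slide's schema.
yield_info = [
    {"ProductCode": "P11", "RoboticYield": 10, "Hand-AssemYield": 5,
     "MixedLineYield": 2, "Week": 1, "Year": 1},
    {"ProductCode": "P12", "RoboticYield": 4, "Hand-AssemYield": 0,
     "MixedLineYield": 1, "Week": 1, "Year": 1},
    {"ProductCode": "P13", "RoboticYield": 7, "Hand-AssemYield": 3,
     "MixedLineYield": 0, "Week": 2, "Year": 1},
    {"ProductCode": "P11", "RoboticYield": 9, "Hand-AssemYield": 9,
     "MixedLineYield": 9, "Week": 1, "Year": 0},
]
# ProductInfo reduced to the two attributes the query needs.
product_type = {"P11": "Net", "P12": "Video", "P13": "Net"}

NON_YIELD = {"ProductCode", "Week", "Year"}

def yield_by_type(rows, types, year):
    totals = defaultdict(int)
    for row in rows:
        if row["Year"] != year:
            continue
        # Plays the role of lt: iterate over the attribute names themselves.
        for attr, value in row.items():
            if attr not in NON_YIELD:
                totals[types[row["ProductCode"]]] += value
    return dict(totals)

result = yield_by_type(yield_info, product_type, year=1)
print(result)
```

Because the yield columns are discovered from the data rather than listed, a fourth manufacturing-line column would be included without changing the query logic.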
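17 Aggregation Over Irregular Blocks
[Figure: a YieldInfo table whose rows are keyed by product code (the cell values were lost in extraction), shown next to the ProductInfo table below.]
ProductInfo:
  ProductCode  ProductName  ProductType
  P11          ATMCard      Net
  P12          SMILCard     Video
  P13          ATMHub       Net
  P14          MPEGCard     Video
  P15          MP3          Audio
Schema:
  YieldInfo(ProductCode, RoboticYield, Hand-AssemYield, MixedLineYield, Week, Year)
  ProductInfo(ProductCode, ProductName, ProductType, FCS-Date, EstProductLife)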
18 User Queries
Retrieve pre-computed data or formulate new measures not materialized in the DW.
New user operations on logical datacubes:
- Roll-up, drill-down, pivot/rotate
- Slicing and dicing with a data blade
- Sorting
- Selection
- Derived attributes
[Figure: datacube with axes Supplier (All-S, S1, S2), Fiscal Quarter (Q1, Q2, Q3, Q4, All-Q) and Product (P11, P14, P19, P27, P33).]
19 Query Processing
- Traditional query transformations
- Index intersection and union
- Advanced join algorithms
- Piggy-backed scans: multiple queries with different selection criteria
- SQL extensions => new operators
  Red Brick Systems has proposed 8 extensions, including: MovingSum and MovingAvg; Rank When; RatioToReport; Tertiles; Create Macro
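To give a feel for what such operators compute, here is a small sketch of moving sum, moving average, rank and ratio-to-report over an ordered list of values. These are generic definitions written for illustration; they are not Red Brick's syntax, and their exact semantics (window handling, tie-breaking) may differ from the product's.

```python
def moving_sum(values, window):
    """Sum of the current value and up to window-1 preceding values."""
    return [sum(values[max(0, i - window + 1): i + 1])
            for i in range(len(values))]

def moving_avg(values, window):
    """Average over the same trailing window as moving_sum."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def rank(values):
    """1 for the largest value; equal values share the same rank."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

def ratio_to_report(values):
    """Each value as a fraction of the column total."""
    total = sum(values)
    return [v / total for v in values]

sales = [10, 20, 30, 40]  # hypothetical quarterly sales
print(moving_sum(sales, 2))
print(moving_avg(sales, 2))
print(rank(sales))
print(ratio_to_report(sales))
```

Expressing these in standard SQL of the time required self-joins or correlated subqueries, which is the motivation for adding them as operators.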
20 Data Update
Data sources change over time, so the DW must be refreshed. A refresh:
- Adds new historical data to the fact tables
- Updates descriptive attributes in the dimension tables
- Forces recalculation of measures in summary tables
Issues:
1. Monitoring/tracking changes at the data sources
2. Recalculation of aggregated measures
3. Refresh typically forces a shutdown of DW query processing
21 Monitoring Data Sources
Approaches:
1. Value-deltas: capture before and after values of all tuples changed by normal DB operations and store them in differential relations.
   Issue: must take the DW offline to install the modified values.
2. Operation-deltas: capture SQL updates from the transaction log of each data source and build a new log of all transactions that affect data in the DW.
   Advantage: the DW can remain online for query processing while executing data updates (using traditional concurrency control).
3. Hybrid: use value-deltas and operation-deltas for different data sources or for a subset of the relations from a data source.
22 Creating a Differential Relation
Approaches at the data source:
1. Execute the update query 3 times
   (1) Select and record the before values; (2) execute the update; (3) select and record the after values.
   Issues: high cost in time and space; reduces autonomy of the data sources.
2. Define and insert DB triggers
   Triggers fire on insert, delete, and update operations and log the before and after values.
   Issues: not all data sources support triggers; reduces autonomy of the data sources.
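The trigger approach can be sketched with SQLite through Python's sqlite3 module. The source table product, the differential relation product_delta and the trigger names are all hypothetical; a real data source would use its own DBMS's trigger dialect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (ProductID TEXT PRIMARY KEY, ProdName TEXT);

-- Differential relation: one row per change, with before and after values.
CREATE TABLE product_delta (op TEXT, ProductID TEXT,
                            before_name TEXT, after_name TEXT);

CREATE TRIGGER product_ins AFTER INSERT ON product BEGIN
    INSERT INTO product_delta VALUES ('I', NEW.ProductID, NULL, NEW.ProdName);
END;
CREATE TRIGGER product_upd AFTER UPDATE ON product BEGIN
    INSERT INTO product_delta
    VALUES ('U', NEW.ProductID, OLD.ProdName, NEW.ProdName);
END;
CREATE TRIGGER product_del AFTER DELETE ON product BEGIN
    INSERT INTO product_delta VALUES ('D', OLD.ProductID, OLD.ProdName, NULL);
END;
""")

# Normal operations at the data source; the triggers log them as a side effect.
conn.execute("INSERT INTO product VALUES ('P11', 'ATMCard')")
conn.execute("UPDATE product SET ProdName = 'ATMCard v2' WHERE ProductID = 'P11'")
conn.execute("DELETE FROM product WHERE ProductID = 'P11'")

delta = conn.execute("SELECT * FROM product_delta ORDER BY rowid").fetchall()
for row in delta:
    print(row)
```

The triggers have to be installed in the source database itself, which illustrates why this approach reduces the autonomy of the data sources.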
23 Creating Operation-Deltas
The process:
- Scan the transaction log at each data source
- Select pertinent transactions and delta-log them
Advantage: the op-delta is much smaller than the value-delta.
Issue: the update operation on the data source schema must be transformed into an update operation on the DW schema. This is not always possible, so operation-deltas cannot be used in all cases.
24 Recalculating Aggregated Measures
Delta tables:
- Assume we have differential relations for the base facts in the data sources (i.e., value deltas)
- Two processing phases: propagation and refresh
1) Propagation: pre-compute all new tuples and all replacement tuples and store them in a delta table.
[Figure: differential relations and the global DW schema feed the propagation process, which produces the delta tables.]
25 Recalculating Aggregated Measures
2) Refresh: scan the DW tuples, replace existing tuples with the pre-computed tuple values, and insert new tuples from the delta tables.
[Figure: delta tables and DW tables feed the refresh process, which produces the updated DW tables.]
Issue: the delta table cannot be pre-computed for non-commutative measures, e.g., average (without the number of records) or percentiles. These must be computed during the refresh phase.
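A small sketch of the two phases for a summary keyed by (product, quarter) follows. It stores a running sum together with the record count, which is one way to make an average refreshable from a delta table. The data, the function names and the restriction to inserted facts are assumptions for illustration.

```python
def propagate(inserted_facts):
    """Propagation: aggregate new facts into a delta table of (sum, count)."""
    delta = {}
    for key, amount in inserted_facts:
        s, n = delta.get(key, (0, 0))
        delta[key] = (s + amount, n + 1)
    return delta

def refresh(summary, delta):
    """Refresh: merge the delta into existing tuples, insert new ones."""
    for key, (ds, dn) in delta.items():
        s, n = summary.get(key, (0, 0))
        summary[key] = (s + ds, n + dn)
    return summary

def average(summary, key):
    s, n = summary[key]
    return s / n

# Existing summary table: three purchases of P11 in Q1 totalling 300.
summary = {("P11", "Q1"): (300, 3)}
# Hypothetical differential relation containing newly inserted facts.
changes = [(("P11", "Q1"), 100), (("P19", "Q1"), 50)]

delta = propagate(changes)   # can run while the DW stays online
refresh(summary, delta)      # short phase that touches the DW tables
print(summary)
print(average(summary, ("P11", "Q1")))
```

Had the summary stored only the average (100.0 for P11 in Q1), the new value could not be derived from the delta alone; keeping the count makes the measure mergeable. Percentiles have no such compact companion value.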
26 Data Marting
What: stores a second copy of a subset of a DW.
[Figure: the Data Warehouse System (datacubes) feeds Data Extraction and Load steps into Data Mart Systems (datacubes); DSS application workstations send queries to the data marts.]
Why build a data mart?
- A user group with special needs (dept.)
- Better performance accessing fewer records
- To support a different user access tool
- To enforce access control over different subsets
- To segment data over different hardware platforms
27 Costs and Benefits of Data Marting
System costs:
- More hardware (servers and networks)
- Define a subset of the global data model
- More software to:
  - Extract data from the warehouse
  - Load data into the mart
  - Update the mart (after the warehouse is updated)
User benefits:
- Define new measures not stored in the DW
- Better performance (mart users and DW users)
- Support a more appropriate user interface, e.g., a browser with forms versus SQL queries
- Company achieves more reliable access control
28 Commercial DW Products
Short list of companies with DW products:
- Informix/Red Brick Systems
- Oracle
- Prism Solutions
- Software AG
Typical products and tools:
- Specially tuned DB server
- DW developer tools: data extraction, incremental update, index builder
- User tools: ad hoc query and spreadsheet tools for DSS and post-processing (creating graphs, pie charts, etc.)
- Application developer tools (toolkits for OLAP and DSS): spreadsheet components, statistics packages, trend analysis and forecasting components
29 Ongoing Research Problems
- How to incorporate domain and business rules into DW creation and maintenance
- Replacing manual tasks with intelligent agents: data acquisition, data cleaning, schema design, DW access path analysis and index construction
- Separate (but related) research areas:
  - Tools for data mining and OLAP
  - Providing active database services in the DW
More informationIT DATA WAREHOUSING AND DATA MINING UNIT-2 BUSINESS ANALYSIS
PART A 1. What are production reporting tools? Give examples. (May/June 2013) Production reporting tools will let companies generate regular operational reports or support high-volume batch jobs. Such
More informationData Warehouse Logical Design. Letizia Tanca Politecnico di Milano (with the kind support of Rosalba Rossato)
Data Warehouse Logical Design Letizia Tanca Politecnico di Milano (with the kind support of Rosalba Rossato) Data Mart logical models MOLAP (Multidimensional On-Line Analytical Processing) stores data
More informationThe University of Iowa Intelligent Systems Laboratory The University of Iowa Intelligent Systems Laboratory
Warehousing Outline Andrew Kusiak 2139 Seamans Center Iowa City, IA 52242-1527 andrew-kusiak@uiowa.edu http://www.icaen.uiowa.edu/~ankusiak Tel. 319-335 5934 Introduction warehousing concepts Relationship
More informationAnalytical data bases Database lectures for math
Analytical data bases Database lectures for mathematics students May 14, 2017 Decision support systems From the perspective of the time span all decisions in the organization could be divided into three
More informationBig Data 13. Data Warehousing
Ghislain Fourny Big Data 13. Data Warehousing fotoreactor / 123RF Stock Photo 2 The road to analytics Aurelio Scetta / 123RF Stock Photo 3 Another history of data management (T. Hofmann) 1970s 2000s Age
More informationData Mining. Data warehousing. Hamid Beigy. Sharif University of Technology. Fall 1394
Data Mining Data warehousing Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1394 1 / 22 Table of contents 1 Introduction 2 Data warehousing
More informationHANA Performance. Efficient Speed and Scale-out for Real-time BI
HANA Performance Efficient Speed and Scale-out for Real-time BI 1 HANA Performance: Efficient Speed and Scale-out for Real-time BI Introduction SAP HANA enables organizations to optimize their business
More informationData Mining. Data warehousing. Hamid Beigy. Sharif University of Technology. Fall 1394
Data Mining Data warehousing Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1394 1 / 31 Table of contents 1 Introduction 2 Data warehousing
More informationQ1) Describe business intelligence system development phases? (6 marks)
BUISINESS ANALYTICS AND INTELLIGENCE SOLVED QUESTIONS Q1) Describe business intelligence system development phases? (6 marks) The 4 phases of BI system development are as follow: Analysis phase Design
More informationAnnouncements. Course Outline. CS/INFO 330 Data Warehousing and OLAP. Mirek Riedewald
CS/INFO 330 Data Warehousing and OLAP Mirek Riedewald mirek@cs.cornell.edu Announcements Don t forget to let me know about the demo sessions next Monday Who does not have a laptop for the demo? CS/INFO
More informationCS614 - Data Warehousing - Midterm Papers Solved MCQ(S) (1 TO 22 Lectures)
CS614- Data Warehousing Solved MCQ(S) From Midterm Papers (1 TO 22 Lectures) BY Arslan Arshad Nov 21,2016 BS110401050 BS110401050@vu.edu.pk Arslan.arshad01@gmail.com AKMP01 CS614 - Data Warehousing - Midterm
More informationBig Data 13. Data Warehousing
Ghislain Fourny Big Data 13. Data Warehousing fotoreactor / 123RF Stock Photo The road to analytics Aurelio Scetta / 123RF Stock Photo Another history of data management (T. Hofmann) 1970s 2000s Age of
More informationIntroduction to Data Warehousing, Profiling and Cleansing. Aims. Plan COMP33111, 2011/ COMP33111 Lecture 2
COMP33111 Lecture 2 Introduction to Data Warehousing, Profiling and Cleansing Goran Nenadic School of Computer Science 2 Aims Understand the need for data warehousing Learn basic principles of data warehousing
More informationIn-Memory Data Management Jens Krueger
In-Memory Data Management Jens Krueger Enterprise Platform and Integration Concepts Hasso Plattner Intitute OLTP vs. OLAP 2 Online Transaction Processing (OLTP) Organized in rows Online Analytical Processing
More informationSQL Server 2005 Analysis Services
atabase and ata Mining Group of atabase and ata Mining Group of atabase and ata Mining Group of atabase and ata Mining Group of atabase and ata Mining Group of atabase and ata Mining Group of SQL Server
More informationDHANALAKSHMI COLLEGE OF ENGINEERING, CHENNAI
DHANALAKSHMI COLLEGE OF ENGINEERING, CHENNAI Department of Information Technology IT6702 Data Warehousing & Data Mining Anna University 2 & 16 Mark Questions & Answers Year / Semester: IV / VII Regulation:
More informationOracle Database 11g: Data Warehousing Fundamentals
Oracle Database 11g: Data Warehousing Fundamentals Duration: 3 Days What you will learn This Oracle Database 11g: Data Warehousing Fundamentals training will teach you about the basic concepts of a data
More informationRocky Mountain Technology Ventures
Rocky Mountain Technology Ventures Comparing and Contrasting Online Analytical Processing (OLAP) and Online Transactional Processing (OLTP) Architectures 3/19/2006 Introduction One of the most important
More informationData Warehousing & Mining. Data integration. OLTP versus OLAP. CPS 116 Introduction to Database Systems
Data Warehousing & Mining CPS 116 Introduction to Database Systems Data integration 2 Data resides in many distributed, heterogeneous OLTP (On-Line Transaction Processing) sources Sales, inventory, customer,
More informationCSPP 53017: Data Warehousing Winter 2013! Lecture 7! Svetlozar Nestorov! Class News!
CSPP 53017: Data Warehousing Winter 2013! Lecture 7! Svetlozar Nestorov! Class News! Make-up class on Saturday, Mar 9 in Gleacher 203 10:30am 1:30pm.! Last 15 minute in-class quiz (6:30pm) on Mar 5.! Covers
More information~ Ian Hunneybell: DWDM Revision Notes (31/05/2006) ~
Useful reference: Microsoft Developer Network Library (http://msdn.microsoft.com/library). Drill down to Servers and Enterprise Development SQL Server SQL Server 2000 SDK Documentation Creating and Using
More informationCreate Cube From Star Schema Grouping Framework Manager
Create Cube From Star Schema Grouping Framework Manager Create star schema groupings to provide authors with logical groupings of query Connect to an OLAP data source (cube) in a Framework Manager project
More informationData Warehouse and Data Mining
Data Warehouse and Data Mining Lecture No. 02 Introduction to Data Warehouse Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology
More informationTIM 50 - Business Information Systems
TIM 50 - Business Information Systems Lecture 15 UC Santa Cruz May 20, 2014 Announcements DB 2 Due Tuesday Next Week The Database Approach to Data Management Database: Collection of related files containing
More informationProceedings of the IE 2014 International Conference AGILE DATA MODELS
AGILE DATA MODELS Mihaela MUNTEAN Academy of Economic Studies, Bucharest mun61mih@yahoo.co.uk, Mihaela.Muntean@ie.ase.ro Abstract. In last years, one of the most popular subjects related to the field of
More informationInformation Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 07 : 06/11/2012 Data Mining: Concepts and Techniques (3 rd ed.) Chapter
More informationBUSINESS INTELLIGENCE. SSAS - SQL Server Analysis Services. Business Informatics Degree
BUSINESS INTELLIGENCE SSAS - SQL Server Analysis Services Business Informatics Degree 2 BI Architecture SSAS: SQL Server Analysis Services 3 It is both an OLAP Server and a Data Mining Server Distinct
More informationData warehouse architecture consists of the following interconnected layers:
Architecture, in the Data warehousing world, is the concept and design of the data base and technologies that are used to load the data. A good architecture will enable scalability, high performance and
More informationIntroduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe
Introduction to Query Processing and Query Optimization Techniques Outline Translating SQL Queries into Relational Algebra Algorithms for External Sorting Algorithms for SELECT and JOIN Operations Algorithms
More informationCT75 DATA WAREHOUSING AND DATA MINING DEC 2015
Q.1 a. Briefly explain data granularity with the help of example Data Granularity: The single most important aspect and issue of the design of the data warehouse is the issue of granularity. It refers
More informationSAP NetWeaver BW Performance on IBM i: Comparing SAP BW Aggregates, IBM i DB2 MQTs and SAP BW Accelerator
SAP NetWeaver BW Performance on IBM i: Comparing SAP BW Aggregates, IBM i DB2 MQTs and SAP BW Accelerator By Susan Bestgen IBM i OS Development, SAP on i Introduction The purpose of this paper is to demonstrate
More information