Seminars of Software and Services for the Information Society. Data Warehousing Design Issues

DIPARTIMENTO DI INGEGNERIA INFORMATICA AUTOMATICA E GESTIONALE ANTONIO RUBERTI
Master of Science in Engineering in Computer Science (MSE-CS)
Seminars in Software and Services for the Information Society
Data Warehousing Design Issues
1

Data Warehouse Design
- Setting targets and planning: feasibility (border, size, sources, ...), team, operating plan
- Design of the infrastructure: choice of architecture, choice of technologies
- Design of Data Marts: analysis with domain experts
2

Lifecycle (Kimball, 1998)
- planning
- definition of requirements
- project management
- technology: design of architecture; selection and installation of products
- data: dimensional modelling; physical design; feeding (ETL) design and implementation
- application: specification of applications; applications development
- release
- maintenance
3

Data Flow & Project evolution
[Figure. Labels: DW, Data flow, Design.]
4

Design of a Data Mart: phases
1. Analysis and reconciliation of data sources. Input: schema of sources. Output: reconciled schema.
2. Requirements analysis. Input: reconciled schema. Output: facts, work load.
3. Conceptual design. Input: reconciled schema, facts, work load. Output: fact schema.
4. Logical design. Input: fact schema, work load. Output: logical schema of Data Marts.
5. Feeding (ETL) design. Input: schema of sources, reconciled schema, logical schema of DM. Output: ETL procedures.
6. Physical design. Input: logical schema of DM, work load, DBMS. Output: physical schema of DM.
Slide annotations: Fact schema; Star-schema, Snowflakes; entity-relationship.
5
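The six phases can be read as a pipeline in which every phase consumes artifacts produced earlier. The sketch below is only an illustration of that reading: the phase names and artifacts are transcribed from the slide, while the `EXTERNAL` set (artifacts assumed to exist before the project starts) and the `missing_inputs` helper are assumptions introduced here.

```python
# Each phase: (name, inputs, outputs), transcribed from the phase list.
PHASES = [
    ("Analysis and reconciliation of data sources",
     {"schema of sources"}, {"reconciled schema"}),
    ("Requirements analysis",
     {"reconciled schema"}, {"facts", "work load"}),
    ("Conceptual design",
     {"reconciled schema", "facts", "work load"}, {"fact schema"}),
    ("Logical design",
     {"fact schema", "work load"}, {"logical schema of DM"}),
    ("Feeding (ETL) design",
     {"schema of sources", "reconciled schema", "logical schema of DM"},
     {"ETL procedures"}),
    ("Physical design",
     {"logical schema of DM", "work load", "DBMS"}, {"physical schema of DM"}),
]

# Assumption (not on the slide): these artifacts are available from the start.
EXTERNAL = {"schema of sources", "DBMS"}


def missing_inputs(phases, external):
    """Return {phase name: inputs not yet available when the phase starts}."""
    available = set(external)
    problems = {}
    for name, inputs, outputs in phases:
        gap = inputs - available
        if gap:
            problems[name] = gap
        available |= outputs  # outputs become inputs for later phases
    return problems


# With the assumed external artifacts, the order on the slide is consistent.
print(missing_inputs(PHASES, EXTERNAL))
```

Calling `missing_inputs(PHASES, set())` instead reports the phases that depend on something no earlier phase produces, which is one way to see what the project needs before it can start.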

Reconciling data sources
Schema integration:
- one step
- steps: balanced, iterative
6

Operational data source: ER schema
[ER diagram. Entities: DATE, INVOICE, INV-ROW, SHOP, ARTICLE, CITY. Relationships: issued-on, contains, at, refers-to, in. Attributes: date, number, amount, position, units, amount, p_iva (VAT number), shop, article, code.]
7

Operational data source: logical schema
[Diagram. Entities: DATE, INVOICE, INV-ROW, SHOP, ARTICLE, CITY. Relationships: issued-on, contains, at, refers-to, in. Attributes: date, number, amount, position, units, amount, p_iva (VAT number), shop, article, code.]
8

Operational data source: logical schema (rev.)
- Simplification (DROP ATTRIBUTES): delete uninteresting attributes.
- Denormalization (JOIN TABLES): e.g., nobody is interested in market basket analysis, i.e., only product sales are relevant.
[Diagram. Entities: DATE (date), SHOP (shop), CITY, SALE (id, quantity, sales), ARTICLE (article). Relationships: issued, at, in, refers-to.]
9
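A minimal sketch of the two revision steps, using Python's built-in `sqlite3` module. The table and column names (`invoice`, `inv_row`, `invoice_number`, and so on) and the sample rows are assumptions made for illustration; the slide only names the entities. Joining each invoice row with its invoice and then dropping the invoice number and position yields a SALE-like table in which it is no longer possible to tell which articles were bought together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoice (number INTEGER PRIMARY KEY, date TEXT, shop TEXT, amount REAL);
CREATE TABLE inv_row (invoice_number INTEGER, position INTEGER,
                      article TEXT, units INTEGER, amount REAL);
INSERT INTO invoice VALUES (1, '2024-01-05', 'S1', 30.0), (2, '2024-01-05', 'S1', 10.0);
INSERT INTO inv_row VALUES (1, 1, 'pen', 2, 10.0), (1, 2, 'ink', 1, 20.0),
                           (2, 1, 'pen', 2, 10.0);

-- JOIN TABLES: attach date and shop to every invoice row.
-- DROP ATTRIBUTES: invoice number, position and invoice total are not kept.
CREATE TABLE sale AS
SELECT i.date AS date, i.shop AS shop, r.article AS article,
       r.units AS quantity, r.amount AS sales
FROM inv_row AS r JOIN invoice AS i ON i.number = r.invoice_number;
""")

cursor = conn.execute("SELECT * FROM sale ORDER BY article, quantity")
columns = [c[0] for c in cursor.description]
sales = cursor.fetchall()
print(columns)
print(sales)
```

The resulting `sale` table still supports questions about product sales per date and shop, which is all the revised schema is meant to answer.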

Fact Schema (preliminary)
FACT: SALES
DIMENSIONS: date (TIME), shop (SPACE), article (PRODUCT)
MEASURES: units, amount
10
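One way to read the preliminary fact schema: each SALES event is identified by its coordinates along the three dimensions and carries the two measures. The sketch below encodes that reading; the `FactSchema` class, the `record` helper and the sample values are hypothetical, and summing events that share the same coordinates assumes both measures are additive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FactSchema:
    name: str
    dimensions: tuple
    measures: tuple


# Dimensions and measures as listed on the slide.
SALES = FactSchema("SALES", ("date", "shop", "article"), ("units", "amount"))


def record(cube, schema, coords, values):
    """Add one event; events with identical coordinates are summed."""
    key = tuple(coords[d] for d in schema.dimensions)
    cell = cube.setdefault(key, {m: 0 for m in schema.measures})
    for m in schema.measures:
        cell[m] += values[m]


cube = {}
where = {"date": "2024-01-05", "shop": "S1", "article": "pen"}
record(cube, SALES, where, {"units": 2, "amount": 10.0})
record(cube, SALES, where, {"units": 1, "amount": 5.0})
print(cube)
```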

Dimensional Hierarchies
- TIME dimension: date, week, month, quarter, year
- SPACE dimension: shop, district, manager, region, zone
- PRODUCT dimension: article, subtype, type, brand, brand-
Details based on user requirements.
11
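A hierarchy lets every value at a fine level be mapped to exactly one value at each coarser level. The sketch below illustrates this for the TIME dimension only, assuming the usual calendar path date, month, quarter, year; week is left out because weeks do not nest inside months, so it would form a separate branch. The function and variable names are hypothetical.

```python
from datetime import date

# Each step maps a value at one level to its parent at the next coarser level.
TIME_STEPS = [
    ("month", lambda d: (d.year, d.month)),
    ("quarter", lambda m: (m[0], (m[1] - 1) // 3 + 1)),
    ("year", lambda q: q[0]),
]


def roll_up(d, target):
    """Roll a date up along date -> month -> quarter -> year to `target`."""
    if target == "date":
        return d
    value = d
    for level, parent in TIME_STEPS:
        value = parent(value)
        if level == target:
            return value
    raise ValueError(f"unknown level: {target}")


day = date(2024, 5, 17)
print(roll_up(day, "month"), roll_up(day, "quarter"), roll_up(day, "year"))
```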

Fact Schema: DFM (Dimensional Fact Model)
[DFM diagram. Fact: SALES (units, amount). Time attributes: date, week, month, quarter, year. Space attributes: shop, district, manager, region, zone. Product attributes: article, subtype, type, brand, brand-.]
12

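Fact Schema (an interpretation for OLAP)
E.g.: monthly sales by [...] and brand
[DFM diagram. Fact: SALES (units, amount). Each of the three hierarchies has an ALL level. Time attributes: date, week, month, quarter, year. Space attributes: shop, district, manager, region, zone. Product attributes: article, subtype, type, brand, brand-.]
13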
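In the OLAP reading, a query picks one level per dimension, and a dimension that is not of interest is rolled up to ALL. The sketch below computes monthly sales by brand with the space dimension collapsed to ALL; this is a simplified stand-in for the example on the slide, and the event tuples, brand names and function name are all hypothetical.

```python
from collections import defaultdict

ALL = "ALL"

# Hypothetical events: (date, shop, article, brand, units, amount).
EVENTS = [
    ("2024-01-05", "S1", "pen", "Acme", 2, 10.0),
    ("2024-01-20", "S2", "ink", "Acme", 1, 20.0),
    ("2024-02-03", "S1", "pad", "Bolt", 3, 9.0),
]


def monthly_sales_by_brand(events):
    """Aggregate at (month, ALL shops, brand)."""
    totals = defaultdict(lambda: {"units": 0, "amount": 0.0})
    for day, _shop, _article, brand, units, amount in events:
        key = (day[:7], ALL, brand)  # "YYYY-MM" is the month of an ISO date
        totals[key]["units"] += units
        totals[key]["amount"] += amount
    return dict(totals)


result = monthly_sales_by_brand(EVENTS)
print(result)
```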

ER schema
[ER diagram. Entities: WEEK, YEAR, QUARTER, MONTH, DATE, ZONE, SALES-MANAGER, REGION, CITY, DISTRICT, SHOP, INVOICE-ROW, ARTICLE, SUBTYPE, TYPE, BRAND, CITTA_MARCA (brand city). Relationships: on, in, at, refers-to, belongs-to. Attributes: week, year, quarter, month, date, manager, zone, region, district, negozio (shop), quantity, amount, articolo (article), subtype, type, brand, citta_marca (brand city).]
14

Classify the information
[ER diagram. Entities: WEEK, YEAR, QUARTER, MONTH, DATE, ZONE, SALES-MANAGER, REGION, CITY, DISTRICT, SHOP, INVOICE-ROW, ARTICLE, SUBTYPE, TYPE, BRAND, BRAND_CITY. Relationships: on, in, at, refers-to, belongs-to. Attributes: week, year, quarter, month, date, manager, zone, region, district, negozio (shop), quantity, amount, articolo (article), subtype, type, brand, citta_marca (brand city).]
Legend:
- easy to build
- operational data
- somewhere in our organization
- hard to find, possibly to buy
15

ER schema (rev.)
[ER diagram. Entities: WEEK, YEAR, QUARTER, MONTH, DATE, ZONE, SALES-MANAGER, REGION, CITY, DISTRICT, SHOP, INVOICE-ROW, COMMISSION, ARTICLE, SUBTYPE, TYPE, BRAND. Relationships: on, in, at, refers-to, belongs-to. Attributes: week, year, quarter, month, data (date), manager, zone, region, district, negozio (shop), quantity, amount, percent, home, articolo (article), subtype, type, brand.]
16

Multiple arcs (n:n relation)
A single shop may have (had) more than one sales manager.
[DFM diagram. Fact: sales (units, amount). Attributes: date, week, month, quarter, year; shop, sales-manager, district, shop_, area, region; article, subtype, type, brand, brand_.]
17
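A many-to-many arc matters at query time: if each sale is attributed in full to every sales manager of its shop, totals per manager no longer add up to the overall total. The sketch below illustrates the effect with a bridge mapping from shop to managers. The weights are one common workaround and are an assumption introduced here, not something stated on the slide; shop identifiers, manager names and amounts are hypothetical.

```python
from collections import defaultdict

# Bridge: shop -> [(sales manager, weight)]. Weights per shop sum to 1 (assumption).
SHOP_MANAGERS = {
    "S1": [("Rossi", 0.5), ("Bianchi", 0.5)],
    "S2": [("Rossi", 1.0)],
}

# Hypothetical sales amounts per shop.
SHOP_SALES = [("S1", 100.0), ("S2", 40.0)]


def amount_by_manager(sales, bridge, weighted=True):
    """Total sales amount per manager, with or without bridge weights."""
    totals = defaultdict(float)
    for shop, amount in sales:
        for manager, weight in bridge[shop]:
            totals[manager] += amount * (weight if weighted else 1.0)
    return dict(totals)


weighted = amount_by_manager(SHOP_SALES, SHOP_MANAGERS)
unweighted = amount_by_manager(SHOP_SALES, SHOP_MANAGERS, weighted=False)
print(weighted, sum(weighted.values()))
print(unweighted, sum(unweighted.values()))
```

Without weights the per-manager figures sum to more than the 140.0 actually sold, because the sales of shop S1 are counted once for each of its managers.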

Cross-dimensional Attributes
A commission percentage may depend on both the brand and the shop.
[DFM diagram. Fact: sales (units, amount). Attributes: date, week, month, quarter, year; shop, sales-manager, district, shop_, area, region; article, subtype, type, brand, brand_; cross-dimensional attribute: commission_perc.]
18
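A cross-dimensional attribute cannot be stored on either hierarchy alone, since its value is determined only by a combination of attributes. A minimal sketch, assuming the commission percentage is looked up by the pair (brand, shop); the brands, shops, percentages and the `commission` helper are hypothetical.

```python
# commission_perc is keyed by the pair (brand, shop), not by brand or shop alone.
COMMISSION_PERC = {
    ("Acme", "S1"): 5.0,
    ("Acme", "S2"): 7.5,
    ("Bolt", "S1"): 4.0,
}


def commission(brand, shop, amount):
    """Commission due on `amount` for the given brand and shop."""
    return amount * COMMISSION_PERC[(brand, shop)] / 100.0


# The same brand yields different commissions in different shops.
print(commission("Acme", "S1", 200.0), commission("Acme", "S2", 200.0))
```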