FedDW Global Schema Architect

Similar documents
Oracle BI 11g R1: Build Repositories

Using SLE for creation of Data Warehouses

Data Warehousing and OLAP

Data Science. Data Analyst. Data Scientist. Data Architect

Oracle BI 11g R1: Build Repositories Course OR102; 5 Days, Instructor-led

After completing this course, participants will be able to:

1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda

Oracle BI 11g R1: Build Repositories

Table of Contents. Knowledge Management Data Warehouses and Data Mining. Introduction and Motivation

Knowledge Management Data Warehouses and Data Mining

Data Warehousing and OLAP Technologies for Decision-Making Process

Adnan YAZICI Computer Engineering Department

Oracle BI 12c: Build Repositories

An Overview of Data Warehousing and OLAP Technology

SQL Server Analysis Services

Hyperion Interactive Reporting Reports & Dashboards Essentials

DATA WAREHOUSE EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY

IMPLEMENTING STATISTICAL DOMAIN DATABASES IN POLAND. OPPORTUNITIES AND THREATS. Central Statistical Office in Poland

XML-OLAP: A Multidimensional Analysis Framework for XML Warehouses

STRATEGIC INFORMATION SYSTEMS IV STV401T / B BTIP05 / BTIX05 - BTECH DEPARTMENT OF INFORMATICS. By: Dr. Tendani J. Lavhengwa

Data-Transformation on historical data using the RDF Data Cube Vocabulary

Summary of Last Chapter. Course Content. Chapter 2 Objectives. Data Warehouse and OLAP Outline. Incentive for a Data Warehouse

CHAPTER 8 DECISION SUPPORT V2 ADVANCED DATABASE SYSTEMS. Assist. Prof. Dr. Volkan TUNALI

Data Warehousing & OLAP

Basics of Dimensional Modeling

Developing SQL Data Models

Getting Started enterprise 88. Oracle Warehouse Builder 11gR2: operational data warehouse. Extract, Transform, and Load data to

Data Warehouse and Data Mining

The GOLD Model CASE Tool: an environment for designing OLAP applications

Aggregating Knowledge in a Data Warehouse and Multidimensional Analysis

Call: SAS BI Course Content:35-40hours

The Eclipse Modeling Framework and MDA Status and Opportunities

SAS Data Integration Studio 3.3. User s Guide

Advantages of UML for Multidimensional Modeling

A Framework for Semi-Automated Implementation of Multidimensional Data Models

COGNOS (R) 8 GUIDELINES FOR MODELING METADATA FRAMEWORK MANAGER. Cognos(R) 8 Business Intelligence Readme Guidelines for Modeling Metadata

Evolution of Database Systems

Decision Support Systems aka Analytical Systems

Microsoft End to End Business Intelligence Boot Camp

Deccansoft Software Services Microsoft Silver Learning Partner. SSAS Syllabus

SQL Server 2005 Analysis Services

1. Analytical queries on the dimensionally modeled database can be significantly simpler to create than on the equivalent nondimensional database.

Visual Modelling of Data Warehousing Flows with UML Profiles

Practical Database Design Methodology and Use of UML Diagrams Design & Analysis of Database Systems

Data Integration and ETL with Oracle Warehouse Builder

6234A - Implementing and Maintaining Microsoft SQL Server 2008 Analysis Services

CS614 - Data Warehousing - Midterm Papers Solved MCQ(S) (1 TO 22 Lectures)

MS-55045: Microsoft End to End Business Intelligence Boot Camp

Course Contents: 1 Business Objects Online Training

Module 2: Creating Multidimensional Analysis Solutions

Improving the Performance of OLAP Queries Using Families of Statistics Trees

METADATA INTERCHANGE IN SERVICE BASED ARCHITECTURE

Oracle BI 11g R1: Build Repositories

Microsoft SQL Server Training Course Catalogue. Learning Solutions

Reading Sample. Creating New Documents and Queries Creating a Report in Web Intelligence Contents. Index. The Authors

IJSRD - International Journal for Scientific Research & Development Vol. 4, Issue 06, 2016 ISSN (online):

MSBI. Business Intelligence Contents. Data warehousing Fundamentals

MOLAP Data Warehouse of a Software Products Servicing Call Center

Index. Symbols = (equal) operator, 87

Chapter 3. Architecture and Design

A Data Warehouse Solution for e-government

Data Mining Concepts & Techniques

This Oracle BI 11g R1: Build Repositories training is

Building a Data Warehouse step by step

Physical Modeling of Data Warehouses using UML

Chapter 13 Business Intelligence and Data Warehouses The Need for Data Analysis Business Intelligence. Objectives

From Federated Databases to a Federated Data Warehouse System

A Methodology for Integrating XML Data into Data Warehouses

Querying Microsoft SQL Server

Data Warehousing and Decision Support (mostly using Relational Databases) CS634 Class 20

Hierarchies in a multidimensional model: From conceptual modeling to logical representation

Feature Matrix GENERAL DATA MODELING ENTERPRISE DATA ARCHITECT

Enterprise Architect Training Courses

Interactive PMCube Explorer

DATAWAREHOUSING AND ETL PROCESSES: An Explanatory Research

IDU0010 ERP,CRM ja DW süsteemid Loeng 5 DW concepts. Enn Õunapuu

ETL and OLAP Systems

Managing Changes to Schema of Data Sources in a Data Warehouse

The process of metadata modelling in industrial data warehouse environments

Oracle Database: SQL and PL/SQL Fundamentals NEW

OBIEE Course Details

Data Warehouses. Yanlei Diao. Slides Courtesy of R. Ramakrishnan and J. Gehrke

<Insert Picture Here> Oracle SQL Developer Data Modeler 3.0: Technical Overview

CA ERwin Data Modeler r7.3

Implementing and Maintaining Microsoft SQL Server 2008 Analysis Services

Guide Users along Information Pathways and Surf through the Data

DSS based on Data Warehouse

Data Warehousing & Mining Techniques

CHAKRA IT SOLUTIONS TO LEARN ABOUT OUR UNIQUE TRAINING PROCESS:

Deccansoft Software Services. SSIS Syllabus

DATA MINING AND WAREHOUSING

Automatically Generating OLAP Schemata from Conceptual Graphical Models

Welcome to the topic of SAP HANA modeling views.

Data Warehousing and Decision Support. Introduction. Three Complementary Trends. [R&G] Chapter 23, Part A

This tutorial will help computer science graduates to understand the basic-to-advanced concepts related to data warehousing.

Techno Expert Solutions An institute for specialized studies!

Teiid Designer User Guide 7.5.0

CSE 544 Principles of Database Management Systems. Alvin Cheung Fall 2015 Lecture 8 - Data Warehousing and Column Stores

Data Warehousing ETL. Esteban Zimányi Slides by Toon Calders

BUSINESS INTELLIGENCE AND OLAP

Transcription:

UML based Design Tool for the Integration of Data Mart Schemas Dr. Stefan Berger Department of Business Informatics Data & Knowledge Engineering Johannes Kepler University Linz ACM 15 th DOLAP 12 November 2, 2012

Outline 1 FedDW Approach 2 Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 2 / 31

1 FedDW Approach General overview of FedDW Integrating heterogeneous multidimensional schemata 2 Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 3 / 31

General overview of FedDW Problem definition; our contribution Problem: similar autonomous data marts/dws, but heterogeneous schemata and/or data Business collaboration Mergers and acquisitions Preexisting DW data across autonomous organizations Contribution: comprehensive tool suite for integration of autonomous data marts/dws Visual integration of multidimensional schemas OLAP front-end prototype, based on SQL-MDi [Berger and Schrefl, 2006] Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 4 / 31

General overview of FedDW Problem definition; our contribution Problem: similar autonomous data marts/dws, but heterogeneous schemata and/or data Business collaboration Mergers and acquisitions Preexisting DW data across autonomous organizations Contribution: comprehensive tool suite for integration of autonomous data marts/dws Visual integration of multidimensional schemas OLAP front-end prototype, based on SQL-MDi [Berger and Schrefl, 2006] Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 4 / 31

General overview of FedDW Motivating example Telecommunications sector sample, heterogeneous conceptual data mart schemas: year month date date/hr red blue date month quarter year date products product prod_name regular_fee connections duration tn_tel tn_misc customer customer p_name contract_type base_fee category products product prod_name connections dur_min turnover promotion promo promo_type dates customers customer cust_name age_grp contract_type base_fee Dimensionality (extra dimensionblue.promotion) Hierarchy ofdate dimensions Decorations of product dimensions Measures of connections facts Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 5 / 31

General overview of FedDW Motivating example Telecommunications sector sample, heterogeneous conceptual data mart schemas: year month date date/hr red blue date month quarter year date products product prod_name regular_fee connections duration tn_tel tn_misc customer customer p_name contract_type base_fee category products product prod_name connections dur_min turnover promotion promo promo_type dates customers customer cust_name age_grp contract_type base_fee Dimensionality (extra dimensionblue.promotion) Hierarchy ofdate dimensions Decorations of product dimensions Measures of connections facts Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 5 / 31

Integrating heterogeneous multidimensional schemata Conflict classification I Modeling Scope Ins stance Schema Dimension Instance ( Members ) Conflicts Dimension Schema Conflicts Schema- Instance Conflicts Cube Instance ( Cells ) Conflicts Cube Schema Conflicts Model Entity Dimension Cube Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 6 / 31

Integrating heterogeneous multidimensional schemata Conflict classification II Facts: conflicts Relevant operator of FedDW Schema-instance Dimensionality Different measures Merge measures: PIVOT MEASURES (Fact) Split measures: PIVOT SPLIT MEASURES (Fact) Choose attributes: add DIM reference (Cube) Choose measures: add MEASURE reference (Cube) Domain (measures) Convert domain: CONVERT MEASURES APPLY... (Measure) Naming of attributes Rename attributes: operator >... (Measure, Dimension) Base levels Roll-up dimension attributes: ROLLUP TO LEVEL... (Dimension) Cube cells (fact extensions) Join cubes: MERGE CUBES (n-ary) Derive measure values: AGGREGATE MEASURE (n-ary) Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 7 / 31

Integrating heterogeneous multidimensional schemata Conflict classification III Dimensions: conflicts Relevant operator of FedDW Hierarchies Domain (levels / decorations) Naming (levels) Naming (decorations) Members (dim. extensions) Roll-up functions Decoration values Map corresponding levels: add level reference [...] (Dimension) Convert domain: CONVERT ATTRIBUTES APPLY... (Dimension) Rename attributes: operator >... (Level) Map decorations: MATCH ATTRIBUTES (under Merge Dimensions n-ary) Merge sets of members: MERGE DIMENSIONS (n-ary) Overwrite hierarchies: RELATE Expression (under Merge Dimensions clause n-ary) Correct values: add RENAME function (under Merge Dimensions clause n-ary) Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 8 / 31

Integrating heterogeneous multidimensional schemata Integration workflow Establish a federation of autonomous data marts: 1 Import data mart schemas (CWM supported) (Optional: enrich roll-up hierarchies minimum match integration strategy) 2 Design global multidimensional schema (canonical model) 3 Define semantic mappings both-as-view paradigm [see McBrien and Poulovassilis, 2003] (a) Resolve schema instance conflicts (b) Intensional integration map conceptual schemata Fact tables Dimension tables + hierarchies (c) Extensional integration consolidate data Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 9 / 31

1 FedDW Approach 2 FedDW Query Tool Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 10 / 31

Overview of FedDW tool support I Java- and Eclipse-based interactive tool suite (EMF, GMF, UML2) Visual data mart integration: (GSA) OLAP front-end prototype: FedDW Query Tool [Berger and Schrefl, 2009] Auxiliary components: Metadata Dictionary, Dimension Repository Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 11 / 31

Overview of FedDW tool support II Global Schema Architect? User Query (SQL) OLAP Application Federated DW System Query Tool Global schema Meta-data dictionary SQL-MDi Parser Mappings (SQL-MDi) Import Schemas Dimension repository SQL-MDi Processor DM 1 DM 2 DM n Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 12 / 31

Overview of FedDW GSA Visual design environment for multidimensional schemas Schema Editor nested UML diagrams Import schemas Global schema Mapping Editor graphical, high-level code editor (Master Detail layout) Import mappings: unary operators (Fact, Dimension entities) intensional Global mappings: n-ary operators extensional Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 13 / 31

Sample GSA Workflow 1 Import local, autonomousconnections schemas 2 Design global connections schema 3 Create import mappings 4 Create one global mapping file 5 Export the mappings to metadata repository 6 Export fact and dimension metadata Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 14 / 31

GSA: Step 1, Import Wizard I Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 15 / 31

GSA: Step 1, Import Wizard II Wizard suggests appropriate UML stereotypes (based on PK/FK constraints): Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 16 / 31

GSA: Step 1, Import Wizard III Initialized class diagram ofred.connections: Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 17 / 31

GSA: Step 2, Global Schema Editor Global Schema wizard: Comfortably create global schema as copy of one import schema Edit the schema later on Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 18 / 31

GSA Schema Editors: edit dimension User-friendly editing of UML diagram possible (context menus, UML palette): Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 19 / 31

GSA: Step 3, Import Mappings I Recall the heterogeneities amongred andglobal: year month date date/hr red global date month year date category products product connections duration tn_tel tn_misc customer customer p_name category products connections dur_min turnover dates customers customer cust_name prod_name regular_fee contract_type base_fee product prod_name contract_type base_fee Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 20 / 31

GSA: Step 3, Import Mappings II Repair red.connections import schema: Dimensionality: add references to all three dimensions Date hierarchy: roll-up to LEVEL [date] Product decorations: delete regular_fee from Red s import schema Measures (schema instance conflict): PIVOT MEASURES tn_tel, tn_misc Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 21 / 31

Import Mappings: dimensionality Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 22 / 31

Import Mappings: hierarchy Prerequisite: delete level [date/hr] fromred.date (see Schema Editor, slide 21) Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 23 / 31

Import Mappings: pivot measures Merge measures tn_tel, tn_misc into turnover, extracting values tn_tel, tn_misc as members of the new red.category dimension: Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 24 / 31

GSA Step 4, Global Mapping I Merge Dimensions: Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 25 / 31

GSA Step 4, Global Mapping II Merge Cubes: Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 26 / 31

GSA Step 5 6, export project metadata Step 5: Export mapping file export wizard Starts generation of SQL-MDi code Static syntax check Interface to FedDW Query Tool: file system Step 6: populate Metadata Dictionary Facts + dimensions conceptual and physical metadata Later accessed by FedDW Query Tool Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 27 / 31

GSA Step 5 6, export project metadata Step 5: Export mapping file export wizard Starts generation of SQL-MDi code Static syntax check Interface to FedDW Query Tool: file system Step 6: populate Metadata Dictionary Facts + dimensions conceptual and physical metadata Later accessed by FedDW Query Tool Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 27 / 31

GSA: Summary Intelligent features: Import heuristics: analyzes PK/FK constraints in import schemas to suggest adequate UML stereotypes Create global schema as copy of one import schema User-friendly and intuitive UML notation Visual conversion modeling avoids cheap SQL-MDi syntax errors Automatically populates Dimension Repository from the exported metadata Supports the CWM standard [Poole, 2003] Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 28 / 31

FedDW Query Tool Query Tool in a Nutshell Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 29 / 31

Literatur Thanks for your attention! Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 30 / 31

Literatur References Berger, S. and Schrefl, M. (2006). Analysing multi-dimensional data across autonomous data warehouses. In Tjoa, A. M. and Tho, N., editors, DaWaK, pages 120 133. Berger, S. and Schrefl, M. (2009). FedDW: A tool for querying federations of data warehouses. In ICEIS (1). McBrien, P. and Poulovassilis, A. (2003). Data integration by bi-directional schema transformation rules. In Dayal, U., Ramamritham, K., and Vijayaraman, T. M., editors, ICDE, pages 227 238. IEEE Computer Society. Poole, J. M. (2003). Common Warehouse Metamodel Developer s Guide. John Wiley & Sons, Inc., New York, NY, USA. Stefan Berger (Univ. Linz) DOLAP Nov. 2, 2012 31 / 31