Data Warehousing. Adopted from Dr. Sanjay Gunasekaran


Main Topics
- Overview of the Data Warehouse
- Concept of Data Conversion
- Importance of Data Conversion and the steps involved
- Common Industry Methodology
- Outline and Analysis done in the Alternate Plan paper

Data Warehousing
- It is a concept, not a product.
- A method for analyzing massive amounts of data to make better business decisions.
- Helpful for analyzing data such as sales data and making decisions that affect the company's performance.
- A Data Warehouse generally contains summarized, de-normalized and replicated data that is infrequently updated and is optimized for decision support applications.

Comparison between Operational Environment and Data Warehouse

Operational Environment    Data Warehouse
-----------------------    ---------------------
Detailed                   Summarized
Current                    Variable over time
Transaction driven         Analysis driven
Minimum redundancy         Some redundancy
Static structure           Flexible structure
Small amount of data       Huge volumes of data
Constantly updated         Infrequently updated

Data Warehouse Concepts
Multidimensional Model
a) Facts - Table containing aggregate information required for analysis.
b) Dimensions - Classes of descriptors of the facts.
c) Hierarchies - Levels of aggregation of data.
Databases
a) Relational
   i) Oracle
b) Multi-Dimensional
   i) Oracle Express
   ii) Essbase
   iii) Gentium
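The three terms of the multidimensional model can be illustrated with a small star schema. The following is only a minimal sketch using Python's built-in sqlite3 module rather than Oracle. The table names (fact_sales, dim_time, dim_product), the columns and the sample rows are all assumptions made for illustration. The time dimension carries a day/month/year hierarchy, and the same fact table is rolled up at two of its levels.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical star schema: one fact table surrounded by two dimensions.
conn.executescript("""
CREATE TABLE dim_time (
    time_id INTEGER PRIMARY KEY,
    day     TEXT,     -- lowest level of the time hierarchy
    month   TEXT,     -- middle level
    year    INTEGER   -- highest level
);
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT,
    category   TEXT
);
CREATE TABLE fact_sales (
    time_id    INTEGER REFERENCES dim_time(time_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    amount     REAL   -- the measure being analyzed
);
""")

conn.executemany("INSERT INTO dim_time VALUES (?, ?, ?, ?)", [
    (1, "2023-01-15", "2023-01", 2023),
    (2, "2023-02-10", "2023-02", 2023),
    (3, "2024-01-05", "2024-01", 2024),
])
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)", [
    (1, "Widget", "Hardware"),
    (2, "Manual", "Books"),
])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", [
    (1, 1, 100.0), (2, 1, 50.0), (2, 2, 20.0), (3, 2, 30.0),
])

def sales_by(level):
    """Aggregate the fact table at one level of the time hierarchy."""
    if level not in ("day", "month", "year"):
        raise ValueError(f"unknown hierarchy level: {level}")
    query = (
        f"SELECT t.{level}, SUM(f.amount) "
        "FROM fact_sales f JOIN dim_time t ON f.time_id = t.time_id "
        f"GROUP BY t.{level} ORDER BY t.{level}"
    )
    return conn.execute(query).fetchall()

print(sales_by("month"))
print(sales_by("year"))
```

Moving from sales_by("month") to sales_by("year") is a roll-up along the hierarchy: the facts are unchanged, only the level of aggregation differs.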

Implementation Steps
- Analyze user requirements for the Data Warehouse.
- Analyze existing transaction processing data.
- Design the Data Warehouse (multi-dimensional model).
- Create the Data Warehouse (relational or multi-dimensional).
- Extract and clean the operational data.
- Migrate and load the data into the warehouse.
- Do decision support analysis on the warehouse data using OLAP tools.
- Create reports for reporting purposes.

Data Warehouse Architecture
[Figure: architecture diagram showing OLTP Systems (General Ledger, Accounts Payable, Purchase Order), External Data, Data from Legacy Systems, Extraction, Cleaning, Loading, Staging Area, Data Warehouse, Metadata and End User]
Terminology
a) OLTP systems
b) Metadata
c) Data Warehouse
d) Staging Area
e) Extraction, Loading & Migration
f) External Data

Data Warehouse Architecture (Contd.)
OLTP Systems
- Online Transaction Processing systems; production systems.
- Systems used to manage and run the business.
Metadata
- Consists of information about the data that feeds, gets transformed and exists in the Data Warehouse.
Data Warehouse
- Core of the architecture.
- Supports informational processing by providing a solid platform of integrated, historical data from which to do analysis.

Data Warehouse Architecture (Contd.)
Staging Area
- The Data Warehouse workbench.
- The place where raw data is brought in, cleaned, combined, archived and eventually exported to either the Data Warehouse or to one or more Data Marts.
Extraction, Cleaning & Loading
- Known as the Data Conversion process.
- The process by which data from the operational systems is moved to the warehouse.
- One of the most important steps in the implementation of a Data Warehouse.
External Data

Data Conversion
- Loading of data from the operational system into the Data Warehouse.
- Process wherein data is extracted, cleaned, combined, archived and eventually loaded into the Data Warehouse.
- Complex, time-consuming and unglamorous.
- Comprises the following processes:
  a) Extraction
  b) Cleaning
  c) Loading
- A very important part of the Data Warehousing process.

Importance of Data Conversion
- The Data Warehouse holds the information that is key to a corporation's decision-making process.
- Unreliable and dirty data can affect the performance of the corporation.
- Examples:
  a) Marketing communications
  b) Retail sales
  c) Medical records

Steps in Data Conversion
- Extract data from the operational systems to an intermediate schema (staging area).
  - The staging area is the Data Warehouse workbench where the data is cleaned, combined, archived and eventually exported to the Data Warehouse. It has the same schema structure as the operational system.
- Convert the intermediate schema to load data.
- Aggregate the load data.
- Migrate the load data from the staging area to the Data Warehouse server (if the staging area is not on the same server as the warehouse).
- Load the data into the Data Warehouse.
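The first step, extraction into a staging area that mirrors the operational schema, could look roughly like the sketch below. It uses two in-memory SQLite databases as stand-ins for the operational system and the staging area. The sales_order table, its columns and the extract_to_staging helper are hypothetical names introduced only for this example.

```python
import sqlite3

# Stand-in for the operational (source) system, with made-up sample data.
source = sqlite3.connect(":memory:")
source.executescript("""
CREATE TABLE sales_order (
    order_id INTEGER PRIMARY KEY,
    customer TEXT,
    amount   REAL
);
INSERT INTO sales_order VALUES (1, 'Acme', 120.0), (2, 'Globex', 80.5);
""")

# Stand-in for the staging area, initially empty.
staging = sqlite3.connect(":memory:")

def extract_to_staging(source, staging, table):
    """Copy one table, schema and rows, from the source into staging.

    The table name is assumed to come from trusted configuration,
    not from user input, because it is interpolated into SQL.
    """
    # Reuse the source DDL so staging has the same schema structure.
    ddl = source.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()[0]
    staging.execute(ddl)

    cursor = source.execute(f"SELECT * FROM {table}")
    placeholders = ", ".join("?" * len(cursor.description))
    rows = cursor.fetchall()
    with staging:  # commit the copied rows as one transaction
        staging.executemany(
            f"INSERT INTO {table} VALUES ({placeholders})", rows
        )
    return len(rows)

copied = extract_to_staging(source, staging, "sales_order")
print(copied, "rows extracted to staging")
```

Because no cleaning happens during extraction, the staged rows can be compared one-to-one against the source before any conversion begins.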

Data Conversion Process
- Quality Assurance of Data
- Plan Conversion
- Create Conversion Specifications
- Extract Source Data to Intermediate Schemas
- Condition Data
- Transform Data
- Clean Data
- Integrate Data
- Aggregate Load Data
- Move and Load Data

Data Conversion
Extraction
- Routines are created to read source data and move it to an intermediate staging area.
- The staging area has the same schema as the source. It is important because the data is cleaned there before it is loaded into the warehouse.
Convert Intermediate Schemas to Load Data
- This is the data cleaning process. It comprises:
  - Data examination
  - Data parsing
  - Data correction
  - Record matching
  - Data transformation
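The five cleaning activities listed above can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the method used in the paper: the customer records, the STATE_FIXES lookup and the matching rule (same name and phone means same customer) are all invented for the example.

```python
import re

# Hypothetical staged customer records with typical quality problems.
raw_customers = [
    {"id": 1, "name": "  john SMITH ", "phone": "(555) 123-4567", "state": "calif."},
    {"id": 2, "name": "John Smith", "phone": "555.123.4567", "state": "CA"},
    {"id": 3, "name": "Ann Lee", "phone": "", "state": "NY"},
]

# Assumed correction table mapping known bad values to valid ones.
STATE_FIXES = {"calif.": "CA"}

def examine(record):
    """Data examination: report which fields are empty."""
    return [field for field, value in record.items() if value == ""]

def parse_phone(text):
    """Data parsing: keep only the digits of a phone number."""
    return re.sub(r"\D", "", text)

def correct_state(text):
    """Data correction: replace known bad values, else normalize case."""
    cleaned = text.strip()
    return STATE_FIXES.get(cleaned.lower(), cleaned.upper())

def transform(record):
    """Data transformation: bring every field into the load format."""
    return {
        "id": record["id"],
        "name": " ".join(record["name"].split()).title(),
        "phone": parse_phone(record["phone"]),
        "state": correct_state(record["state"]),
    }

def match_records(records):
    """Record matching: treat equal (name, phone) as one customer
    and keep the first occurrence."""
    unique = {}
    for record in records:
        unique.setdefault((record["name"], record["phone"]), record)
    return list(unique.values())

issues = {r["id"]: examine(r) for r in raw_customers}
cleaned = match_records([transform(r) for r in raw_customers])
print(issues)
print(cleaned)
```

Matching runs after transformation on purpose: records 1 and 2 only look identical once names and phone numbers have been normalized.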

Data Conversion (Contd.)
Aggregate Load Data
- Load data is aggregated by executing a series of sorts externally.
Move the Load Data from the Staging Area onto the Data Warehouse Server
- Done if the Data Warehouse server is different.
Load the Data onto the Data Warehouse
- Done using SQL routines or bulk-load utilities.
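A sort-then-aggregate pass followed by a batched load might be sketched as follows. An in-memory sort stands in for the external sorts mentioned above, and sqlite3's executemany stands in for a bulk-load utility. The fact_monthly_sales table and the sample rows are assumptions for illustration.

```python
import sqlite3
from itertools import groupby
from operator import itemgetter

# Hypothetical load data: (month, product, amount), in arrival order.
load_rows = [
    ("2023-01", "Widget", 100.0),
    ("2023-02", "Widget", 50.0),
    ("2023-01", "Widget", 25.0),
    ("2023-01", "Manual", 20.0),
]

# Sort on the grouping key so that equal keys become adjacent,
# then sum each run of adjacent rows.
key = itemgetter(0, 1)
aggregated = [
    (month, product, sum(row[2] for row in group))
    for (month, product), group in groupby(sorted(load_rows, key=key), key=key)
]

warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE fact_monthly_sales (month TEXT, product TEXT, amount REAL)"
)

# Load all aggregated rows in one transaction and one batched call.
with warehouse:
    warehouse.executemany(
        "INSERT INTO fact_monthly_sales VALUES (?, ?, ?)", aggregated
    )

loaded = warehouse.execute("SELECT COUNT(*) FROM fact_monthly_sales").fetchone()[0]
print(aggregated)
print(loaded, "rows loaded")
```

The sort matters because groupby only merges adjacent rows with equal keys; without it the two January Widget rows would stay separate.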

Paper Outline
- Brief explanation of the Data Warehousing concept
- Data Warehouse architecture
- Data conversion
- Importance of data conversion
- Common industry methodology
- Analysis of the data conversion process using an example: Sales Order System

Overall Analysis
- The aim of the paper was to outline the Data Conversion process.
- Design a relational database, Staging Area and Data Warehouse.
- Move data from the relational database to the Staging Area.
- Move data from the Staging Area to the Warehouse.

In-depth Analysis
- Designed the relational database to reflect the transactional processing system of a common organization.
- Designed the Staging Area to reflect only the Sales system.
- Designed the Data Warehouse for the Sales system.
- Built the relational database (source system) for the quoted example (Sales System) in Oracle.
- Built the Staging Area in Oracle.
- Built the Data Warehouse in Oracle (multi-dimensional design in a relational database).
- Created views for the source tables (transparency).
- Created synonyms for the views (as the source tables were on a different server).

In-depth Analysis (Contd.)
- Wrote SQL scripts to first move data from the synonyms created to the Staging Area.
- Wrote SQL scripts and procedures to move data from the Staging Area to the Data Warehouse.
- Data was moved first from the Staging Area tables to the dimension tables, namely Product, Location and Customer.
- The Time dimension table was populated with 10 years of data. Additional scripts were written to populate the Time dimension with data every year.
- Data was moved from the Staging Area to the fact table (core table).
- Wrote scripts to check the consistency of the data. These scripts checked the total records moved from the source system to the Staging Area and from the Staging Area to the Data Warehouse. Additionally, they checked the total amount moved from the database to the Data Warehouse.
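The time dimension load and the consistency checks described above can be approximated as in the sketch below. The paper's scripts were written for Oracle and are not reproduced in the slides, so everything here is an assumption: SQLite is used in their place, and the table names, the year range 2020 to 2029 and the helpers populate_time_dimension and reconcile are hypothetical.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_time   (day TEXT PRIMARY KEY, month INTEGER, year INTEGER);
CREATE TABLE stg_sales  (order_id INTEGER, day TEXT, amount REAL);
CREATE TABLE fact_sales (order_id INTEGER, day TEXT REFERENCES dim_time(day), amount REAL);
INSERT INTO stg_sales VALUES
    (1, '2020-02-29', 120.0),
    (2, '2021-07-04', 80.5),
    (3, '2029-12-31', 10.0);
""")

def populate_time_dimension(conn, year):
    """Insert one row per calendar day of the given year.
    Re-running for the same year is harmless (INSERT OR IGNORE)."""
    rows = []
    current = date(year, 1, 1)
    while current.year == year:
        rows.append((current.isoformat(), current.month, current.year))
        current += timedelta(days=1)
    with conn:
        conn.executemany("INSERT OR IGNORE INTO dim_time VALUES (?, ?, ?)", rows)
    return len(rows)

# Initial load: ten years. Later years can be added by calling the
# same function once per new year.
for year in range(2020, 2030):
    populate_time_dimension(conn, year)

# Move the staged sales into the fact table.
with conn:
    conn.execute("INSERT INTO fact_sales SELECT order_id, day, amount FROM stg_sales")

def reconcile(conn, source_table, target_table):
    """Compare record counts and total amounts between two tables.
    Table names are assumed to be trusted constants."""
    totals = {}
    for table in (source_table, target_table):
        totals[table] = conn.execute(
            f"SELECT COUNT(*), COALESCE(SUM(amount), 0) FROM {table}"
        ).fetchone()
    (src_count, src_sum), (tgt_count, tgt_sum) = totals[source_table], totals[target_table]
    return {
        "rows_match": src_count == tgt_count,
        "amount_match": abs(src_sum - tgt_sum) < 1e-9,
    }

time_rows = conn.execute("SELECT COUNT(*) FROM dim_time").fetchone()[0]
report = reconcile(conn, "stg_sales", "fact_sales")
print(time_rows, report)
```

Checking both the record count and the summed amount catches different failures: a dropped row changes the count, while a truncated or mis-converted value changes only the total.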

Conclusion
- The value of the Data Warehouse can only be realized through OLAP analysis and Data Mining.
- Data Conversion is one of the most critical processes in implementing a Data Warehouse.
- The warehouse holds information that is of great value to the enterprise.
- The data conversion process must be done effectively and efficiently.