
Unit 4.1

DATAWAREHOUSING UNIT 4 CHAPTER 1

1. Extract, Transform, and Load Basics: ETL, Manual ETL processes, Staging, To stage or not to stage, Configuration of a staging area, Mappings and operators in OWB, The canvas layout, OWB operators, Source and target operators, Data flow operators, Pre/post-processing operators.

ETL is the first step in building the mappings from source to target. We have sources and targets defined, and now we need to:

1. Work on extracting the data from our sources.
2. Perform any transformations on that data (to clean it up or modify it).
3. Load it into our target data warehouse structure.

Oracle Database provides various methods to load data into a target system. One of them is SQL*Loader, an application provided by Oracle that loads data from flat files. The data will have to be manipulated when it is copied. This means we need to develop code that can perform this rather complex task, depending on the manipulations that need to be done.

In a nutshell, this is the process of extract, transform, and load. We have to:

1. Extract the data from the source system by some method.
2. Load flat files using SQL*Loader or via a direct database link. Then we have to transform that data with SQL or PL/SQL code in the database to match and fit it into the target structure.
3. Finally, we have to load it into the target structure.
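The three steps above can be sketched in a few lines. This is only an illustration under stated assumptions: Python and an in-memory SQLite database stand in for SQL*Loader, PL/SQL, and the Oracle target, and the file contents, the sales table, and the function names are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file extract from a source system (illustrative data only).
SOURCE_FILE = io.StringIO(
    "order_id,store,amount\n"
    "1, north ,10.50\n"
    "2,SOUTH,4.25\n"
)


def extract(flat_file):
    """Step 1: read the raw rows from the flat file."""
    return list(csv.DictReader(flat_file))


def transform(rows):
    """Step 2: clean the rows so that they fit the target structure."""
    return [
        (int(r["order_id"]), r["store"].strip().upper(), float(r["amount"]))
        for r in rows
    ]


def load(conn, rows):
    """Step 3: load the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (order_id INTEGER, store TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_FILE)))
loaded = conn.execute(
    "SELECT order_id, store, amount FROM sales ORDER BY order_id"
).fetchall()
```

Each step is a separate function, so any one of them can be replaced (for example, a different extract method) without touching the others.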

Staging

Staging is the process of copying the source data temporarily into a table (or tables) in our target database. There we can perform any transformations that are required before loading the source data into the final target tables. The source and target designations will be affected during the intermediate steps of staging, so we'll need to decide on a staging strategy, if any, before designing the ETL in OWB.
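As a rough sketch of the staging idea, the following copies raw source rows into a staging table, transforms them with SQL on the way to the final table, and then clears the staging table. SQLite is used here only so that the example is self-contained; the stg_customers and customers tables and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_customers (cust_id TEXT, name TEXT);   -- staging: raw copy
    CREATE TABLE customers (cust_id INTEGER, name TEXT);    -- final target
""")

# Copy the source rows into staging unchanged (hypothetical sample data).
source_rows = [("001", "  alice "), ("002", "BOB")]
conn.executemany("INSERT INTO stg_customers VALUES (?, ?)", source_rows)

# Transform while moving the rows from staging to the final target table.
conn.execute("""
    INSERT INTO customers (cust_id, name)
    SELECT CAST(cust_id AS INTEGER), UPPER(TRIM(name))
    FROM stg_customers
""")

# Staging is temporary, so clear it once the load has succeeded.
conn.execute("DELETE FROM stg_customers")
conn.commit()

final_rows = conn.execute(
    "SELECT cust_id, name FROM customers ORDER BY cust_id"
).fetchall()
staged_left = conn.execute("SELECT COUNT(*) FROM stg_customers").fetchone()[0]
```

Note how the staging table is the target of the first step and the source of the second, which is what is meant by the source and target designations changing during staging.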

To stage or not to stage

The points to consider to keep the process flowing as fast as possible are:

1. The amount of source data we will be dealing with.
2. The amount of manipulation of the source data that will be required.
3. If the source data is in a database other than an Oracle Database, the reliability of the connection to that database and the performance of the link while pulling data across.

If a failure occurs during an intermediate step of the ETL process, we will have to restart the process. If such a failure occurs, we will have to consider the severity of the impact, as in the following cases:

1. Going back again to the source system to pull data if the first attempt failed.
2. The source data is changing while we are trying to load it into the warehouse, meaning that whatever data we pull the second time might be different from what we started with (and which caused the failure). This condition will make it difficult to debug the error that caused the failure.
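The second case can be made concrete with a small sketch. It assumes a hypothetical live source held in a Python list and a stg_orders staging table in SQLite. The point it tries to show is that a retry against staged data sees exactly the rows that caused the failure, even after the source has changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_orders (order_id INTEGER, amount TEXT)")

# A hypothetical live source whose contents change between reads.
live_source = [(1, "10.0"), (2, "n/a")]

# Extract once into staging: this freezes the data we are working on.
conn.executemany("INSERT INTO stg_orders VALUES (?, ?)", live_source)
conn.commit()


def transform_from_staging(conn, bad_value_default=None):
    """Convert staged amounts to floats; fail on bad values unless a default is given."""
    result = []
    staged = conn.execute(
        "SELECT order_id, amount FROM stg_orders ORDER BY order_id"
    ).fetchall()
    for order_id, amount in staged:
        try:
            result.append((order_id, float(amount)))
        except ValueError:
            if bad_value_default is None:
                raise
            result.append((order_id, bad_value_default))
    return result


# The first attempt fails on the bad row.
try:
    transform_from_staging(conn)
    first_attempt_failed = False
except ValueError:
    first_attempt_failed = True

# Meanwhile the source system moves on and the bad value disappears there.
live_source[1] = (2, "7.5")

# The retry still sees the staged rows that caused the failure, so it can be debugged.
retry_rows = transform_from_staging(conn, bad_value_default=0.0)
```

Without the staging table, the second pull would have returned the changed row and the original error could not have been reproduced.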

Configuration of a staging area

A staging area is clearly an advantage when designing our ETL, so we'll want to create one. We will need to decide where to create it: in the database or outside the database.

Mappings and operators in OWB

OWB provides features for designing and building our ETL process. OWB handles this with what are called mappings. A mapping is composed of a series of operators that describe the sources, targets, and a series of operations that flow from source to target to load the data. It is all designed in a graphical manner using the Mapping Editor, which is available from the Design Center.
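To make the idea of a mapping as a series of operators more tangible, here is a loose analogy in Python. It is not how OWB implements mappings (OWB mappings are designed graphically); the operator functions, the sample rows, and run_mapping are hypothetical names chosen for the sketch.

```python
from functools import reduce


def source_operator():
    """Source: returns rows from a hypothetical source table."""
    return [
        {"sku": "a1", "qty": 2},
        {"sku": "b2", "qty": 0},
        {"sku": "a1", "qty": 3},
    ]


def filter_operator(rows):
    """Data flow: keep only rows with a positive quantity."""
    return [r for r in rows if r["qty"] > 0]


def expression_operator(rows):
    """Data flow: derive an upper-cased SKU."""
    return [{**r, "sku": r["sku"].upper()} for r in rows]


def run_mapping(source, operators, target):
    """Push rows from the source through each operator in order, then into the target."""
    rows = reduce(lambda data, op: op(data), operators, source())
    target.extend(rows)
    return target


target_table = run_mapping(
    source_operator, [filter_operator, expression_operator], []
)
```

The order of the operator list plays the role of the connecting lines drawn between operators on the canvas.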

Mappings are created in the Mappings node. We can find it under the module we created to hold our data warehouse design, under the Oracle node beneath Databases in our project. Expand that module, which we called ACME_DWH, and then expand the Mappings node underneath it. The DATE_DIM_MAP we see under Mappings is the mapping that was created for us automatically by the Time Dimension wizard.

Double-click on the DATE_DIM_MAP mapping. It will launch the Mapping Editor and load DATE_DIM_MAP into it. Click on the Auto Layout button in the toolbar to spread everything out.

The windows of the Mapping Editor are:

Mapping: The Mapping window is the main working area on the right where we will design the mapping. This window is also referred to as the canvas.

Explorer: This window is similar to the Project Explorer window from the Design Center, and is the same as the window in the Data Object Editor. It has the same two tabs, the Available Objects tab and the Selected Objects tab.

Selected Objects: This tab displays all the objects currently defined in our mapping. When an object is selected in the canvas, the Selected Objects window will scroll to that object and highlight it.

Available Objects: This tab is almost like the Project Explorer. It displays objects defined elsewhere in our project, and they can be dragged and dropped into this mapping.

Mapping properties

The mapping properties window displays the various properties that can be set for objects in our mapping. To investigate the properties window a little more closely, let's select the DATE_INPUTS operator. We can scroll the Explorer window until we see the operator and then click on it, or we can scroll the main canvas until we see it and then click on the top portion of the frame to select it. It is the first object on the left and defines inputs into DATE_DIM_MAP. It is visible in the previous image. After clicking on it, all the properties will be displayed in the mapping properties window.

Click on one of the attributes, YEAR_START_DATE, within the DATE_INPUTS operator.

The canvas layout

In the canvas, we'll take a look at the operator on the far left, called DATE_INPUTS. This operator happens to be a Mapping Input Parameter operator.

Alongside the DATE_INPUTS operator is an operator called DAY_TABLE_FUNCTION. It has both input and output attributes, as shown. It takes the input attributes in the INGRP1 group as parameters to the function (input) and returns the value indicated in the OUTGRP1 group as the return value from the function (output).

OWB operators

Types of operators:

1. Source and target operators
2. Data flow operators
3. Pre/post-processing operators

All of the operators are available to us from the Palette window in the Mapping Editor.

Source and target operators

Cube Operator: represents a cube.
Dimension Operator: represents previously defined dimensions.
External Table Operator: represents external tables, which are used to access data stored in flat files.
Table Operator: represents a table in the database.
Constant: represents a constant value that is needed.
View Operator: represents a database view. Source data is frequently retrieved via a view in the source database that can pull data from multiple sources into a single, easily accessible view.
Sequence Operator: an automatic generator of sequential unique numbers, most often used for populating a primary key field.
Construct Object: can be used to actually construct an object in our mapping.
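The role of the Sequence Operator can be illustrated with a short sketch. Here Python's itertools.count stands in for a database sequence, and the source names and the customer_key_seq variable are hypothetical.

```python
import itertools

# A sequence hands out the next unique number each time it is asked.
customer_key_seq = itertools.count(start=1)

# Hypothetical source rows that arrive without a key.
source_names = ["Alice", "Bob", "Carol"]

# Populate the primary key field from the sequence while loading.
dimension_rows = [(next(customer_key_seq), name) for name in source_names]

# A later load continues from where the sequence left off, so keys stay unique.
later_row = (next(customer_key_seq), "Dave")
```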

We can see three Construct Object operators in DATE_DIM_MAP: one for a calendar month (CONSTRUCT_OBJECT_CAL_MONTH), one for a calendar quarter (CONSTRUCT_OBJECT_CAL_QUARTER), and one for a calendar year (CONSTRUCT_OBJECT_CAL_YEAR). If we click on the attribute in the OUTGRP1 group of one of those Construct Object operators, we can see in the Attribute Properties window on the left that it is of type SYS_REFCURSOR. A SYS_REFCURSOR is a PL/SQL type that represents a cursor in PL/SQL. A cursor is used to point to a row in the result of the query that is defined for that cursor.
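The general notion of a cursor pointing to a row of a query result can be seen with any database API. The sketch below uses Python's sqlite3 cursor as an analogy only; it is not a SYS_REFCURSOR, and the cal_month table and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cal_month (month_no INTEGER, month_name TEXT)")
conn.executemany(
    "INSERT INTO cal_month VALUES (?, ?)",
    [(1, "January"), (2, "February"), (3, "March")],
)

# The cursor is defined by its query and tracks the current position in the result.
cursor = conn.execute(
    "SELECT month_no, month_name FROM cal_month ORDER BY month_no"
)

first_row = cursor.fetchone()    # returns row 1 and advances the cursor
second_row = cursor.fetchone()   # returns row 2 and advances again
remaining = cursor.fetchall()    # everything not yet fetched
```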

Data flow operators

When we need to transform the source data into a new structure, we use data flow operators.

Aggregator: used when source data is at a finer level of detail than we need (SQL GROUP BY clause with an aggregate function).
Deduplicator: loads only unique combinations of data (SQL DISTINCT keyword).
Expression: represents an SQL expression that can be applied to the output to produce the desired result.
Filter: limits the rows from an output set to criteria that we specify (WHERE clause).
Joiner: implements an SQL join on two or more input sets of data (SQL join).
Key Lookup: looks up data in a table based on some input criteria (the key) to return some information required by our mapping.
Pivot: useful if we have source records whose data is spread across multiple columns instead of rows.

Set Operation: allows us to perform an SQL set operation (union, intersect).
Splitter: the opposite of the Joiner operator; it splits one input into multiple outputs.
Transformation Operator: can be used to invoke a PL/SQL function or procedure with some of our source data as input to provide a transformation of data, for instance the SQL trim() function.
Table Function Operator: represents a Table Function, which is defined in PL/SQL and is a function that can be queried like a table to return rows of information.
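Several of the data flow operators listed above correspond directly to SQL constructs, and the following sketch runs those constructs against a tiny in-memory SQLite database. It shows only the underlying SQL, not OWB itself; the sales and stores tables, their contents, and the q helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (store_id INTEGER, product TEXT, amount REAL);
    CREATE TABLE stores (store_id INTEGER, city TEXT);
    INSERT INTO sales VALUES (1, ' pen ', 2.0), (1, ' pen ', 2.0),
                             (1, 'ink', 5.0), (2, 'pen', 3.0);
    INSERT INTO stores VALUES (1, 'Leeds'), (2, 'York');
""")


def q(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# Aggregator: GROUP BY with an aggregate function.
aggregated = q(
    "SELECT store_id, SUM(amount) FROM sales GROUP BY store_id ORDER BY store_id"
)

# Deduplicator: DISTINCT keeps only unique combinations.
deduplicated = q(
    "SELECT DISTINCT store_id, product, amount FROM sales ORDER BY store_id, amount"
)

# Filter: WHERE limits rows to the criteria we specify.
filtered = q("SELECT product FROM sales WHERE amount > 2.5 ORDER BY amount")

# Joiner: JOIN combines two input sets.
joined = q("""
    SELECT st.city, SUM(s.amount)
    FROM sales s JOIN stores st ON st.store_id = s.store_id
    GROUP BY st.city ORDER BY st.city
""")

# Set Operation: UNION merges two sets and removes duplicates.
unioned = q(
    "SELECT store_id FROM sales UNION SELECT store_id FROM stores ORDER BY store_id"
)

# Transformation: apply a function such as TRIM to the source data.
trimmed = q("SELECT DISTINCT TRIM(product) FROM sales ORDER BY 1")

# Splitter: route one input to several outputs by condition.
small = q("SELECT product FROM sales WHERE amount < 3 ORDER BY product")
large = q("SELECT product FROM sales WHERE amount >= 3 ORDER BY product")
```

In the Splitter case the two queries together cover every input row exactly once, which is the sense in which it is the opposite of a Joiner.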

Pre/post-processing operators

There is a small group of operators that allow us to perform operations before the mapping process begins, or after the mapping process ends.

Mapping Input Parameter: allows us to pass one or more parameters into a mapping process.
Mapping Output Parameter: provides a value as output from our mapping.
Post-Mapping Process: allows us to invoke a function or procedure after the mapping completes its processing (for example, cleanup such as deleting all the records).
Pre-Mapping Process: allows us to invoke a function or procedure before the mapping process begins.
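How these four operators fit around a mapping can be sketched as a function with hooks. This is an analogy in plain Python, not OWB behaviour; run_mapping, min_amount, the hook functions, and the staged amounts are all hypothetical.

```python
def run_mapping(rows, min_amount, pre_process=None, post_process=None):
    """Run a toy mapping with an input parameter, an output value, and optional hooks."""
    if pre_process:
        pre_process()                               # Pre-Mapping Process
    loaded = [r for r in rows if r >= min_amount]   # mapping body uses the input parameter
    if post_process:
        post_process()                              # Post-Mapping Process
    return len(loaded)                              # Mapping Output Parameter


events = []
staging = [5, 12, 40]  # hypothetical staged amounts


def before_mapping():
    """Hook that runs before the mapping begins."""
    events.append("pre-mapping ran")


def after_mapping():
    """Cleanup hook that runs after the mapping: delete all the staged records."""
    staging.clear()
    events.append("post-mapping ran")


rows_loaded = run_mapping(
    list(staging),
    min_amount=10,
    pre_process=before_mapping,
    post_process=after_mapping,
)
```

Here min_amount plays the part of a Mapping Input Parameter and the returned row count plays the part of a Mapping Output Parameter.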

DATAWAREHOUSING UNIT 4 END OF CHAPTER 1