Data Integration Best Practices

(Healthy Habits for SAS Data Integration Studio Users)

Abstract: Version 9 of the SAS System offers tools to help developers and business users manage and organise the wealth of data and processes that face SAS professionals today. SAS Data Integration Studio has many features that support healthy habits for data integration, but they can only 'be of use' if they are 'being used'. DI Studio allows customisation of the custom tree, error monitoring, job status handling, data validation, conformed data model support, self-documentation, and role assignment. Identifying the benefits of these functions is often enough to motivate users towards controlled and organised methods of working. This paper describes examples of best practice for developing data integration suites, to ensure that quality, efficiency and resilience are built into the heart of your enterprise's information estate.

Subjects:
- Data Integration Structure
- Data Integration Organisation
- Capture Control (CCT Tables)
- Error Monitoring
- Data Validation
- Data Protection (Scrambler)
- Conformed Modelling
- SQL Optimisation
- Self Documentation
- Role Assignment
- Rename Standard Transforms

SAS DI Studio Version 3.4 under SAS Intelligence Platform 9.1.3

Data Integration Structure
Challenge: How can you best deliver Business Intelligence from a variety of source systems across a diverse consumer base?
Solution: Employ a Data Integration flow structure:
1. Source Systems
2. Detailed Data Model
3. Subject Specific Data Marts
4. Subject Specific Business Intelligence

Data Integration Organisation
Challenge: How can you keep track of the thousands of jobs typically created in a data integration suite?
Solution: Utilise the custom tree in SAS Data Integration Studio.

Data Integration Organisation
- Create folders for each integration layer.
- Subdivide them by Jobs, Libraries and Tables.
- Number the folders to preserve order.
- Stick to the methodology (e.g. don't transform in the capture layer).

Capture Control
Challenge: How can I perform incremental extracts from several source systems?
Solution: Define Capture Control Tables for each source table, holding:
- Status (Started, Failed, or Success): to ensure smooth running of the DI suite.
- From/To Datetimes: to extract against the last-updated column in the database. Also useful for determining processing times as data increases day by day.
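As a minimal sketch of how a capture control table might drive an incremental extract, the SAS code below builds a tiny stand-in source table and a one-row CCT in WORK, then extracts only the rows updated after the last successful window. The table name policy, the columns policy_id, last_updated, status, from_dttm and to_dttm, and the _delta output name are all assumptions made for illustration; the slides do not give the CCT layout.

```sas
/* Hypothetical names throughout: policy, last_updated, from_dttm, to_dttm. */
%let source_table = policy;

/* Tiny stand-in for a source system table. */
data work.&source_table;
    format last_updated datetime20.;
    policy_id = 1; last_updated = '01MAR2009:08:00:00'dt; output;
    policy_id = 2; last_updated = '02MAR2009:09:30:00'dt; output;
run;

/* One-row capture control table (CCT) for this source. */
data work.&source_table._cct;
    length status $8;
    format from_dttm to_dttm datetime20.;
    status    = 'Success';
    from_dttm = '28FEB2009:00:00:00'dt;
    to_dttm   = '01MAR2009:12:00:00'dt;   /* end of the last successful window */
run;

/* Read the end of the last window and fix the end of this one. */
proc sql noprint;
    select to_dttm format=best32. into :last_to
        from work.&source_table._cct;
quit;
%let this_to = %sysfunc(datetime());

/* Extract only rows updated since the last run. */
proc sql;
    create table work.&source_table._delta as
    select *
        from work.&source_table
        where last_updated >  &last_to
          and last_updated <= &this_to;
quit;
```

With the sample rows above, only policy_id 2 falls inside the new window. In a real suite the source and control tables would sit in their own libraries rather than WORK.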

Capture Control
Send the job status to a dataset with the same name as the job.

Capture Control
Only extract records which have been updated since the last run.
[Diagram: Source Systems, Capture Job (with Pre and Post steps), Conformed Model, CoreInfo Tables]

Capture Control Pre-Processing
1. Is this the first time the job has run successfully today? If not, warn that duplicate facts will occur.
2. Did the previous run fail, or not finish? If so, warn that this is a replacement run.
3. Update the dates in the CCT table for this source (&source_table._cct).

Capture Control Post-Processing
1. Did the job run successfully? If not, update the CCT table with Status = Failed.
2. Otherwise, update the dates in the CCT table for this source (&source_table._cct).
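The post-processing decision could be sketched as a small macro along the following lines. This is not the authors' code: the macro name cct_post, its parameters, and the status and to_dttm columns are hypothetical, and the CCT is assumed to exist already in the library passed as lib=. The automatic macro variable SYSCC is used as the success test, treating anything above a warning (4) as a failure.

```sas
/* Hypothetical post-processing macro; column and parameter names are assumptions. */
%macro cct_post(source_table=, lib=work, job_rc=0, this_to=);
    proc sql;
        update &lib..&source_table._cct
        %if &job_rc <= 4 %then %do;
            /* Success (or warnings only): move the high-water mark forward. */
            set status  = 'Success',
                to_dttm = &this_to
        %end;
        %else %do;
            /* Failure: leave the dates alone so the next run re-extracts. */
            set status = 'Failed'
        %end;
        ;
    quit;
%mend cct_post;

/* Example call at the end of a capture job. */
%cct_post(source_table=policy, job_rc=&syscc, this_to=%sysfunc(datetime()));
```

Leaving the dates untouched on failure means the next run repeats the same extract window, which matches the "replacement run" warning in pre-processing.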

Error Monitoring
Challenge: How can I keep my production support department informed of job failures/successes?
Solution: Email job statistics to a designated mailbox.
- Create a User Transform called _Stats.
- Add the _Stats transform to each job.

Error Monitoring
Add the _Stats transform to the job.

Error Monitoring
- Drag the target table to one input.
- Drag _Stats to the other input (the _Stats table contains the addresses of recipients).
- Don't hard-code addresses. What happens when people leave?
- Use different recipients for dev/prod.

Error Monitoring
_Stats transform properties: only emails if the job has failed.

Error Monitoring
The last job in the flow always sends to Admin & Support: set Last Job to Yes.
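The slides do not show the code behind the _Stats transform. One way such a step might work is sketched below, assuming a recipients table with a character column called address, a session already configured for SMTP mail (the EMAILSYS and EMAILHOST system options), and the hypothetical macro name mail_stats. Reading addresses from a table keeps them out of the job code, as the slides recommend.

```sas
/* Hypothetical sketch; table, column and macro names are assumptions. */
%macro mail_stats(job_name=, last_job=No, recipients_table=work.stats_recipients);
    %local recipients;

    /* Send only on failure, unless this is the last job in the flow. */
    %if &syscc > 4 or %upcase(&last_job) = YES %then %do;
        /* Build a quoted, space-separated list of addresses from the table. */
        proc sql noprint;
            select quote(trim(address)) into :recipients separated by ' '
                from &recipients_table;
        quit;

        filename outmail email
            to=(&recipients)
            subject="&job_name finished with condition code &syscc";

        data _null_;
            file outmail;
            put "Job: &job_name";
            put "Condition code: &syscc";
        run;

        filename outmail clear;
    %end;
%mend mail_stats;
```

Pointing recipients_table at a different table in each environment gives separate dev and prod mailing lists without touching the job.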

Data Validation
Challenge: How can I ensure only clean data gets loaded into the warehouse?
Solution: Use the Data Validation transformation.
- Use the standard Invalid, Missing and Duplicate tabs.
- Employ custom validation and apply a severity rating: 1 = Exclusion, 2 = Correction, 3 = Improvement.
- Store exceptions in a permanent dataset for further analysis.

Data Validation
E.g. check for truncation of key columns.

Data Validation
1) Create each condition.
2) Determine validation.
3) Define corrective action if required.
4) This gets written to the temporary dataset ETLS_EXCEPTIONS.
5) Run the %Append_Data_Quality macro in post-processing.
6) Use BI tools to investigate data quality issues (e.g. a particular source system requires cleansing).

Data Validation
%Append_Data_Quality macro logic:
- Does ETLS_EXCEPTIONS exist?
- No: halt the macro, as there are no errors to process.
- Yes: append the exceptions to the permanent table DQ_Error_Event.
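The macro logic above could be sketched as follows. This is an illustration, not the authors' %Append_Data_Quality macro: the name append_dq_sketch, the dq libref and the assumption that ETLS_EXCEPTIONS lives in WORK are all hypothetical.

```sas
/* Hypothetical sketch; the dq libref is assumed to be assigned already. */
%macro append_dq_sketch(base=dq.DQ_Error_Event);
    /* No exceptions table means the validation step found nothing to report. */
    %if not %sysfunc(exist(work.ETLS_EXCEPTIONS)) %then %do;
        %put NOTE: ETLS_EXCEPTIONS not found, no exceptions to append.;
        %return;
    %end;

    /* FORCE lets the append proceed if the two tables differ in columns or lengths. */
    proc append base=&base data=work.ETLS_EXCEPTIONS force;
    run;
%mend append_dq_sketch;

%append_dq_sketch()
```

In practice the exceptions would usually need mapping onto the DQ_Error_Event columns before the append; FORCE only stops the step failing on mismatches, it does not rename anything.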

Data Validation
Table properties for DQ_ERROR_EVENT.

Column name | Description | Type (Length)
Row_Extraction_Date | Date-timestamp when the row was exported or extracted from the source system. | Num (8)
Exception_Event_Date | Date-timestamp when the exception was identified by the data warehouse processes. | Num (8)
Job_Name | The name of the ETL job which identified the exception. | Char (64)
Table_Name | The library and table name which contains the row and column containing the exception. | Char (41)
Row_Number | The row number containing the exception. | Num (8)
Column_Name | The column name containing the datum of the exception. | Char (32)
Screen_Description | The screen (data quality test) description. | Char (256)
Exception_Description | Standardised description of the exception. | Char (256)
Exception_Action | Automated data conform action (if any). | Char (256)
Exception_Severity | The severity level of the DQ Error Event (1 = Exclusion, 2 = Correction, 3 = Improvement). | Num (8)
Unconformed_ValueN | Original value (numeric) before conforming. | Num (8)
Conformed_ValueN | Conformed (numeric) value. | Num (8)
Unconformed_ValueC | Original value (character) before conforming. | Char (256)
Conformed_ValueC | Conformed (character) value. | Char (256)
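For reference, the table properties above translate into PROC SQL roughly as follows. The dq libref and the datetime20. display formats are assumptions; the column names, types and lengths are taken from the table.

```sas
/* dq is a hypothetical, already-assigned permanent library. */
proc sql;
    create table dq.DQ_Error_Event
    (
        Row_Extraction_Date   num format=datetime20.,
        Exception_Event_Date  num format=datetime20.,
        Job_Name              char(64),
        Table_Name            char(41),
        Row_Number            num,
        Column_Name           char(32),
        Screen_Description    char(256),
        Exception_Description char(256),
        Exception_Action      char(256),
        Exception_Severity    num,
        Unconformed_ValueN    num,
        Conformed_ValueN      num,
        Unconformed_ValueC    char(256),
        Conformed_ValueC      char(256)
    );
quit;
```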

Data Scrambling
Challenge: How can I ensure I'm not holding sensitive production data on development/test systems?
Solution: Use data scrambling routines in non-production environments.
Development source systems are often created using production data, and warehouses can propagate the risk of breaching the Data Protection Act.

Data Scrambling
Custom transform: the %data_scrambler macro allows columns to be scrambled or passed through normally.

Data Scrambling
Custom transform, Edit Parameters:
- Select Pass for key fields: don't scramble them!
- Scramble method: Ranuni function, MD5 function or Translate function.
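The internals of %data_scrambler are not shown in the slides. As a rough illustration of what the three scramble methods can look like in a DATA step, the sketch below uses a made-up customer table; the table, its columns and the particular perturbation and digit mapping are all assumptions.

```sas
/* Hypothetical sample data. */
data work.customer;
    length customer_id 8 surname $20 phone $12;
    input customer_id surname $ phone $ balance;
    datalines;
1 Smith 01234567890 100.50
2 Jones 09876543210 250.00
;
run;

data work.customer_scrambled;
    set work.customer;
    length surname_hash $32;

    /* Pass: customer_id is a key field, so it is left unchanged. */

    /* Ranuni: perturb a numeric value by up to 10 percent either way. */
    balance = round(balance * (0.9 + 0.2 * ranuni(0)), 0.01);

    /* MD5: replace text with a 32-character hexadecimal digest. */
    surname_hash = put(md5(strip(surname)), $hex32.);

    /* Translate: substitute each digit with another digit. */
    phone = translate(phone, '5927018364', '0123456789');

    drop surname;
run;
```

Note that a fixed digit substitution is easy to reverse and an unsalted MD5 of a short value can be looked up, so these methods obscure data rather than protect it cryptographically.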

Data Scrambling
What about Production?
%let liveenvironment = PROD;
%let thisenvironment = %sysfunc(substr(%sysfunc(upcase(%sysfunc(getoption(metaserver)))),1,4));
- Don't perform the scramble routine if thisenvironment = liveenvironment.
- When running in Dev, the METASERVER option should be different.
- Alternatively, you could set up a table holding the environment value.
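A guard built on those two macro variables might look like the sketch below. The macro name scramble_guard is hypothetical, and the comparison only works if the production metadata server's host name begins with the letters PROD, which is the naming assumption the slide itself makes. The METASERVER option must also be set, otherwise the SUBSTR call has nothing to work on.

```sas
%let liveenvironment = PROD;
%let thisenvironment = %sysfunc(substr(%sysfunc(upcase(%sysfunc(getoption(metaserver)))),1,4));

/* Hypothetical wrapper around the scramble step. */
%macro scramble_guard;
    %if &thisenvironment = &liveenvironment %then
        %put NOTE: Live environment detected, scramble routine skipped.;
    %else %do;
        %put NOTE: Non-production environment, scrambling sensitive columns.;
        /* The call to the scrambling routine would go here. */
    %end;
%mend scramble_guard;

%scramble_guard
```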

Conformed Model
Challenge: How can I track trends in my data when the source systems don't hold history?
Solution: Use a conformed data model in a warehouse, using slowly changing dimensions where appropriate.
- Re-usable dimensions
- Fact tables

Conformed Model
In the Integrate layer, use the SCD Type II Loader transform to make use of effective date processing.

Conformed Model
In the Integrate layer, use the Surrogate Key Generator to determine keys for dimension tables.
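To illustrate the idea behind surrogate keys (not the code that the Surrogate Key Generator transform actually produces), the sketch below assigns the next unused integer key to each new dimension row. The tables agent_dim and new_agents and their columns are invented for the example.

```sas
/* Hypothetical existing dimension with surrogate key agent_key. */
data work.agent_dim;
    length agent_key 8 agent_code $8;
    input agent_key agent_code $;
    datalines;
1 A100
2 A200
;
run;

/* Hypothetical new business keys arriving from the source system. */
data work.new_agents;
    length agent_code $8;
    input agent_code $;
    datalines;
A300
A400
;
run;

/* Find the highest key in use (0 if the dimension is empty). */
proc sql noprint;
    select coalesce(max(agent_key), 0) into :max_key
        from work.agent_dim;
quit;

/* Number the new rows upwards from the current maximum. */
data work.new_agent_rows;
    set work.new_agents;
    agent_key = &max_key + _n_;
run;

proc append base=work.agent_dim data=work.new_agent_rows;
run;
```

Here A300 and A400 receive keys 3 and 4. The surrogate key carries no business meaning, which is what allows a Type II dimension to hold several historical rows for the same business key.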

SQL Optimisation
Challenge: How can I ensure the best possible SQL performance is achieved through my SQL Join transform?
Solution: Use the undocumented _Method option on the SQL procedure to see how the query will be processed.

SQL Optimisation
_Method option (SAS Note 33604)
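A minimal example of the option in use, run against the SASHELP sample tables rather than any warehouse table from the slides:

```sas
/* _METHOD writes the chosen execution plan to the SAS log. */
proc sql _method;
    create table work.class_join as
    select a.name, a.age, b.weight
        from sashelp.class as a
        inner join sashelp.classfit as b
        on a.name = b.name;
quit;
```

The log then shows a small tree of internal step codes. Commonly reported ones include sqxjm (sort-merge join), sqxjhsh (hash join), sqxjndx (index join) and sqxjsl (step-loop join), alongside sqxsrc for each source table and sqxsort for sorts. Because the option is undocumented, these codes may differ between releases.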

Self Documentation
Challenge: How can I ensure the executed warehouse code is documented to an acceptable standard?
Solution: DI Studio self-documents the code, based on descriptions in the job and transform properties.

Self Documentation
- Use meaningful job names.
- Write descriptions of why, not just what.

Self Documentation
Use Notes and Document Attachments.

Self Documentation
Descriptions and Notes are propagated through to the executable code, benefitting production support teams.

Role Assignment
Challenge: How can I record who is responsible for which job or entity?
Solution: Use Role Assignment in DI Studio.

Role Assignment
Allocate names and roles where required.

Rename Standard Transforms
Challenge: How can I keep track of processing in a job which has a lot of transformations?
Solution: Don't keep the default transform names; rename each one to something meaningful, e.g. rename "SQL Join" to "Merge Agent_Dim with Broker_Dim".

Contributors
Mick Collington
Jethro Day
Steve Morton
Nick Treadgold
Data Integration Developer Group (SAS Professionals)
Julien Heijster
John Robertson
forum/topics/data-integration-best
SAS.COM


1. Attempt any two of the following: 10 a. State and justify the characteristics of a Data Warehouse with suitable examples. Instructions to the Examiners: 1. May the Examiners not look for exact words from the text book in the Answers. 2. May any valid example be accepted - example may or may not be from the text book 1. Attempt

More information

My SAS Grid Scheduler

My SAS Grid Scheduler ABSTRACT Paper 1148-2017 My SAS Grid Scheduler Patrick Cuba, Cuba BI Consulting No Batch Scheduler? No problem! This paper describes the use of a SAS DI Studio job that can be started by a time dependent

More information
