Metadata Based Impact and Lineage Analysis Across Heterogeneous Metadata Sources Presentation at the THE 9TH ANNUAL Wilshire Meta-Data Conference AND THE 17TH ANNUAL DAMA International Symposium by John R. Friedrich, II, Ph.D. jrfried@idworld.net
Outline Concepts Harvesting Mapping Reporting Version and Configuration Management Metadata Management Strategy Standardization Proof of Concept Conclusion Slide 2 or 41
Caveat This presentation is based upon consulting engagements with Meta Integration Technologies, Inc. (MITI) and facilitated using their tools and methods. Meta Integration is the leading "Metadata Component Provider" to major database, data integration, business intelligence, repository, and modeling tool vendors. Meta Integration OEM components & tools manage and analyze the complete enterprise metadata life cycle across heterogeneous vendors. Solutions range from reusable metadata movement components to repository. Over 50 Model Bridges integrate standards like CWM, and the most popular modeling, ETL/EII, and OLAP/BI tools. The Meta Integration components may be found already integrated with its partners tools including the Adaptive Repository, Cognos ReportNet, Embarcadero ERStudio, Informatica Power Center and SuperGlue, and more in the future. All other trademarks, trade names, service marks, and logos referenced herein belong to their respective companies. Slide 3 or 41
Concepts CASE Design Software CASE Design Software CASE Design Software Process & Code Models CASE Repository CASE Repository CASE Repository Data Models Data Schemas Mappings Process Metadata Data Schemas X-froms Process Metadata Data Schemas Mappings Cubes, Joins, etc. Process & Code Models ETL Repository EAI Repository EAI Repository Applications Applications Applications Data Specs Data Specs Enterprise Application Integration Business Intelligence Reporting Data Sources ETL ERP Solution Data Specs Data Sources Data Specs Staging ETL ETL ETL Data Mart Data Mart Data Mart OLAP Cubes OLAP Cubes OLAP Cubes Data Sources ETL Data Warehouse Data Feeds Data Feeds Data Feeds Data Feeds Data Feeds Data Feeds Message Schema XML Schema Slide 4 or 41
Source Systems ETL Data Warehouse Reporting / Business Intelligence M e t a D a t a CM Data Models Process & Code Models Mapping Source Data Schemas Mappings Lifecycle Metadata Target Data Schemas Mapping Staging Data Models DW Data Models Mapping Data Schemas Mappings Cubes, Joins, etc. Computer Aided Software Engineering CASE Design Extract Transform and Load Development ETL Design Computer Aided Software Engineering CASE Design Business Intelligence Design and Reporting BI Design CASE Repository ETL Repository CASE Repository EAI Repository Forward/Reverse Engineering ETL Engine Forward/Reverse Engineering BI Reporting Applications Applications Data Sources Data Sources Data Sources ETL Staging Data Warehouse Data Mart OLAP Cube Slide 5 or 41
GL Model Standards Mapping Standard Finance Logical Model Standards Mapping XML Models AR Model AR Model WS-API Model Data Movement Mapping XML ETL Source AR ETL Source AP ETL Source WS ETL Source ETL Process Metadata ETL Tool Mappings Standards Mapping ETL Target RDBMS Schema Data Movement Mapping Staging Data Model Meta-Data Repository Data Movement Mapping Warehouse Data Model Data Movement Mapping BI Source RDBMS Schema BI Tool Mappings BI Cube Views Queries Dims Reports DTD/ XSD Meta-Data Bridges ER1 MDL XMI PDM XMI Source Systems / ODS Computer Aided Software Engineering ERwin Model Mart Rose Forward / Reverse Engineering ETL Extract Transform and Load Development Designer Informatica Repository Data Warehouse Computer Aided Software Engineering PowerDesigner RDBMS Admin SQL Admin Business Intelligence Business Intelligence Design and Reporting Framework Manager Cognos Repository Accts Rcv l Accts Pay Informatica ETL Engine Forward/Reverse Engineering ReportNet AR AP AP RDBMS XML API Data Feeds GL Finance WS API AR RDBMS WS-API XML Staging Target Staging RDBMS Migrate / Replicate Data Warehouse Data Mart OLAP Cube Slide 6 or 41
Harvesting Metadata survey Source/Target survey Data process survey Reference and standards survey Metadata exchange methodology Repository Structure Extraction Slide 7 or 41
GL Model XML Models BI Cube AR Model AR Model WS-API Model XML ETL Source AR ETL Source AP ETL Source WS ETL Source ETL Process Metadata ETL Target RDBMS Schema Staging Data Model Meta-Data Repositoryh Warehouse Data Model BI Source RDBMS Schema BI Tool Mappings Views Queries Dims Reports DTD/ XSD Meta-Data Bridges ER1 MDL XMI PDM XMI Source Systems / ODS Computer Aided Software Engineering ERwin Model Mart Rose Forward / Reverse Engineering ETL Extract Transform and Load Development Designer Informatica Repository Data Warehouse Computer Aided Software Engineering PowerDesigner RDBMS Admin SQL Admin Business Intelligence Business Intelligence Design and Reporting Framework Manager Cognos Repository Accts Rcv l Accts Pay Informatica ETL Engine Forward/Reverse Engineering ReportNet AR AP AP RDBMS XML API Data Feeds GL Finance WS API AR RDBMS WS-API XML Staging Target Staging RDBMS Migrate / Replicate Data Warehouse Data Mart OLAP Cube Slide 8 or 41
Mapping Data process mapping EAI among models ODS to ETL to DW Staging Staging to DW DW to BI Metadata management mapping To/From standards and reference models Slide 9 or 41
GL Model Standards Mapping Standard Finance Logical Model Standards Mapping XML Models AR Model AR Model WS-API Model Data Movement Mapping XML ETL Source AR ETL Source AP ETL Source WS ETL Source ETL Process Metadata ETL Tool Mappings Standards Mapping ETL Target RDBMS Schema Data Movement Mapping Staging Data Model Meta-Data Repository Data Movement Mapping Warehouse Data Model Data Movement Mapping BI Source RDBMS Schema BI Tool Mappings BI Cube Views Queries Dims Reports DTD/ XSD Meta-Data Bridges ER1 MDL XMI PDM XMI Source Systems / ODS Computer Aided Software Engineering ERwin Model Mart Rose Forward / Reverse Engineering ETL Extract Transform and Load Development Designer Informatica Repository Data Warehouse Computer Aided Software Engineering PowerDesigner RDBMS Admin SQL Admin Business Intelligence Business Intelligence Design and Reporting Framework Manager Cognos Repository Accts Rcv l Accts Pay Informatica ETL Engine Forward/Reverse Engineering ReportNet AR AP AP RDBMS XML API Data Feeds GL Finance WS API AR RDBMS WS-API XML Staging Target Staging RDBMS Migrate / Replicate Data Warehouse Data Mart OLAP Cube Slide 10 or 41
Source Message Message Class: Employee Attribute: Employee Number DataType: Char(10) Attribute: Name DataType: Char(50) Attribute: Address DataType: Text() Attribute: Age DataType: Number(2) Attribute: Race DataType: Char(2) Table: Employee Attribute: Unique ID DataType: LongInt Attribute: Employee Number DataType: Varchar(10) Attribute: Date of Employment DataType: Datetime Attribute: Status DataType: Char(10) Target Schema Table: Person Attribute: Unique ID DataType: LongInt Attribute: Name DataType: Varchar(50) Attribute: DOB DataType: Datetime Attribute: Ethnicity DataType: Char Table: PersonAddress Attribute: Person Unique ID DataType: LongInt Attribute: Address Unique ID DataType: LongInt Attribute: Address Type DataType: Varchar(10) Table: Address Attribute: Unique ID DataType: LongInt Attribute: Address Line 1 DataType: Varchar(30) Attribute: Address Line 2 DataType: Varchar(30) Attribute: Address City DataType: Varchar(30) Attribute: Address State DataType: Varchar(2) Attribute: Address Zip DataType: Varchar(9) Slide 11 or 41
Reporting Lineage analysis Forward (Impact) analysis ODS DW Staging DW BI Design Reporting Reverse lineage analysis ODS DW Staging DW BI Design Reporting Slide 12 or 41
Version and Configuration Management Not one version of the truth Multiple versions of sources/targets Multiple mappings among multiple versions Multiple configuration of these versions Slide 13 or 41
Metadata Management Strategy Summary...1 2 Table of Contents...1 3 Table of Figures...1 4 Background...1 5 Meta-Data Architecture...1 5.1 Meta-Data Repository (MDR) Architecture...1 5.2 Meta-Data Sources and Targets...2 5.2.1 Meta-Data Accessibility...2 5.2.2 Other Databases...4 5.3 Server and Workstation Architecture...4 5.3.1 Repository Server...4 5.3.2 Repository Administration...4 5.3.3 Meta-Data Management...4 5.3.4 Meta-Data Source POC s...4 5.3.5 Business Users...4 6 Meta-data Management Process...5 6.1 General Meta-data Management Process...5 6.1.1 Meta-Data Repository Transaction Process...5 6.1.2 Criteria for Meta-Data Repository Access...6 6.2 Data Standards Management Process...7 6.2.1 DSM Process in Detail...7 6.2.2 Phase 1 Identify Data Management Requirements...9 6.2.3 Phase 2 Data Review Package Development...9 6.2.4 Phase 3 Execute Data Management Plan...9 6.2.5 Phase 4 Evaluate Effectiveness...9 6.2.6 Types of Data Standards Issues...10 6.3 Meta-Data Management Examples...12 6.3.1 Legacy to Oracle ERP ETL...13 6.3.2 ERP to Warehouse ETL...14 6.3.3 Warehouse to BI ETL...14 6.3.4 Impact Analysis...15 6.3.5 Updated Standard Code Set...12 7 Roles and Responsibilities...16 7.1 Meta-data Manager...16 7.2 Meta-data Repository Administrator...16 7.3 Data Steward...16 7.4 Data Proponent...17 7.5 Meta-data Integration POC...17 7.6 Data Standardization CM Roles and Responsibilities...17 7.6.1 Data Standards Board...17 7.6.2 Board Chair...18 7.6.3 Data Standards Manager...18 7.6.4 Support Staff...18 7.6.5 Training...18 Slide 14 or 41
Legacy to ERP Migration Populate Schemas in MDR Customize ERP Model Forward Engineer to Oracle ERP Develop ETL Execute Test ETL Run Evaluate Results Execute Production ETL Run Production Beta Test ERP to Warehouse Populate Schemas in MDR Customize DW Model Forward Engineer to DW Builder Develop ETL Execute Test ETL Run Evaluate Results Execute Production ETL Run Production Beta Test Warehouse to BI Populate Schemas in MDR Customize DW Model Forward Engineer to DW Builder Develop ETL Execute Test ETL Run Evaluate Results Execute Production ETL Run Production Beta Test ERP Flex Fields Change Impact Analysis Get Schema From Repository Reverse Engineer Changed Version Identify Mapped (Impacted) Packages Impacted POC s Update Packages Test Evaluate Results Execute Production Run Production Beta Test Slide 15 or 41
Extract Schema from Oracle ERP into MDR Export schema and transforms from Designer Retrieve Model from MDR Export Source / Target models and Transforms from MDR Extract Schema from Legacy into MDR Import New Product into MDR Edit in ERwin Export New ERP Design to ERwin Import into Informatica Designer Export Source /Target Models from ERwin Post in Model Mart Generate ERP Update DDL Store in Informatica Repository Import New Product into MDR Export complete Model from ERwin Export schema and transforms from Designer Compare Models in MDR Import New Version into MDR Import New Version into MDR Map Models in MDR Migrate Version from Prior ERP Schema Design Migrate Version from Prior ETL Design New ERP Schema Design Completed New ETL Design Completed Populate Repository with Source/ Target Models Customize Source/ Target Models Forward-Engineer to Oracle ERP Develop ETL in Informatica Slide 16 41
Schema or Transformation Change Required Run Web Query to Identify Impacted ERP Products Make Changes in ERwin Export Source / Target models and Transforms from MDR Retrieve Latest Version ERP Schema from MDR Post in Model Mart Run Web Query to Identify Other Tools Impacted Products Navigate to Impacted Schema using MIW Import into Informatica Designer Extract Schema from Oracle ERP into MDR Export complete Model from ERwin Retrieve Latest Impacted Schema from MDR Store in Informatica Repository Compare Models in MDR Import New Version into MDR Export schema and transforms from Designer Map Models in MDR Migrate Version from Prior ERP Schema Design Import New Version into MDR New ERP Schema Design Completed Migrate Version from Prior ETL Design Identify Package in Repository and Reverse Engineer Changed Version Generate New ERP Schema or Extensions Make Changes in ERwin Identify Mapped (Impacted) Packages New ETL Design Completed Impacted POC s Update Packages Slide 17 or 41
Identify New meta-data Import into Meta-Data Repository as first version of new package Extract metadata from source tool Export metadata from Repository into target tool format Editing of metadata attributes in ERwin Manage in Model Mart Integrate metadata Integrated metadata Updated metadata to be committed to physical instantiation Comparison and mapping of meta-data Updated metadata to interact with or impact another metadata source Import into Meta-Data Repository as new version of existing package Updated metadata from Standard metadata package approval Extract metadata from source tool Export metadata from Repository into target tool format Slide 18 or 41
Standardization Mapping to reference models Creation of a canonical form, I.e., form that may be transformed to any of the other reference forms. Standardization process Slide 19 or 41
New XML Message Structure Identify Identify SDO, SDO, Proponent Proponent & Steward & Steward Yes No Endorsed Endorsed Develop Data Develop Data Requirements, Processes Requirements, Processes & Policy & Policy New Version of XML Message Evaluate Proponent and Steward Evaluate Proponent and Steward Develop and Review Develop and Review Message Structure Message Structure and/or Extend SDO and/or Extend SDO Standard Standard Develop and Deploy No Needs Needs Endorsement Endorsement? Yes? No Endorsed Endorsed Yes DSCM Board and/or Chief Architect Data Proponent, Steward, and/or SDO New XML Message Identify Identify Lead PMO Lead PMO Evaluate Lead PMO Evaluate Lead PMO New Version of Existing XML Message Joint Medical Information Systems New New Message Message Change Change Control Control Existing Existing Message Message Change Change Control Control Lead Program Management Organization (PMO) License & License & Registry Registry Change Change Control Control Deployment Deployment Change Change Control Control Need XML Message Structure Propose Propose Use of Use of SDO SDO Message Message Yes No MHS Adopted MHS Adopted SDO SDO Standard? Standard? Check Check for SDO for SDO Standard Standard Yes No Propose Propose New New Message Message in Registry in Registry No In the In the Registry Registry?? Propose Propose Change to Change to SDO XML SDO XML Message Message Yes Modification Modification Needed to SDO Needed to SDO Standard? Standard? Yes No Yes Propose Propose Change to Change to Message Message in Registry in Registry Need Change Need Change to XML Message to XML Message in Registry? in Registry? Yes Need Need New New License License?? No No Program Management Program Management Design Design Using XML Using XML Message Message Structure Structure Program Management Organization(s) Deploy Deploy Using XML Using XML Message Message Structure Structure Developers and Deployment Slide 20 or 41
Identify Data Management Requirements Develop Data Review Package 4f Review Data Package/Plan Coordinate Data Management Plan Measure Effectiveness 2 Prioritize Functional Integration Working Group ( FIWG ) 4e Approve / Reject Data Review Packages 7c Evaluate Data Metrics Reports 1a Board Admin Support 1 Identify Data Issues/Needs 2c Present Preliminary Data Review Packages 2b Coordinate Support Staff Efforts 2a Staff Admin Support Data Standards Configuration Management Board ( Board ) 4d Present Data Review Packages Data Standards Configuration Management Board Chair ( Chair ) 4 5a Assemble Data Coordinate Data Review Management Packages Plan Data Standards Configuration Manager ( Manager ) 3b 4c 5d Staff Admin Staff Admin Staff Admin Support Support Support Data Admin Support Staff ( Admin Staff ) 4b 5c 3 Identify / Adopt Harmonization, Standards Impact Analysis Integration, and Develop Develop Procedures Agreements Data Analysis and Triage Support Staff ( Analysis Staff ) 3a 4a 5b Identify Functional and / Adopt Harmonization Standards Detailed Impact and Develop Analysis Functional Procedures Analysis Ad Hoc Support Staff ( Add Hoc Staff ) 5e Staffing of Policy, Guidance, Agreements, etc. 5e Staffing of Policy, Guidance, Agreements, etc. 5e Staffing of Policy, Guidance, Agreements, etc. 7b Present Data Metrics 6 & 7 Measure Process Effectiveness and Assemble Metrics 7a Staff Admin Support 6a Measure Impact Analysis Effectiveness 6b Measure Overall Effectiveness Slide 21 or 41
Proof of Concept Implementing An Effective Meta-data Management Environment at Worthington Industries, Inc. Prepared by John R. Friedrich In conjunction with Meta Integration Technologies, Inc. (MITI) and Worthington Industries, Inc. as part of a Meta Integration Based Proof of Concept Slide 22 or 41
POC Introduction Worthington Industries, Inc. Worthington Industries is a leading diversified metal processing company with annual sales of approximately $2 billion. Headquartered in Columbus, Ohio, Worthington employs over 8000 people and operates 61 facilities in 10 countries. Slide 23 or 41
Worthington Data Flows Worthington Industries Legacy Systems Informatica ETL Server Cognos ReportNet Oracle 11i Transactional ERP Database (HP UX) ETL Staging Area Oracle RDBMS (HP UX) Data Warehouse Oracle RDBMS (HP UX) Data Mart(s) Oracle Data RDBMS Mart(s) Oracle (HP Data UX) RDBMS Mart(s) Oracle (HP Data UX) RDBMS Mart(s) Oracle (HP UX) RDBMS (HP UX) Slide 24 or 41
Other Development Other Development ToolsOther Development Tools Tools Model Mart Repository Oracle RDBMS (HP UX) Erwin Worthington Industries Legacy Systems Meta-data is Everywhere Informatica Repository Oracle RDBMS (HP UX) Informatica Designer Cognos Framework Manager Cognos Repository Oracle RDBMS (HP UX) Informatica ETL Server Cognos ReportNet Oracle 11i Transactional ERP Database (HP UX) ETL Staging Area Oracle RDBMS (HP UX) Data Warehouse Oracle RDBMS (HP UX) Data Mart(s) Oracle Data RDBMS Mart(s) Oracle (HP Data UX) RDBMS Mart(s) Oracle (HP Data UX) RDBMS Mart(s) Oracle (HP UX) RDBMS (HP UX) Slide 25 or 41
Other Development Other Development ToolsOther Development Tools Tools Model Mart Repository Oracle RDBMS (HP UX) Erwin Worthington Industries Legacy Systems Meta Integration Model Bridge Meta Integration Works Meta Integration Repository Database Oracle (HP UX) Informatica ETL Server Meta-data Integration Informatica Repository Oracle RDBMS (HP UX) Informatica Designer Cognos Framework Manager Cognos Repository Oracle RDBMS (HP UX) Cognos ReportNet Oracle 11i Transactional ERP Database (HP UX) ETL Staging Area Oracle RDBMS (HP UX) Data Warehouse Oracle RDBMS (HP UX) Data Mart(s) Oracle Data RDBMS Mart(s) Oracle (HP Data UX) RDBMS Mart(s) Oracle (HP Data UX) RDBMS Mart(s) Oracle (HP UX) RDBMS (HP UX) Slide 26 or 41
Meta Data Web Interface Slide 27 or 41
Browse Structure Slide 28 or 41
Browse Meta-Data Slide 29 or 41
Impact Analysis Slide 30 or 41
Full MDR Search Slide 31 or 41
Search Slide 32 or 41
Mapping to Other (Impacted) Systems/Meta-Data Slide 33 or 41
Show Standard Vendor Slide 34 or 41
Show standard vendor with SOX certification added Slide 35 or 41
Mapping to other systems (impacted systems) Slide 36 or 41
Legacy to ERP ETL Example ERwin ERwin XML Meta-Data Repository ERwin XML ERwin OBDC OBDC OBDC OBDC Informatica Source Analyzer Informatica Target Analyzer ASN Oracle ERP PowerCenter Slide 37 or 41
Legacy to ERP ETL Example ERwin Meta-Data Repository ERwin ERwin XML ERwin XML OBDC OBDC OBDC OBDC Informatica Source Analyzer Informatica Target Analyzer ASN Oracle ERP PowerCenter Slide 38 or 41
Import From Informatica Slide 39 or 41
View Into MDR Slide 40 or 41
Export into ERwin Slide 41 or 41
Manipulate Model in ERwin Slide 42 or 41
Export back to Informatica as ETL Source Slide 43 or 41
Update Transformations in Informatica Slide 44 or 41
Subset Oracle ERP Model Slide 45 or 41
Model in MDR Slide 46 or 41
Subset ERP in ERwin Slide 47 or 41
Oracle ERP Subset Model as Target of Informatica ETL Slide 48 or 41
Standardization Support Other Meta-Data and Schema XML/XSD Standards Information Editing in ERwin ERwin XML Data Standards Packages in Meta-Data Repository ERwin XML Schema Forward Engineering Using ERwin OBDC OBDC OBDC Oracle DDL Create Select *... DW DDL Create Select *... Informatica Cognos ASN ASN Legacy Systems Oracle ERP Warehouse Slide 49 or 41
Metadata flows Standards Information Editing in ERwin Other Meta-Data and Schema Meta-Data Repository XML/XSD Schema Forward Engineering Using ERwin ERwin XML ERwin XML OBDC OBDC OBDC Oracle DDL Create Select *... DW DDL Create Select *... Informatica Cognos ASN ASN Legacy Systems Oracle ERP Warehouse Slide 50 or 41
Standardization Example: Vendor Meta-Data Package Slide 51 or 41
Edit in ERwin Slide 52 or 41
Migrate Standard Reference Mappings Slide 53 or 41
Export to Other Meta-Data Environments ERwin Cognos Informatica XML, Rose, Java, Legacy, ETC.. Slide 54 or 41
Conclusion