The Data Catalog The Key to Managing Data, Big and Small. April Reeve May

Similar documents
GOVERNING HADOOP (AND THE DATA LAKE)

Randy House Vice President of Health Informatics. Saint Luke s Health System. Lynsey McNeal Director Data of Governance. Saint Luke s Health System

April 17, Ronald Layne Manager, Data Quality and Data Governance

What s a BA to do with Data? Discover and define standard data elements in business terms

Data Stewardship Core by Maria C Villar and Dave Wells

DATA STEWARDSHIP BODY OF KNOWLEDGE (DSBOK)

DATA GOVERNANCE LEADS TO DATA QUALITY

Match data set availability to data resource requirements, including gap analysis and remediation assistance.

Data Governance Industrial Internet & Big Data

TDWI Data Governance Fundamentals: Managing Data as an Asset

Importance of the Data Management process in setting up the GDPR within a company CREOBIS

AVOIDING SILOED DATA AND SILOED DATA MANAGEMENT

Solving the Enterprise Data Dilemma

How to Become a DATA GOVERNANCE EXPERT

Luncheon Webinar Series April 25th, Governance for ETL Presented by Beate Porst Sponsored By:

Data Governance. A short introduction. Gábor Gollnhofer DMS Consulting

Good analytics needs good data and that needs good metadata

Metadata Management as a Key Component to Data Governance, Data Stewardship, and Data Quality Management. Wednesday, July 20 th 2016

Risk: Security s New Compliance. Torsten George VP Worldwide Marketing and Products, Agiliance Professional Strategies - S23

CA ERwin Data Modeler r9 Rick Alaras N.A. Channel Account Manager

International. CDMP Overview. Presented by Patricia Cupoli, CCP, CDMP, CBIP

Data Management Glossary

COURSE BROCHURE CISA TRAINING

Data Governance Central to Data Management Success

Jelena Roljevic Assistant Vice President, Business Intelligence Ronald Layne Data Governance and Data Quality Manager

Data Governance Toolkit

Metadata and the Rise of Big Data Governance: Active Open Source Initiatives. October 23, 2018

IT Audit Process. Prof. Mike Romeu. January 30, IT Audit Process. Prof. Mike Romeu

Building Actionable Data Governance through Data Models & Metadata Donna Burbank Global Data Strategy Ltd.

ODPi and Data Governance Free Your MetaData! October 10, 2018

Certified Information Systems Auditor Training and Certification

Data Governance: Are Governance Models Keeping Up?

CISA Training.

IBM Industry Model support for a data lake architecture

IT General Controls and Why We Need Them -Dennis McLaughlin, CISA (Cyber AIT) Dennis McLaughlin - Cyber AIT 1

A Global Look at IT Audit Best Practices

Making the Impossible Possible

Implementing a Successful Data Governance Program

Microsoft certified solutions associate

ITIL V3 Service Lifecycle - Continual Service Improvement (CSI)

Fair data and open data: differences and consequences

Enabling Data Governance Leveraging Critical Data Elements

Best Practices in Enterprise Data Governance

Improving Your Business with Oracle Data Integration See How Oracle Enterprise Metadata Management Can Help You

From Conceptual to Physical Adjustments to Enterprise Models for the Real World. Myriad Solutions, Inc. erwin Premier Partner since 2000

Data Modeling Whitepaper DATA MODELING IS A FORM OF DATA GOVERNANCE BY ROBERT S. SEINER

Solutions that Work The Comprehensive Data Dictionary. Antonio Amorin

DataND Finance. A Journey into Enterprise Data Warehouse

Data Virtualization Implementation Methodology and Best Practices

The Data Governance Journey at Principal

Getting personal with your customers and GDPR

Advance Your Career. Be recognized as an industry leader. Get ahead of the competition. Validate your expertise with CBIP.

Analytics Fundamentals by Mark Peco

MAPR DATA GOVERNANCE WITHOUT COMPROMISE

Deploying, Managing and Reusing R Models in an Enterprise Environment

Virtuoso Infotech Pvt. Ltd.

Oracle Identity Governance 11g R2: Develop Identity Provisioning

Microsoft Developer Day

ITIL - Lifecycle Service Transition Course

PERSPECTIVE. Effective Data Governance. Abstract

Guidance Solvency II data quality management by insurers

Data Management Framework

KENYA SCHOOL OF GOVERNMENT EMPLOYMENT OPORTUNITY (EXTERNAL ADVERTISEMENT)

Data governance and data quality: is it on your agenda or lurking in the shadows?

INFORMATION TECHNOLOGY AUDIT &

CISA Course. Course Details: iathena.com, a Navitus Education Venture

EXAM PREPARATION GUIDE

Data Governance in Mass upload processes Case KONE. Finnish Winshuttle User Group , Helsinki

Informatica Enterprise Information Catalog

The #1 Key to Removing the Chaos. in Modern Analytical Environments

Table of Contents. Preface xvii PART ONE: FOUNDATIONS OF MODERN INTERNAL AUDITING

Application Discovery and Enterprise Metadata Repository solution Questions PRIEVIEW COPY ONLY 1-1

Achieving effective risk management and continuous compliance with Deloitte and SAP

Data Governance: Data Usage Labeling and Enforcement in Adobe Cloud Platform

Realizing the Full Potential of MDM 1

We would like to announce to you a number of upcoming changes to the Certified Internal Auditor Exam:

2 The IBM Data Governance Unified Process

Effective COBIT Learning Solutions Information package Corporate customers

IS THE DATA CATALOG A METADATA MANAGEMENT RELOADED?

Business Continuity Planning

ISO9001:2015 LEAD IMPLEMENTER & LEAD AUDITOR

A Road Map for Advancing Your Career. Distinguish yourself professionally. Get an edge over the competition. Advance your career with CBIP.

Course # 55011A. The ITIL Foundation Certificate in IT Service Management

Fast Innovation requires Fast IT

Certified in Risk and Information Systems ControlTM Certification Training - Brochure

Certified in the Governance of Enterprise IT Training - Brochure

comprehensive guide toı ACCOUNTINGı CERTIFICATIONSı cpa, cfe, cia, cisa & more ı The Bean Counter, LLC All Rights Reservedı

Professional Qualifications for ITIL PRACTICES FOR SERVICE MANAGEMENT. The ITIL Foundation Certificate in IT Service Management SYLLABUS

BEST BIG DATA CERTIFICATIONS

Certified information Systems Security Professional(CISSP) Bootcamp

LEADING WITH GRC. Approaching Integrated GRC. Knute Ohman, VP, GRC Program Manager. GRC Summit 2017 All Rights Reserved

Data Governance Data Usage Labeling and Enforcement in Adobe Experience Platform

Designing a Modern Data Warehouse + Data Lake

for TOGAF Practitioners Hands-on training to deliver an Architecture Project using the TOGAF Architecture Development Method

Best Practices & Lesson Learned from 100+ ITGRC Implementations

Data Clairvoyance. A business approach to data. Real data practitioners, delivering real improvements to your enterprise data assets.

Les joies et les peines de la transformation numérique

The Rules of Subsurface Analytics Jane McConnell, Practice Partner Oil and Gas, Teradata DEJ KL, 4 October 2017

ACCA Diploma in. Starting this January! International Financial Reporting (DipIFR)

MMARS Financial, Labor Cost Management (LCM) and Commonwealth Information Warehouse (CIW) Reports

Transcription:

The Data Catalog The Key to Managing Data, Big and Small April Reeve May 18 2017

April Reeve Thirty years doing data oriented stuff Data Management disciplines Data Integration, Data Governance, Data Modeling, Data Quality, Business Intelligence, Master Data Management, Data Conversion, Data Warehousing, Enterprise Content Management, Big Data Management Currently Director of Enterprise Information Strategy and Architecture at Celgene Corporation in Summit, NJ Certifications Certified Data Management Professional (DAMA CDMP) Certified Data Governance and Stewardship Professional (ICCP DGSP) Certified Business Intelligence Professional (TDWI CBIP) Certified in Enterprise Governance of IT (ISACA CEGIT) Certified Information Systems Auditor (ISACA CISA) Masters degree in Financial Management (predictive modeling, risk management, derivatives, corporate finance) from Erasmus University, Rotterdam Book Managing Data in Motion Data Integration Best Practice Techniques and Technologies New chapters - Data Management Body of Knowledge (DMBoK) release 2 Data Integration and Big Data

Agenda Why is the data catalog now a critical data management component? Big Data Enterprise Data Creating the data inventory and the data catalog 3

Types of Metadata Business metadata - provides the meaning of the data. It helps define terms in every-day language, without regard to technical implementation. Examples: business rules, stewardship, business definitions, auditing terminology, glossaries, algorithms, and lineage using business language Audience: business users Technical metadata provides information on the format and structure of the data as needed by computer systems. Examples: definition of source and target systems, their table and fields structures and attributes, documentation for auditing derivations and dependencies Audience: specific tool users (BI, ETL, profiling, modeling) Operational metadata - refers to the metadata generated and captured when a process executes. It allows administrators to manage the system and ensure things are running smoothly. Includes audit trail information what was updated or accessed when and by whom. Examples: information about application runs, including their frequency, record counts, component-by-component analysis, and other statistics for auditing purposes Audience: operations, management and business users

Why is the Data Catalog an important component? Why now? WHY IS THE DATA CATALOG AN IMPORTANT COMPONENT? WHY NOW?

Big Data BIG DATA

Big Data Technology Limitations All of the analyst organizations, including Gartner Group and Forester, have predicted that the lack of built in metadata repositories in Big Data technology (such as Hadoop) will be one of the biggest risks of project success for Big Data projects. Few Big Data solution architects have the expertise to understand what functional capabilities they are missing: data access security solutions (usually by role and asset), audit trails of data update and access (operational metadata), inventory of data assets (technical and business metadata). Through 2018, 80% of data lakes will not include effective metadata management capabilities, making them inefficient - Gartner Group, September 2016

Data Lake Architecture

Data Governance of Big Data Big Data Governance Traditional Data Governance Data Issue Tracking Business Metadata / Business Glossary Technical Metadata Data Quality Improvement Metrics Tracking & Reporting Data Profiling & Monitoring All those plus: Where capture metadata? Business meaning Technical structure and format Audit trail Volume - Can metadata capture be automated? Variety / velocity - Automatically consolidate metadata from various repositories / data types? Variety - How profile and monitor data? Do data stewards need new technical skills?

Enterprise Data ENTERPRISE DATA

Providing Data as a Service IT organizations need major improvements in their data management processes especially in the basic process of providing data to users (people and systems) In order to manage something, the first step is to inventory it. How many organizations have an inventory of their data? There needs to be processes for data consumers to identify available data, request access, receive appropriate review and approval, and have data provisioned, without taking huge amounts of time and IT effort

Data Catalogs & Data Services Data Catalogs menu of available data Data Lake / Big Data Implementations must maintain an Inventory briefly, what is each file & who is responsible (business metadata) and, if possible, what is the technical layout A Data Catalog is a menu of what data is available from which a User selects and, if access approved, data is provisioned Data Services business functions to be improved Request access to data Request additional data added Request a report, analysis (root cause, data quality assessment, exploratory), a predictive model Request operational system changes based on a prediction

Creating the data inventory and the data catalog CREATING THE DATA INVENTORY AND THE DATA CATALOG

Integrating metadata into a central data inventory Every technology development tool has a metadata repository (except Hadoop?) Integrating the metadata data together should be as automated as possible Sourcing business metadata and linking to technical metadata may be a challenge Data models Data dictionaries Business glossaries Technical metadata must be updated on a periodic, automatic basis but business metadata may not be updated as frequently This is as challenging as any data integration project

Data Stewardship Who is the responsible business person(s) for each set of data? Subject Matter Expert to answer questions Business owner to approve access Who is the responsible technical person(s) for each set of data? Database administrator Application manager Access manager to provide access when receive business approval Automatic monitoring for missing information and sending alerts / requests to appropriate Data Stewards Data Steward front end

Using the data catalog to request access The integrated metadata is an inventory but not a catalog Need a business front end to: Provide a list of the data (and data services) Provide a way to request access to data Kicks off an automated work flow to request access from the data owner Kicks off an automated work flow to grant access by the appropriate technical person Provide access to the data in place? Provide a copy of the dataset? Schedule periodic copy/update to a data sandbox?

April Reeve areeve@celgene.com @Datagrrl on Twitter Book - Managing Data in Motion Data Integration Best Practice Techniques and Technologies