Data Quality Framework

1 #THETA2017 Data Quality Framework
Mozhgan Memari, Bruce Cassidy
The University of Auckland
This work is licensed under a Creative Commons Attribution 4.0 International License

2 Two Figures from 2016
- $3.1 trillion per year: the cost of poor quality data, in the US alone, in 2016
- $136 billion per year: the size of the big data market, worldwide, in 2016
Source: Bad Data Costs the U.S. $3 Trillion Per Year, Thomas C. Redman, SEPTEMBER 22,

3 Data, Information, Value
- Organizations collect more data these days
- Data-driven decision making to gain competitive advantage
- Delivery of crucial information at the right time
- Transforming data into value

4 Data Quality
- Are data trustworthy?
- Are data fit for purpose?
Data quality is an assessment of information's fitness to serve its purpose in a given context.

5 Agenda
- Data quality
- Framework: definition and objectives
- Methodology
- Quality Dimensions and Measures
- Technical design and Implementation in the data warehouse
- Wrap up

6 The Data Quality Framework
An assessment and measurement tool, integrated into organisational process, providing a benchmark for the effectiveness of any future data quality improvement initiatives and a standardized template for information on data quality both for internal and external users.
Source: The development of a data quality framework and strategy for the New Zealand Ministry of Health, Dr. Karolyn Kerr

7 The Data Quality Framework
Objectives:
- Specific guides for systems and integration efforts to supply useful and trustworthy data
- Measuring the level of confidence in data continuously
- Continuous data quality issue diagnosis
- Data quality governance
- A strategy for:
  - defining quality objectives
  - measuring the quality
  - enacting improvements

8 The Data Quality Framework
Conceptual model:
- Governance of the data quality in the organization
- Roles and responsibilities
- Service delivery
Technical model:
- Defines different metrics to calculate
- Implementation

9 Methodology
Existing Methodologies [1]:
- Total Data Quality Management (TDQM) [2]
- Total Information Quality Management (TIQM) [3]
- Data Warehouse Quality (DWQ) [4]
- Data Quality Assessment (DQA) [5]
The University of Auckland Data Quality Framework Methodology:
- combines the DQA and DWQ methodologies
- focuses on the process of assessment, maintenance and improvement of the quality of the data stored in the data warehouse
[1] Batini et al. [2] Wang et al. [3] English [4] Jeusfeld et al. [5] Pipino et al.

10 Methodology

11 Methodology
The University of Auckland Data Quality Framework Methodology:
- combines the DQA and DWQ methodologies
- focuses on the process of assessment, maintenance and improvement of the quality of the data stored in the data warehouse
Objective assessment: Different techniques are applied to determine the data quality values and levels of each critical object; the results are then compared with the acceptance criteria and goals.
Subjective assessment: The perceptions, information needs and data quality requirements of the stakeholders and business users are defined and measured with subjective metrics.

12 Quality Dimensions and Measures
Data quality dimensions are defined to translate data quality into practical and measurable concepts.
The University of Auckland Data Governance Policy proposes:
- Completeness
- Validity
- Consistency
- Accuracy
And:
- Interpretability
- Accessibility
- Timeliness
- Reasonableness

13 Quality Dimensions and Measures
- Completeness dimension: Data completeness indicates whether all the data necessary to meet the current and future business information demand are available in the data resource.
- Validity dimension: Validity ensures the correctness and reasonableness of data.
- Consistency dimension: Consistency refers to the validation of semantic rules defined over a set of data items.
- Accuracy dimension: Data accuracy defines whether the data values stored in an object are the correct values.

14 Quality Dimensions and Measures
- Accessibility dimension: Accessibility refers to the physical conditions in which users can obtain and access data when required.
- Interpretability dimension: Interpretability is the extent to which the data are semantically meaningful for the specific analytical purposes.
- Reasonableness dimension: Reasonableness refers to agreed values for the data items that help in understanding the aspect of the business activity being reported.
- Timeliness dimension: Timeliness of information reflects the length of time between the availability of data and the event or phenomenon they describe.
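The timeliness definition can be made concrete as a simple time difference. The sketch below is illustrative only: the function names, the example timestamps and the `max_lag` threshold are assumptions, not part of the framework described in the slides.

```python
from datetime import datetime, timedelta


def timeliness_lag(event_time: datetime, available_time: datetime) -> timedelta:
    """Return how long after the event the data became available."""
    if available_time < event_time:
        raise ValueError("data cannot be available before the event it describes")
    return available_time - event_time


def is_timely(event_time: datetime, available_time: datetime, max_lag: timedelta) -> bool:
    """True when the lag does not exceed an agreed maximum (hypothetical threshold)."""
    return timeliness_lag(event_time, available_time) <= max_lag


# Hypothetical example: an event at 09:00 on 1 May, loaded at 02:00 the next day.
event = datetime(2017, 5, 1, 9, 0)
loaded = datetime(2017, 5, 2, 2, 0)
lag = timeliness_lag(event, loaded)
```

A business owner would supply the acceptable lag; here `is_timely(event, loaded, timedelta(hours=24))` simply checks the computed lag against that value.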

15 Quality Dimensions and Measures

16 Completeness
Completeness is defined as "the degree to which data values are present in the attributes that require them."
Dimension: Completeness
Definition: The proportion of stored data against the potential of "100% complete"
Reference: Domain integrity and business rules which define what "100% complete" represents
Measure: 7 metrics
Scope: 0-100% of critical data to be measured in any data item, record, data set or database in the UoA data warehouse
Unit of Measure: Percentage, Ratio
Purpose of Measure: Assessment, repair
Type of Measure: Assessment, Continuous and Discrete
Related data quality dimension: Validity and Accuracy
Optionality: If a data item is mandatory, 100% completeness will be achieved; however, validity and accuracy checks would need to be performed to determine if the data item has been completed correctly
Completeness metrics:
- Coherence completeness 1
- Coherence completeness 2
- Tuple-based completeness (compares source to target)
- Column-Null-based completeness
- Tuple-Null-based completeness (each row at entity)
- Metadata level Null-based completeness
- Schema-based completeness
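As a rough illustration of a null-based completeness measure expressed as a percentage, the sketch below computes completeness per column and then averages over columns. It is a minimal sketch under stated assumptions: treating blank strings as missing, scoring an empty table as 100%, and the sample `employees` rows are all choices made for this example, not taken from the slides.

```python
from typing import Mapping, Optional, Sequence

Row = Mapping[str, Optional[object]]


def is_present(value: Optional[object]) -> bool:
    """Treat None and blank strings as missing (an assumption of this sketch)."""
    if value is None:
        return False
    if isinstance(value, str) and value.strip() == "":
        return False
    return True


def column_completeness(rows: Sequence[Row], column: str) -> float:
    """Percentage of rows whose `column` holds a non-missing value."""
    if not rows:
        return 100.0  # assumption: an empty table has nothing missing
    present = sum(1 for row in rows if is_present(row.get(column)))
    return 100.0 * present / len(rows)


def table_completeness(rows: Sequence[Row], columns: Sequence[str]) -> float:
    """Average of the per-column completeness values."""
    if not columns:
        raise ValueError("at least one column is required")
    return sum(column_completeness(rows, c) for c in columns) / len(columns)


# Hypothetical sample rows; the column names are illustrative only.
employees = [
    {"emplid": "1001", "first_name": "Ana", "email": "ana@example.org"},
    {"emplid": "1002", "first_name": "Ben", "email": None},
    {"emplid": "1003", "first_name": "", "email": "cy@example.org"},
    {"emplid": "1004", "first_name": "Dee", "email": "dee@example.org"},
]
```

Calling `column_completeness(employees, "email")` gives the column-level value, and `table_completeness(employees, ["emplid", "first_name", "email"])` gives a table-level average.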

17 Quality Dimensions and Measures
1. Discuss the business problem
2. Define the goals, priorities and the acceptance limits
   - Goal
   - Accepted limit
   - Priorities
3. Metrics for the measurements
   - Metric
   - What to test
   - Output (table level, column level, ...)

18 Quality Dimensions and Measures
Completeness Pilot study:
1. The business problem: DQ for Dim Employee
2. Define the goals, priorities and the acceptance limits
   - Goal: Total completeness 100%
   - Accepted limit: 95% completeness
   - Priorities: Employee info
3. Metrics for the measurement (Metric, What to test, Output at table or column level)
   - Coherence completeness 1
   - Coherence completeness 2
   - Tuple-based completeness (compares source to target)
   - Column-Null-based completeness
   - Tuple-Null-based completeness (each row at entity)
   - Metadata level Null-based completeness
   - Schema-based completeness
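Comparing a measured value against the goal and the accepted limit can be sketched as below. The 100% goal and 95% accepted limit come from the pilot study; the `QualityTarget` class, the `assess` function and the three status labels are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityTarget:
    """Goal and accepted limit for one quality measure, both in percent."""
    name: str
    goal: float
    accepted_limit: float


def assess(measured: float, target: QualityTarget) -> str:
    """Classify a measured percentage against the goal and accepted limit."""
    if measured >= target.goal:
        return "goal met"
    if measured >= target.accepted_limit:
        return "within accepted limit"
    return "below accepted limit"


# Values from the pilot study: goal 100%, accepted limit 95%.
employee_completeness = QualityTarget("Total completeness", goal=100.0, accepted_limit=95.0)
```

For example, `assess(97.5, employee_completeness)` falls between the two thresholds, so it is reported as within the accepted limit rather than as a failure.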

19 Objective Assessment: Completeness
Measured in the ETL process, or on a specific timeline based on the application.
- Coherence completeness 1
  What to test: Count and compare the number of unique records at source and target (unique values from PS_names to DIM_EMPLOYEE)
  Output: Table level: one value for each target table
- Coherence completeness 2
  What to test: Compare the not-null values populated from source fields into the target table over the related fields (for x-1 to x-5 in DIM_EMPLOYEE)
  Output: Column level: one value for each field in the target table. Table level: average of the fields completeness (CC3_Employee)
- Tuple-based completeness
  What to test: Count the required tuples from PS_names to DIM_EMPLOYEE
  Output: Table level
- Column-Null-based completeness
  What to test: Calculate the completeness over each field, then over the entity
  Output: Column level: one value for each field in the target table (completeness % over the fields (x_i) in target table DIM_EMPLOYEE). Table level: average of the fields completeness

20 Completeness Pilot Study
- Coherence completeness 1
  What to test: PS_names to DIM_EMPLOYEE
  Output: dim_employee. Results: 100%
Coherence completeness 1 compares unique values over the unique identifiers (considering the business rules).
[Diagram: PS_names (Actual Key) to DIM_EMPLOYEE (Business key) through the ETL Process, checked by Coherence completeness 1]
[Diagram: completeness metric tree (Coherence completeness 1 and 2, Tuple-based, Column-Null-based, Tuple-Null-based, Metadata level Null-based, Schema-based)]
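One possible reading of Coherence completeness 1 is the share of unique source keys that also appear in the target. The sketch below follows that reading; the function name, the use of a set intersection rather than a plain count comparison, and the sample key lists are assumptions for illustration, not the authors' implementation.

```python
from typing import Hashable, Iterable


def coherence_completeness_1(
    source_keys: Iterable[Hashable], target_keys: Iterable[Hashable]
) -> float:
    """Percentage of unique source keys that are also present in the target."""
    source_unique = set(source_keys)
    target_unique = set(target_keys)
    if not source_unique:
        return 100.0  # assumption: nothing to load means nothing is missing
    loaded = source_unique & target_unique
    return 100.0 * len(loaded) / len(source_unique)


# Hypothetical key values standing in for PS_names and DIM_EMPLOYEE.
ps_names_keys = ["1001", "1002", "1003", "1003", "1004"]
dim_employee_keys = ["1001", "1002", "1003", "1004"]
score = coherence_completeness_1(ps_names_keys, dim_employee_keys)
```

Duplicates in the source are collapsed before comparison, so a repeated source key does not lower the score as long as it reached the target once.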

21 Completeness Pilot Study
- Coherence completeness 1
  What to test: PS_names to DIM_EMPLOYEE
  Output: dim_employee. Results: 100%
- Coherence completeness 2
  What to test: For x-1 to x-5 in DIM_EMPLOYEE
  Output: Column level: Ave(CC2_employee_x_i). Results: 100%
Coherence completeness 2 compares the not-null values populated from source fields into the target table over the related fields.
[Diagram: PS_names fields y-1 to y-5 mapped to DIM_EMPLOYEE fields x-1 to x-5 through the ETL Process, giving CC2_employee_x_1, CC2_employee_x_2, CC2_employee_x_3 and Ave(CC2_employee_x_i)]
[Diagram: completeness metric tree (Coherence completeness 1 and 2, Tuple-based, Column-Null-based, Tuple-Null-based, Metadata level Null-based, Schema-based)]
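A field-by-field comparison of not-null counts between source and target, followed by an average, might look like the sketch below. The field names `y_1`, `x_1` and so on mirror the slide's labels, but the function names, the cap at 100% and the sample rows are assumptions made for this example.

```python
from typing import Dict, Mapping, Optional, Sequence

Row = Mapping[str, Optional[object]]


def not_null_count(rows: Sequence[Row], field: str) -> int:
    """Number of rows in which `field` is not None."""
    return sum(1 for row in rows if row.get(field) is not None)


def coherence_completeness_2(
    source_rows: Sequence[Row],
    target_rows: Sequence[Row],
    field_map: Mapping[str, str],
) -> Dict[str, float]:
    """Per target field: not-null values loaded versus not-null values at source."""
    scores: Dict[str, float] = {}
    for source_field, target_field in field_map.items():
        expected = not_null_count(source_rows, source_field)
        loaded = not_null_count(target_rows, target_field)
        # Assumption: an all-null source field counts as fully loaded; cap at 100%.
        scores[target_field] = (
            100.0 if expected == 0 else min(100.0, 100.0 * loaded / expected)
        )
    return scores


def average(scores: Mapping[str, float]) -> float:
    """Column-level scores averaged into one value."""
    return sum(scores.values()) / len(scores)


# Hypothetical rows: one x_2 value was lost on the way to the target.
source = [{"y_1": "a", "y_2": "p"}, {"y_1": "b", "y_2": "q"},
          {"y_1": "c", "y_2": "r"}, {"y_1": "d", "y_2": "s"}]
target = [{"x_1": "a", "x_2": "p"}, {"x_1": "b", "x_2": "q"},
          {"x_1": "c", "x_2": None}, {"x_1": "d", "x_2": "s"}]
cc2 = coherence_completeness_2(source, target, {"y_1": "x_1", "y_2": "x_2"})
```

The averaged value, `average(cc2)`, plays the role of the single figure reported on the slide as Ave(CC2_employee_x_i).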

22 Completeness Pilot Study
- Coherence completeness 1
  What to test: PS_names to DIM_EMPLOYEE
  Output: dim_employee. Results: 100%
- Coherence completeness 2
  What to test: For x-1 to x-5 in DIM_EMPLOYEE
  Output: Column level: Ave(CC2_employee_x_i). Results: 100%
- Tuple-based completeness
  What to test: Required tuples from PS_names to DIM_EMPLOYEE
  Output: dim_employee. Results: 100%
Tuple-based completeness compares the required tuples at source with the tuples that have been populated.
[Diagram: PS_names to DIM_EMPLOYEE through the ETL Process, checked by Tuple-based completeness]
[Diagram: completeness metric tree (Coherence completeness 1 and 2, Tuple-based, Column-Null-based, Tuple-Null-based, Metadata level Null-based, Schema-based)]
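Tuple-based completeness can be sketched as the share of required source tuples that were populated in the target. Which tuples count as "required" is a business rule that the slides do not spell out, so the `is_required` predicate, the `status` field and the sample data below are hypothetical.

```python
from typing import Callable, Hashable, Iterable, Mapping, Sequence

SourceRow = Mapping[str, object]


def tuple_based_completeness(
    source_rows: Sequence[SourceRow],
    target_keys: Iterable[Hashable],
    key_field: str,
    is_required: Callable[[SourceRow], bool],
) -> float:
    """Percentage of required source tuples whose key appears in the target."""
    required_keys = {row[key_field] for row in source_rows if is_required(row)}
    if not required_keys:
        return 100.0  # assumption: no required tuples means nothing is missing
    populated = required_keys & set(target_keys)
    return 100.0 * len(populated) / len(required_keys)


# Hypothetical business rule: only rows with status "A" must reach the target.
source_rows = [
    {"emplid": "1001", "status": "A"},
    {"emplid": "1002", "status": "A"},
    {"emplid": "1003", "status": "I"},
    {"emplid": "1004", "status": "A"},
]


def active_only(row: SourceRow) -> bool:
    return row["status"] == "A"
```

With this rule, `tuple_based_completeness(source_rows, ["1001", "1002", "1004"], "emplid", active_only)` ignores the inactive row entirely, so its absence from the target does not reduce the score.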

23 Consistency Pilot Study
- Uniqueness checking for primary keys in source tables
  For STAGE_HR9.PS_PERSONAL_DATA: Unqi_AC = 100%
  Applied in the ETL Process for each entity and for each set of columns X that should be unique as per the business requirements
- Uniqueness checking for business keys in target tables
  For DSS.Dim_person: Unqi_BR = 99.93%
  Applied in the ETL Process for each entity and for each set of columns X as the business key
[Diagram: STAGE_HR9.PS_PERSONAL_DATA (emplid) and DSS.Dim_person (Person_key, emplid)]
[Diagram labels: Consistency, Meta data, Duplication, Uniqueness checking, Data profiling, Dependency checking, Referential integrity checking test, Domain integrity, Business rules]
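The slides do not give the formula behind Unqi_AC and Unqi_BR. One common way to express uniqueness as a percentage is distinct key combinations divided by total rows, which is what the following sketch assumes; the function name and sample rows are illustrative.

```python
from typing import Mapping, Sequence


def uniqueness(rows: Sequence[Mapping[str, object]], key_columns: Sequence[str]) -> float:
    """Percentage of rows that carry a distinct value combination for the key columns."""
    if not rows:
        return 100.0  # assumption: an empty table contains no duplicates
    distinct = {tuple(row[column] for column in key_columns) for row in rows}
    return 100.0 * len(distinct) / len(rows)


# Hypothetical rows: emplid "1002" appears twice.
persons = [
    {"emplid": "1001", "name": "Ana"},
    {"emplid": "1002", "name": "Ben"},
    {"emplid": "1002", "name": "Ben"},
    {"emplid": "1003", "name": "Cy"},
]
```

Passing several column names, for example `uniqueness(persons, ["emplid", "name"])`, checks a composite key in the same way.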

24 Consistency Pilot Study
- Dependency checking
  For each defined dependency X -> Y in the target entity
  position_nbr -> position_descr in DSS.Dim_person: integrity_1 = 100%
  Applied in the ETL Process for each defined dependency X -> Y in entity r
[Diagram: DSS.DIM_POSITION (position_descr, position_nbr)]
[Diagram labels: Consistency, Meta data, Duplication, Uniqueness checking, Data profiling, Dependency checking, Referential integrity checking test, Domain integrity, Business rules]
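A dependency X -> Y holds when every value of X is paired with exactly one value of Y. The sketch below scores a table as the percentage of X values that satisfy this; that scoring rule, the function name and the sample position rows are assumptions for illustration, since the slides report only the resulting percentage.

```python
from collections import defaultdict
from typing import DefaultDict, Mapping, Sequence, Set


def dependency_integrity(
    rows: Sequence[Mapping[str, object]], determinant: str, dependent: str
) -> float:
    """Percentage of determinant values that map to exactly one dependent value."""
    seen: DefaultDict[object, Set[object]] = defaultdict(set)
    for row in rows:
        seen[row[determinant]].add(row[dependent])
    if not seen:
        return 100.0  # assumption: no rows means no violations
    consistent = sum(1 for values in seen.values() if len(values) == 1)
    return 100.0 * consistent / len(seen)


# Hypothetical rows: position P2 carries two different descriptions.
positions = [
    {"position_nbr": "P1", "position_descr": "Lecturer"},
    {"position_nbr": "P1", "position_descr": "Lecturer"},
    {"position_nbr": "P2", "position_descr": "Analyst"},
    {"position_nbr": "P2", "position_descr": "Senior Analyst"},
]
```

Here `dependency_integrity(positions, "position_nbr", "position_descr")` flags P2 as inconsistent while P1 passes.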

25 Repository and Reports
- Ranges
- Color code

26 Repository and Reports

27 Summary
- Fitness and trustworthiness of data have become very important
- The data quality framework:
  - defines a model of the organization's data environment
  - identifies relevant data quality dimensions and measures
  - provides guidance for data quality improvement
- Objective and subjective assessment of data quality

28 Discussion

Data Quality Assessment Framework

Data Quality Assessment Framework Data Quality Assessment Framework ABSTRACT Many efforts to measure data quality focus on abstract concepts and cannot find a practical way to apply them. Or they attach to specific issues and cannot imagine

More information

Data Quality Assessment Tool for health and social care. October 2018

Data Quality Assessment Tool for health and social care. October 2018 Data Quality Assessment Tool for health and social care October 2018 Introduction This interactive data quality assessment tool has been developed to meet the needs of a broad range of health and social

More information

DIRA : A FRAMEWORK OF DATA INTEGRATION USING DATA QUALITY

DIRA : A FRAMEWORK OF DATA INTEGRATION USING DATA QUALITY DIRA : A FRAMEWORK OF DATA INTEGRATION USING DATA QUALITY Reham I. Abdel Monem 1, Ali H. El-Bastawissy 2 and Mohamed M. Elwakil 3 1 Information Systems Department, Faculty of computers and information,

More information

DATA QUALITY STRATEGY. Martin Rennhackkamp

DATA QUALITY STRATEGY. Martin Rennhackkamp DATA QUALITY STRATEGY Martin Rennhackkamp AGENDA Data quality Data profiling Data cleansing Measuring data quality Data quality strategy Why data quality strategy? Implementing the strategy DATA QUALITY

More information

C. Batini & M. Scannapieco Data and Information Quality Book Figures. Chapter 12: Methodologies for Information Quality Assessment and Improvement

C. Batini & M. Scannapieco Data and Information Quality Book Figures. Chapter 12: Methodologies for Information Quality Assessment and Improvement C. Batini & M. Scannapieco Data and Information Quality Book Figures Chapter 12: Methodologies for Information Quality Assessment and Improvement 1 Terminologies adopted in chapter sections Section Topic

More information

Applying Semantic Integration to improve Data Quality

Applying Semantic Integration to improve Data Quality UTRECHT UNIVERSITY Applying Semantic Integration to improve Data Quality by O.F. Brouwer A thesis submitted in partial fulfillment for the degree of Master of Science in the Faculty of Science Department

More information

Assessing data quality in records management systems as implemented in Noark 5

Assessing data quality in records management systems as implemented in Noark 5 1 Assessing data quality in records management systems as implemented in Noark 5 Dimitar Ouzounov School of Computing Dublin City University Dublin, Ireland Email: dimitar.ouzounov2@computing.dcu.ie Abstract

More information

Data Management Glossary

Data Management Glossary Data Management Glossary A Access path: The route through a system by which data is found, accessed and retrieved Agile methodology: An approach to software development which takes incremental, iterative

More information

Methodologies for Data Quality Assessment and Improvement

Methodologies for Data Quality Assessment and Improvement Methodologies for Data Quality Assessment and Improvement CARLO BATINI Università di Milano - Bicocca CINZIA CAPPIELLO Politecnico di Milano CHIARA FRANCALANCI Politecnico di Milano 16 and ANDREA MAURINO

More information

A Step towards Centralized Data Warehousing Process: A Quality Aware Data Warehouse Architecture

A Step towards Centralized Data Warehousing Process: A Quality Aware Data Warehouse Architecture A Step towards Centralized Data Warehousing Process: A Quality Aware Data Warehouse Architecture Maqbool-uddin-Shaikh Comsats Institute of Information Technology Islamabad maqboolshaikh@comsats.edu.pk

More information

SAFe Reports Last Update: Thursday, July 23, 2015

SAFe Reports Last Update: Thursday, July 23, 2015 SAFe Reports Last Update: Thursday, July 23, 2015 This document describes the set of reports provided by Jazz Reporting Service (JRS) aligned with SAFe (Scaled Agile Framework) metrics. Some of these reports

More information

Health Information Exchange Content Model Architecture Building Block HISO

Health Information Exchange Content Model Architecture Building Block HISO Health Information Exchange Content Model Architecture Building Block HISO 10040.2 To be used in conjunction with HISO 10040.0 Health Information Exchange Overview and Glossary HISO 10040.1 Health Information

More information

BI/DWH Test specifics

BI/DWH Test specifics BI/DWH Test specifics Jaroslav.Strharsky@s-itsolutions.at 26/05/2016 Page me => TestMoto: inadequate test scope definition? no problem problem cold be only bad test strategy more than 16 years in IT more

More information

Data Governance Central to Data Management Success

Data Governance Central to Data Management Success Data Governance Central to Data Success International Anne Marie Smith, Ph.D. DAMA International DMBOK Editorial Review Board Primary Contributor EWSolutions, Inc Principal Consultant and Director of Education

More information

IBM Software IBM InfoSphere Information Server for Data Quality

IBM Software IBM InfoSphere Information Server for Data Quality IBM InfoSphere Information Server for Data Quality A component index Table of contents 3 6 9 9 InfoSphere QualityStage 10 InfoSphere Information Analyzer 12 InfoSphere Discovery 13 14 2 Do you have confidence

More information

GOVERNMENT GAZETTE REPUBLIC OF NAMIBIA

GOVERNMENT GAZETTE REPUBLIC OF NAMIBIA GOVERNMENT GAZETTE OF THE REPUBLIC OF NAMIBIA N$7.20 WINDHOEK - 7 October 2016 No. 6145 CONTENTS Page GENERAL NOTICE No. 406 Namibia Statistics Agency: Data quality standard for the purchase, capture,

More information

HEALTH INFORMATION INFRASTRUCTURE PROJECT: PROGRESS REPORT

HEALTH INFORMATION INFRASTRUCTURE PROJECT: PROGRESS REPORT HEALTH INFORMATION INFRASTRUCTURE PROJECT: PROGRESS REPORT HCQI Expert Group Meeting 7-8 November 2013 Agenda to improve health information infrastructure» In 2010, health ministers called for improvement

More information

CoE CENTRE of EXCELLENCE ON DATA WAREHOUSING

CoE CENTRE of EXCELLENCE ON DATA WAREHOUSING in partnership with Overall handbook to set up a S-DWH CoE: Deliverable: 4.6 Version: 3.1 Date: 3 November 2017 CoE CENTRE of EXCELLENCE ON DATA WAREHOUSING Handbook to set up a S-DWH 1 version 2.1 / 4

More information

Lecture 1. Chapter 6 Architectural design

Lecture 1. Chapter 6 Architectural design Chapter 6 Architectural Design Lecture 1 1 Topics covered Architectural design decisions Architectural views Architectural patterns Application architectures 2 Software architecture The design process

More information

IBM InfoSphere Information Server Version 8 Release 7. Reporting Guide SC

IBM InfoSphere Information Server Version 8 Release 7. Reporting Guide SC IBM InfoSphere Server Version 8 Release 7 Reporting Guide SC19-3472-00 IBM InfoSphere Server Version 8 Release 7 Reporting Guide SC19-3472-00 Note Before using this information and the product that it

More information

Data Clairvoyance. A business approach to data. Real data practitioners, delivering real improvements to your enterprise data assets.

Data Clairvoyance. A business approach to data. Real data practitioners, delivering real improvements to your enterprise data assets. Data Clairvoyance A business approach to data. A professional services firm that provides a very unique and holistic approach that enables your organization to be successful in traversing the data challenges

More information

OBJECTIVES DEFINITIONS CHAPTER 1: THE DATABASE ENVIRONMENT AND DEVELOPMENT PROCESS. Figure 1-1a Data in context

OBJECTIVES DEFINITIONS CHAPTER 1: THE DATABASE ENVIRONMENT AND DEVELOPMENT PROCESS. Figure 1-1a Data in context OBJECTIVES CHAPTER 1: THE DATABASE ENVIRONMENT AND DEVELOPMENT PROCESS Modern Database Management 11 th Edition Jeffrey A. Hoffer, V. Ramesh, Heikki Topi! Define terms! Name limitations of conventional

More information

Level 4 Diploma in Computing

Level 4 Diploma in Computing Level 4 Diploma in Computing 1 www.lsib.co.uk Objective of the qualification: It should available to everyone who is capable of reaching the required standards It should be free from any barriers that

More information

Database Management System Dr. S. Srinath Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No.

Database Management System Dr. S. Srinath Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No. Database Management System Dr. S. Srinath Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No. # 3 Relational Model Hello everyone, we have been looking into

More information

Emergency Compliance DG Special Case DAMA INDIANA

Emergency Compliance DG Special Case DAMA INDIANA 1 Emergency Compliance DG Special Case DAMA INDIANA Agenda 2 Overview of full-blown data governance (DG) program Emergency compliance with a specific regulation We'll use GDPR as an example What is GDPR

More information

Chapter 4. The Relational Model

Chapter 4. The Relational Model Chapter 4 The Relational Model Chapter 4 - Objectives Terminology of relational model. How tables are used to represent data. Connection between mathematical relations and relations in the relational model.

More information

CHAPTER 2: DATA MODELS

CHAPTER 2: DATA MODELS Database Systems Design Implementation and Management 12th Edition Coronel TEST BANK Full download at: https://testbankreal.com/download/database-systems-design-implementation-andmanagement-12th-edition-coronel-test-bank/

More information

Global, regional and national SDG follow-up and review processes. Yongyi Min UN Statistics Division/DESA

Global, regional and national SDG follow-up and review processes. Yongyi Min UN Statistics Division/DESA Global, regional and national SDG follow-up and review processes Yongyi Min UN Statistics Division/DESA Follow-up and reviews National voluntary presentations: with or without national indicators High-level

More information

EZY Intellect Pte. Ltd., #1 Changi North Street 1, Singapore

EZY Intellect Pte. Ltd., #1 Changi North Street 1, Singapore Oracle Database 12c: Performance Management and Tuning NEW Duration: 5 Days What you will learn In the Oracle Database 12c: Performance Management and Tuning course, learn about the performance analysis

More information

Distributed Database Systems By Syed Bakhtawar Shah Abid Lecturer in Computer Science

Distributed Database Systems By Syed Bakhtawar Shah Abid Lecturer in Computer Science Distributed Database Systems By Syed Bakhtawar Shah Abid Lecturer in Computer Science 1 Distributed Database Systems Basic concepts and Definitions Data Collection of facts and figures concerning an object

More information

DC Area Business Objects Crystal User Group (DCABOCUG) Data Warehouse Architectures for Business Intelligence Reporting.

DC Area Business Objects Crystal User Group (DCABOCUG) Data Warehouse Architectures for Business Intelligence Reporting. DC Area Business Objects Crystal User Group (DCABOCUG) Data Warehouse Architectures for Business Intelligence Reporting April 14, 2009 Whitemarsh Information Systems Corporation 2008 Althea Lane Bowie,

More information

qwertyuiopasdfghjklzxcvbnmqw ertyuiopasdfghjklzxcvbnmqwert uiopasdfghjklzxcvbnmqwertyuio

qwertyuiopasdfghjklzxcvbnmqw ertyuiopasdfghjklzxcvbnmqwert uiopasdfghjklzxcvbnmqwertyuio qwertyuiopasdfghjklzxcvbnmqw ertyuiopasdfghjklzxcvbnmqwert yuiopasdfghjklzxcvbnmqwertyui opasdfghjklzxcvbnmqwertyuiopa A Tutorial on Checking Data in a Database sdfghjklzxcvbnmqwertyuiopasdf DatabaseAnswers.org

More information

FEATURES BENEFITS SUPPORTED PLATFORMS. Reduce costs associated with testing data projects. Expedite time to market

FEATURES BENEFITS SUPPORTED PLATFORMS. Reduce costs associated with testing data projects. Expedite time to market E TL VALIDATOR DATA SHEET FEATURES BENEFITS SUPPORTED PLATFORMS ETL Testing Automation Data Quality Testing Flat File Testing Big Data Testing Data Integration Testing Wizard Based Test Creation No Custom

More information

Fundamentals of Database Systems (INSY2061)

Fundamentals of Database Systems (INSY2061) Fundamentals of Database Systems (INSY2061) 1 What the course is about? These days, organizations are considering data as one important resource like finance, human resource and time. The management of

More information

Luncheon Webinar Series April 25th, Governance for ETL Presented by Beate Porst Sponsored By:

Luncheon Webinar Series April 25th, Governance for ETL Presented by Beate Porst Sponsored By: Luncheon Webinar Series April 25th, 2014 Governance for ETL Presented by Beate Porst Sponsored By: 1 Governance for ETL Questions and suggestions regarding presentation topics? - send to editor@dsxchange.com

More information

CHAPTER 2: DATA MODELS

CHAPTER 2: DATA MODELS CHAPTER 2: DATA MODELS 1. A data model is usually graphical. PTS: 1 DIF: Difficulty: Easy REF: p.36 2. An implementation-ready data model needn't necessarily contain enforceable rules to guarantee the

More information

Jo-Anna Wood WHO Global Observatory for ehealth: research and resources for use in Australia HIC 2016

Jo-Anna Wood WHO Global Observatory for ehealth: research and resources for use in Australia HIC 2016 Jo-Anna Wood B.Comm, MA, CHIA Chair HISA Victoria Committee WHO Global Observatory for ehealth: research and resources for use in Australia Prepared by The Checkley Group www.checkley.com.au INTRODUCTION

More information

Chapter 6 Architectural Design. Chapter 6 Architectural design

Chapter 6 Architectural Design. Chapter 6 Architectural design Chapter 6 Architectural Design 1 Topics covered Architectural design decisions Architectural views Architectural patterns Application architectures 2 Software architecture The design process for identifying

More information

Relational Database Components

Relational Database Components Relational Database Components Chapter 2 Class 01: Relational Database Components 1 Class 01: Relational Database Components 2 Conceptual Database Design Components Class 01: Relational Database Components

More information

TDWI strives to provide course books that are content-rich and that serve as useful reference documents after a class has ended.

TDWI strives to provide course books that are content-rich and that serve as useful reference documents after a class has ended. Previews of TDWI course books are provided as an opportunity to see the quality of our material and help you to select the courses that best fit your needs. The previews can not be printed. TDWI strives

More information

Data Vault Brisbane User Group

Data Vault Brisbane User Group Data Vault Brisbane User Group 26-02-2013 Agenda Introductions A brief introduction to Data Vault Creating a Data Vault based Data Warehouse Comparisons with 3NF/Kimball When is it good for you? Examples

More information

Developing an integrated approach to the analysis of MOD cyber-related risks

Developing an integrated approach to the analysis of MOD cyber-related risks Developing an integrated approach to the analysis of MOD cyber-related risks James Tate, Colette Jeffery Joint Enablers Analysis Group 28 th July 2016 COVERING Overview 1. risk research 2. Customer requirement

More information

Metabase Metadata Management System Data Interoperability Need and Solution Characteristics

Metabase Metadata Management System Data Interoperability Need and Solution Characteristics Metabase Metadata Management System Data Interoperability and 2008 Althea Lane Bowie, Maryland 20716 Tele: 301-249-1142 Email: Whitemarsh@wiscorp.com Web: www.wiscorp.com : Interoperable business information

More information

Data Preprocessing. Slides by: Shree Jaswal

Data Preprocessing. Slides by: Shree Jaswal Data Preprocessing Slides by: Shree Jaswal Topics to be covered Why Preprocessing? Data Cleaning; Data Integration; Data Reduction: Attribute subset selection, Histograms, Clustering and Sampling; Data

More information

Data Boot Camp: Part II Enhancing Data Quality for Improvement. November 18, :00-1:00pm ET

Data Boot Camp: Part II Enhancing Data Quality for Improvement. November 18, :00-1:00pm ET Data Boot Camp: Part II Enhancing Data Quality for Improvement November 18, 2015 12:00-1:00pm ET Who s on the call today 4 Kaye Phillips, Senior Director, CFHI Trevor Strome, CFHI QI & Measurement Coach

More information

Sustainable Consumption and Production

Sustainable Consumption and Production Sustainable Consumption and Production Resolution 2/8 Charles Arden-Clarke Head, Secretariat 10 Year Framework of Programmes on Sustainable Consumption and Production/One Planet Network CPR Meeting 28

More information

Chapter 6 Architectural Design. Lecture 1. Chapter 6 Architectural design

Chapter 6 Architectural Design. Lecture 1. Chapter 6 Architectural design Chapter 6 Architectural Design Lecture 1 1 Topics covered ² Architectural design decisions ² Architectural views ² Architectural patterns ² Application architectures 2 Software architecture ² The design

More information

Webinar: federated interoperability solutions on Joinup how to maximize the value delivered?

Webinar: federated interoperability solutions on Joinup how to maximize the value delivered? Webinar: federated interoperability solutions on Joinup how to maximize the value delivered? Framework Contract DI/07171 Lot 2 ISA Action 4.2.4: European Federated Interoperability Repository 12 May 2015

More information

Data Collection & Industry Standards

Data Collection & Industry Standards Data Collection & Industry Standards (Chapter 8 Software Project Estimation) Alain Abran (Tutorial Contribution: Dr. Monica Villavicencio) 1 Copyright 2015 Alain Abran Topics covered 1. Introduction 2.

More information

NCHRP Project 20-97: Improving Findability and Relevance of Transportation Information. Part I: Project Overview Gordon Kennedy, Washington State DOT

NCHRP Project 20-97: Improving Findability and Relevance of Transportation Information. Part I: Project Overview Gordon Kennedy, Washington State DOT NCHRP Project 20-97: Improving Findability and Relevance of Transportation Information Part I: Project Overview Gordon Kennedy, Washington State DOT November, 2017 NCHRP is a State-Driven Program Sponsored

More information

IS4H TOOLKIT TOOL: Workshop on Developing a National ehealth Strategy (Workshop Template)

IS4H TOOLKIT TOOL: Workshop on Developing a National ehealth Strategy (Workshop Template) IS4H TOOLKIT TOOL: Workshop on Developing a National ehealth Strategy (Workshop Template) Department of Evidence and Intelligence for Action in Health PAHO/WHO Workshop on Developing a National ehealth

More information

Towards a Vocabulary for Data Quality Management in Semantic Web Architectures
