Hanming Tu, Accenture, Berwyn, USA

Similar documents
Metadata Driven Automation in Drug Research

Implementing CDISC Using SAS. Full book available for purchase here.

2 nd ehs Workshop, CC-IN2P3 14 th October 2015, Lyon,

From Implementing CDISC Using SAS. Full book available for purchase here. About This Book... xi About The Authors... xvii Acknowledgments...

SAS offers technology to facilitate working with CDISC standards : the metadata perspective.

IS03: An Introduction to SDTM Part II. Jennie Mc Guirk

Study Data Reviewer s Guide Completion Guideline

CDISC Standards End-to-End: Enabling QbD in Data Management Sam Hume

Standards Driven Innovation

CDASH Standards and EDC CRF Library. Guang-liang Wang September 18, Q3 DCDISC Meeting

Conversion of Company Standard Trial Data to SDTM

CDISC SDTM and ADaM Real World Issues

Automated Creation of Submission-Ready Artifacts Silas McKee, Accenture, Pennsylvania, USA Lourdes Devenney, Accenture, Pennsylvania, USA

Improving Metadata Compliance and Assessing Quality Metrics with a Standards Library

How a Metadata Repository enables dynamism and automation in SDTM-like dataset generation

SDTM-ETL TM. New features in version 1.6. Author: Jozef Aerts XML4Pharma July SDTM-ETL TM : New features in v.1.6

Paper FC02. SDTM, Plus or Minus. Barry R. Cohen, Octagon Research Solutions, Wayne, PA

CDISC Standards and the Semantic Web

Dataset-XML - A New CDISC Standard

Automation of SDTM Programming in Oncology Disease Response Domain Yiwen Wang, Yu Cheng, Ju Chen Eli Lilly and Company, China

Mapping and Terminology. English Speaking CDISC User Group Meeting on 13-Mar-08

Developing an Integrated Platform

Legacy to SDTM Conversion Workshop: Tools and Techniques

AUTOMATED CREATION OF SUBMISSION-READY ARTIFACTS SILAS MCKEE

The Wonderful World of Define.xml.. Practical Uses Today. Mark Wheeldon, CEO, Formedix DC User Group, Washington, 9 th December 2008

Standards Metadata Management (System)

CDASH MODEL 1.0 AND CDASHIG 2.0. Kathleen Mellars Special Thanks to the CDASH Model and CDASHIG Teams

Implementing CDISC at Boehringer Ingelheim

Study Data Reviewer s Guide

Hands-On ADaM ADAE Development Sandra Minjoe, Accenture Life Sciences, Wayne, Pennsylvania Kim Minkalis, Accenture Life Sciences, Wayne, Pennsylvania

Hands-On ADaM ADAE Development Sandra Minjoe, Accenture Life Sciences, Wayne, Pennsylvania

AZ CDISC Implementation

Introduction to ADaM and What s new in ADaM

Customizing SAS Data Integration Studio to Generate CDISC Compliant SDTM 3.1 Domains

Why organizations need MDR system to manage clinical metadata?

How to write ADaM specifications like a ninja.

Study Data Tabulation Model Implementation Guide: Human Clinical Trials Prepared by the CDISC Submission Data Standards Team

Paper DS07 PhUSE 2017 CDISC Transport Standards - A Glance. Giri Balasubramanian, PRA Health Sciences Edwin Ponraj Thangarajan, PRA Health Sciences

Managing CDISC version changes: how & when to implement? Presented by Lauren Shinaberry, Project Manager Business & Decision Life Sciences

Standardizing FDA Data to Improve Success in Pediatric Drug Development

Managing your metadata efficiently - a structured way to organise and frontload your analysis and submission data

CDISC Public Webinar Standards Updates and Additions. 26 Feb 2015

Taming Rave: How to control data collection standards?

Considerations on creation of SDTM datasets for extended studies

Advantages of a real end-to-end approach with CDISC standards

Converting Data to the SDTM Standard Using SAS Data Integration Studio

CS05 Creating define.xml from a SAS program

The application of SDTM in a disease (oncology)-oriented organization

From SDTM to displays, through ADaM & Analyses Results Metadata, a flight on board METADATA Airlines

Now let s take a look

An Alternate Way to Create the Standard SDTM Domains

Data Integration and ETL with Oracle Warehouse Builder

PharmaSUG Paper CD16

PharmaSUG 2014 PO16. Category CDASH SDTM ADaM. Submission in standardized tabular form. Structure Flexible Rigid Flexible * No Yes Yes

CDISC Implementation, Real World Applications

Application of SDTM Trial Design at GSK. 9 th of December 2010

Semantic Technologies and CDISC Standards. Frederik Malfait, Information Architect, IMOS Consulting Scott Bahlavooni, Independent

Study Composer: a CRF design tool enabling the re-use of CDISC define.xml metadata

Challenges with the interpreta/on of CDISC - Who can we trust?

Pharmaceutical Applications

R1 Test Case that tests this Requirement Comments Manage Users User Role Management

SAS Clinical Data Integration 2.4

CDISC Implementation Step by Step: A Real World Example

Less is more - A visionary View on the Future of CDISC Standards

SAS Clinical Data Integration Server 2.1

SAS Clinical Data Integration 2.6

OpenCDISC Validator 1.4 What s New?

esubmission - Are you really Compliant?

Considerations in Data Modeling when Creating Supplemental Qualifiers Datasets in SDTM-Based Submissions

Pharmaceuticals, Health Care, and Life Sciences. An Approach to CDISC SDTM Implementation for Clinical Trials Data

SDTM Implementation Guide Clear as Mud: Strategies for Developing Consistent Company Standards

The Benefits of Traceability Beyond Just From SDTM to ADaM in CDISC Standards Maggie Ci Jiang, Teva Pharmaceuticals, Great Valley, PA

What is high quality study metadata?

Best Practices for E2E DB build process and Efficiency on CDASH to SDTM data Tao Yang, FMD K&L, Nanjing, China

PhUSE EU Connect Paper PP15. Stop Copying CDISC Standards. Craig Parry, SyneQuaNon, Diss, England

CDISC Migra+on. PhUSE 2010 Berlin. 47 of the top 50 biopharmaceu+cal firms use Cytel sofware to design, simulate and analyze their clinical studies.

ABSTRACT INTRODUCTION WHERE TO START? 1. DATA CHECK FOR CONSISTENCIES

Edwin Ponraj Thangarajan, PRA Health Sciences, Chennai, India Giri Balasubramanian, PRA Health Sciences, Chennai, India

ISO Information and documentation Digital records conversion and migration process

SDTM Automation with Standard CRF Pages Taylor Markway, SCRI Development Innovations, Carrboro, NC

PharmaSUG2014 Paper DS09

1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda

ODM The Operational Efficiency Model: Using ODM to Deliver Proven Cost and Time Savings in Study Set-up

PharmaSUG Paper PO21

BUSINESS-BASED VALUE IN AN MDR

AUTOMATED SDTM CREATION AND DISCREPANCY DETECTION JOBS: THE NUMBERS TELL THE TALE. Joris De Bondt PhUSE Conference Oct 2014

Techno Expert Solutions An institute for specialized studies!

Call: SAS BI Course Content:35-40hours

PhUSE Paper SD09. "Overnight" Conversion to SDTM Datasets Ready for SDTM Submission Niels Mathiesen, mathiesen & mathiesen, Basel, Switzerland

Harmonizing CDISC Data Standards across Companies: A Practical Overview with Examples

Hyperion Interactive Reporting Reports & Dashboards Essentials

Migrating Express Applications To Oracle 9i A Practical Guide

An Efficient Solution to Efficacy ADaM Design and Implementation

The Submission Data File System Automating the Creation of CDISC SDTM and ADaM Datasets

Updates on CDISC Standards Validation

The development of standards management using EntimICE-AZ

Customer oriented CDISC implementation

Data Integrity through DEFINE.PDF and DEFINE.XML

DIA 11234: CDER Data Standards Common Issues Document webinar questions

Improving CDISC SDTM Data Quality & Compliance Right from the Beginning

Transcription:

Hanming Tu, Accenture, Berwyn, USA

Agenda Issue Statement Create Mapping Build Reusable Codes Define Repeatable Workflow Check compliance Conclusion Copyright 2016 Accenture. All rights reserved. 2

Issue Statement Project Triangle CDISC SDTM Data Conversion is part of an ETL (extraction, transformation and loading) process It requires to develop new ETL packages for each study and produces the same data structure and maintain correct relationships for all the studies How could we generate ETL packages that produce the quality SDTM data sets from vastly heterogeneous sources with high efficiency? Copyright 2016 Accenture. All rights reserved. 3

Issue Statement Moving targets The CDISC SDTM model is the target but it is still evolving How do you manage the change of your target? How do you reuse the map, codes and process that you have developed? How to maintain or increase the quality while the target (scope), schedule (time) and budget (cost) may be still in flux. Copyright 2016 Accenture. All rights reserved. 4

Four Major Steps Create Mapping Build Reusable Codes Define Repeatable Workflow Check compliance Copyright 2016 Accenture. All rights reserved. 5

Create Mapping Map Source to Target Mapping data sets to domains Mapping variables to columns Mapping specification Copyright 2016 Accenture. All rights reserved. 6

Mapping at Domain Level Four types of SDTM Mapping Simple match: one to one Single dataset matches with single SDTM domain Simple but very rare Only if clinical data management system (CDMS) is SDTM compliance but still not 100% Example domains o Trial Arms -- TA o Trial Elements -- TE o Trial Visits -- TV o Trial Inclusion/Exclusion Criteria -- TI o Trial Summary TS Converging match: many to one Many to one: records from many different datasets merged into one SDTM domain Complex and common Represent dependent relationship Example SDTM domains or classes o RELREC: AE, CM, LB, MB, MS through studyid, rdomain, usubjid, idvar, idvarval, reltype, relid columns o SUPPQUAL: SuppAE, SuppCM, SuppDM, SuppEG, SuppEX, SuppMH, SuppPE, etc Topic based match: domain aggregation Topic based match: Datasets with topic variables will be consolidated into SDTM observation classes o The TRT topic domains to Interventions o The TERM topic domains to Events o The TESTCD topic domains to Findings This has to be done for every study; then all the studies are pooled into one in the multi-study project. Diverging match: one to many One to many: records from one dataset split into many SDTM domains Common and complex Example SDTM domains o Demographics DM è DM, DS, SC, SUPPDM o Laboratory Test Results LB è LB, CO, SUPPLB Copyright 2016 Accenture. All rights reserved. 7

Mapping at Variable Level Mapping variables to columns Data transformation Decoding or encoding: translating, controlled terms, lookup value to code or vice versa Combining or splitting: many fields to a column or vice versa Transposing or pivoting: rows to columns or vice versa Selecting or filtering: variables or records Aggregating or deriving: new variables Generating: surrogate-keys Metadata transformation Name: matched or renamed Data type: matched or casted Length: contained or split Label: matched or changed Map Specification Example Copyright 2016 Accenture. All rights reserved. 8

Build Reusable Codes Important considerations for increasing the code reusability Standard adoption is the key for code reusability Train people to understand the standards Define standard templates Build public libraries for code snippets and public transformation: Custom functions, procedures and packages; public data rules; and public Experts Group code snippets and functional transformation into modular mapping and transformation: pluggable maps Define workflow to govern the process: Workflow Manager and Process Flows Copyright 2016 Accenture. All rights reserved. 9

Build Reusable Codes Key consideration: Metadata Metadata-driven process is the key for automation Metadata makes data meaningful Metadata is machine readable Metadata is the base for automation Metadata model used in the system Copyright 2016 Accenture. All rights reserved. 10

Define Workflows: Process Repeatability How to link reusable silo codes into a repeatable process? Built an automatic data conversion system to link the pieces in data integration and standardization process. Provides the workflow to link maps to form a controlled data flow, from vertical code reusability to horizontal process repeatability. The data conversion portal provides a web-based software tool for the Data Integration and Standardization (DIS) Department to use. The portal improves and accelerates the Data Conversion Development (DCD) process. Copyright 2016 Accenture. All rights reserved. 11

Define Workflows: Features Features implemented in the system Load and store map specification in relational database Manage workspace with client, project, study, and specification hierarchy Allow users to create and delete intermittent views and tables that we used in mapping Run data conversion jobs by domain or by a group of domains or all Link and copy tables from an Oracle database or use the tables created and loaded through SAS upload utility Keep audit trails for each job Track the performance of each job Copyright 2016 Accenture. All rights reserved. 12

Build Repeatable Process Intelligence based automation: is it fast? Extract common components Ø Build a public code library o Transformation o Utilities: functions, procedures, packages, pluggable maps, workflows Ø Build metadata repository: o SDTM data model o Controlled terminologies o Specification lookup tables: mapping intelligence Ø Create a base project o Common modules o Public locations (db links) o o Ø Code: Replication Efficiency: Automation Standards Metadata Build subsequent projects Ø Create location linking to metadata repository o Import public utilities: transformation, data rules and experts o Copy the base project and modules Ø Process: Automation Use or create utilities to replicate the process: Script for Project Initial Set Up, Mapping Specification, Mapping Creation Use analytics tool to identify the areas for replication and automation: Data Profiling & Data Rules for Source Data Review / Edit Checks Copyright 2016 Accenture. All rights reserved. 13

Approach Comparison Comparison among the approaches and tools Traditional Approach: No standard/no metadata ETL using custom programming such as SAS, PL/SQL, JAVA, Perl, etc. High paid programmers OWB Approach: Standards/No metadata ETL with User Interface Users do not need to know the programming language PL/ SQL Accenture Approach: Standards/Metadata Web-based User Interface No PL/SQL programming is needed No audit trail In an audited environment In an audited environment with validated products No security Built-in security: database and OWB security Authenticated and authorized users only with audit trails No consistence among coding Consistence with all the users Automatic and consistent coding Difficult to manage and support Easy to manage and support Easy to manage and support Scalability: silo and not scale; through adding more manpower Scalability through hardware and software Very scalable Copyright 2016 Accenture. All rights reserved. 14

Check the Compliance The quality questions: is it good? How could we verify whether the data sets compliance to the standards or the defined quality? How do we measure the quality of the work? We need to have some sort of governance in place to ensure the quality of data sets and set of metadata to describe the quality of the data. Clinical data quality information is semantic information about clinical data quality (CDQ), including how a clinical trial is conducted and how the data elements are collected, entered, processed, and analyzed. CDQ concerns not just data accuracy, but also data traceability and compliance against common standards. Copyright 2016 Accenture. All rights reserved. 15

Check the Compliance Compliance Standards System-based approach Ø Paper-based (DDE): o Source-to-database error rates: 976 per 10,000 fields o CRF-to-database: 14 errors per 10,000 fields Ø Fax-Based (OCR) o Re-fax Rate: 5~20% Ø EDC systems o Source-to-database: 50 errors per 10,000 fields Standard-based approach Ø Process standards: GxP where x=l, C, M, etc. Ø Data standards: o Structure: CDISC ODM, CDASH o Value (metadata and controlled terminology): ISO 11179 IT -- Metadata registries (MDR) Organizations ISO/IEC Software System ISO 20943 - IT -- Procedures for Achieving Metadata Registry Content Consistency ISO 21090 - Healthcare Data types ISO 23081 Records management CWM Data warehousing; RDF Web resources; DIF Scientific data sets CDISC CSHARE o Content: CDISC SDTM Ø Verification standards: o Validation checks: WebSDM checks o JANUS checks: FDA checks for data loading o Severity Definition: Low, Medium and High Run through compliance checks Ø Consistency: Checks data in 2 or more columns to ensure data correspond in cross-column (visit number without visit description, age unit without age), crossdomain (SUBJID in a domain but not in DM) or crosssystem (external dictionary). Ø Format: Checks if data are in an allowable format such as ISO 8601; Leading and trailing spaces; and missing value. in character column Ø Limit: Checks if data are within range such as start/end time, and toxicity grade. Ø Metadata: Checks if tables and columns have valid metadata Ø Presence: Checks data that are missing or present Ø Referential: Checks if a described table/record data relation are valid; SDTM is not 3NF, referential information is stored as data (RDOMAIN, USUBJID, IDVAR, IDVARVAL), in Supplemental Qualifiers (SUPPQUAL), related records (RELREC) and comments(co). Ø Value: Checks data against valid values such as code lists, Illegal values Copyright 2016 Accenture. All rights reserved. 16

Check Compliance Intelligent automation: is it smart? Ø Ensure data quality Ø Provide insights for risk-based monitoring Ø Provide insights for redesigning CRFs Ø Provide insights for transformation automation Copyright 2016 Accenture. All rights reserved. 17

Conclusion: Efficiency Level Matrix Standard-based systems allow for integration while metadata-driven systems enable automation; The more intelligence collected about the clinical data, the more integration could be; Standard Based Integration List of Efficient Levels: Level of Code Reusability H M Integration 3 2 Intelligence 6 5 Intelligence 9 8 1 2 3 4 5 No code reuse double programming for every study Code reuse at function level Code reuse at module level and company code standard and reusable code library exists Adopted standards in part of the process in silo systems Adopted standards in some part of the process with some code reusability 6 Adopted standard enable the high reusability in the process L 1 4 7 7 Adopted standard but have not build up metadata and no intelligence Low Medium High L M H Metadata Driven Level of Process Repeatability 8 9 Adopted standards and start learning the insight of the process Adopted standard in all parts and with learned intelligence applied to the process Copyright 2016 Accenture. All rights reserved. 18

Further automation through Intelligent Data Flow Copyright 2016 Accenture. All rights reserved. 19

Q & A Questions and Answers Contact Information Hanming Tu P: 610-407-1817; C: 484-881-2384 E: hanming.h.tu@accenture.com Address: 1160 West Swedesford Road, Berwyn, PA 19312, USA Web: www.accenture.com Fax: 610-535-6615 Dave Evans P: 484-881-2411 E: dave.a.evans@accenture.com Copyright 2016 Accenture. All rights reserved. 20