Improving CDISC Data Quality & Compliance Right from the Beginning
Bharat Chaudhary, Cytel
Padamsimh Balekundri, Cytel
Session CD08, PhUSE 2015, Vienna
Agenda
✓ Background
✓ Overview: Development
✓ The Problem: In Development
✓ Our Solution: Set of Utilities
✓ Summary
Background
✓ SDTM (Study Data Tabulation Model) benefits clinical data reviewers and analysts because it standardizes how the data are presented
✓ Complexity of the standards: risk of compliance issues and rework
✓ Validation tools and techniques: limitations
✓ Scope for improvement: ideas
Overview: Development
[Workflow diagram. Elements: aCRF, specification, program development, CDMS data, independent validation, datasets, OpenCDISC Validator]
The Problem: In Development
[The same workflow diagram, with REWORK marked on it]
Solution: What can be done?
[The same workflow diagram with REWORK, with points 1, 2 and 3 marked on it]
Steps to go through
✓ Rework Area
✓ Solution
✓ With & Without
Solution 1
Solution 1: Rework Area
✓ We might miss the basics while creating the specification (spec)
✓ We end up creating multiple versions of the spec
Solution 1: Solution
SDTM Spec Checker (SSC)
✓ Identify compliance issues at the spec level, even before programming starts
✓ Minimize the number of spec versions, which saves time and effort
Solution 1 (SSC): Framework
Technology: MS Excel
Inputs: Spec
Execution Time: < 1 min
[Diagram: the IG standards are compared with the metadata imported from the spec; the result is shown in a single-window view]
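The COMPARE step in the SSC framework can be pictured as a lookup of each spec variable against reference metadata. The sketch below is written in Python rather than the MS Excel tooling the slide names, and the reference entries, the field names (`label`, `type`) and the `check_spec` function are illustrative assumptions, not the actual SSC logic.

```python
# Hypothetical reference metadata keyed by (domain, variable).
# A real reference would be taken from the IG version in use.
IG_REFERENCE = {
    ("DM", "STUDYID"): {"label": "Study Identifier", "type": "Char"},
    ("DM", "USUBJID"): {"label": "Unique Subject Identifier", "type": "Char"},
    ("DM", "AGE"): {"label": "Age", "type": "Num"},
}


def check_spec(spec_rows, reference=IG_REFERENCE):
    """Compare spec metadata rows with the reference and list mismatches."""
    findings = []
    for row in spec_rows:
        key = (row["domain"], row["variable"])
        expected = reference.get(key)
        if expected is None:
            findings.append((key, "variable not found in reference"))
            continue
        for attr in ("label", "type"):
            if row[attr] != expected[attr]:
                findings.append(
                    (key, f"{attr}: spec has {row[attr]!r}, "
                          f"reference has {expected[attr]!r}")
                )
    return findings


# Hypothetical metadata as it might be imported from a mapping spec.
spec = [
    {"domain": "DM", "variable": "STUDYID", "label": "Study Identifier", "type": "Char"},
    {"domain": "DM", "variable": "AGE", "label": "Age", "type": "Char"},   # wrong type
    {"domain": "DM", "variable": "AGEX", "label": "Age", "type": "Num"},   # unknown variable
]

spec_findings = check_spec(spec)
for key, message in spec_findings:
    print(key, message)
```

Running the comparison before any program is written is what lets such issues surface at the spec stage instead of after validation.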
Solution 1 (SSC): Execution Step
1. Keep the mapping specification open/active
2. Run the Spec Checker
Solution 1 (SSC): Final Report
Solution 1 (SSC): With & Without
Average time taken to review a specification with 10 domains:
Without SSC: 3-4 hrs
With SSC: < 20 min
Solution 2
Solution 2: Rework Area
✓ Compliance and data consistency errors are not identified
✓ We end up doing multiple validation cycles
Solution 2: Solution
SDTM Dataset Checker (SDC)
✓ Identify data quality and compliance issues at the development stage
✓ Minimize the number of validation cycles, which saves time and effort
Solution 2 (SDC): Framework
Technology: SAS Macro
Inputs: SAS Dataset
Execution Time: < 5 min
[Diagram labels: datasets, SDC]
Solution 2 (SDC): Framework
Master Macro
✓ Has built-in checks
✓ These checks are applied to the domain first
Solution 2 (SDC): Framework
CDISC Master file
✓ It is a reference file
✓ Contains metadata for commonly used CDISC IG versions
Solution 2 (SDC): Framework
Checks file
✓ The checks are held in this Excel file
✓ Users can add any number of custom checks
Solution 2 (SDC): Framework
Study Specific macro
✓ User-defined checks
✓ For cases where the user wants to add a complex algorithm
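The three kinds of checks in the SDC framework (built-in checks, rows from a checks file, and study-specific code) can be sketched together as one runner. This is a Python illustration rather than the SAS macros the slides describe; the sample records, the check IDs and every function name here are hypothetical.

```python
# Records of one domain as a list of dicts (hypothetical sample data).
ae = [
    {"USUBJID": "S1-001", "AETERM": "HEADACHE", "AESEV": "MILD"},
    {"USUBJID": "S1-002", "AETERM": "", "AESEV": "SEVERE"},
    {"USUBJID": "", "AETERM": "NAUSEA", "AESEV": "HIGH"},
]


def required_not_blank(records, variable):
    """Built-in check (the master macro's role): row indexes with a blank value."""
    return [i for i, r in enumerate(records) if not r.get(variable)]


# Rows as they might be read from a checks file:
# (check id, variable, allowed values). Users could append rows freely.
CUSTOM_CHECKS = [
    ("CUST01", "AESEV", {"MILD", "MODERATE", "SEVERE"}),
]


def severe_needs_term(records):
    """Study-specific check: logic too complex for a row in the checks file."""
    return [
        ("STUDY01", i)
        for i, r in enumerate(records)
        if r.get("AESEV") == "SEVERE" and not r.get("AETERM")
    ]


def run_checks(records, required, custom_checks, study_checks=()):
    """Apply built-in checks first, then checks-file rows, then study code."""
    findings = []
    for variable in required:
        for i in required_not_blank(records, variable):
            findings.append((f"REQ_{variable}", i))
    for check_id, variable, allowed in custom_checks:
        for i, r in enumerate(records):
            if r.get(variable) and r[variable] not in allowed:
                findings.append((check_id, i))
    for check in study_checks:
        findings.extend(check(records))
    return findings


sdc_findings = run_checks(ae, ["USUBJID", "AETERM"], CUSTOM_CHECKS, [severe_needs_term])
for check_id, row_index in sdc_findings:
    print(check_id, "row", row_index)
```

Keeping simple checks as data and reserving code for the complex ones mirrors the split between the checks file and the study-specific macro.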
Solution 2 (SDC): Execution Step
To apply the checks to all domains in the sdtm library:
    %sdtm_chk(sdtm_v=3.1.2, lib=sdtm);
To apply the checks to a specific domain (e.g. LB):
    %sdtm_chk(sdtm_v=3.1.2, lib=sdtm, check=lb);
Solution 2 (SDC): Final Report
✓ Easy navigation and filtering make the issues easier to understand
Solution 2 (SDC): Real Life Example
✓ Date of an earlier visit > date of a later visit
✓ Duplicate records
✓ Unexpected results
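The first two findings above lend themselves to simple programmatic checks. The following Python sketch is an illustration only (the SDC itself is a SAS macro); it assumes complete ISO 8601 dates, and the sample records and function names are hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical visit records; partial dates are not handled in this sketch.
sv = [
    {"USUBJID": "S1-001", "VISITNUM": 1, "SVSTDTC": "2015-01-10"},
    {"USUBJID": "S1-001", "VISITNUM": 2, "SVSTDTC": "2015-01-05"},  # before visit 1
    {"USUBJID": "S1-001", "VISITNUM": 3, "SVSTDTC": "2015-02-01"},
    {"USUBJID": "S1-002", "VISITNUM": 1, "SVSTDTC": "2015-01-12"},
    {"USUBJID": "S1-002", "VISITNUM": 1, "SVSTDTC": "2015-01-12"},  # duplicate
]


def visits_out_of_order(records):
    """Flag consecutive visits where the earlier visit has the later date."""
    by_subject = {}
    for r in records:
        by_subject.setdefault(r["USUBJID"], []).append(r)
    issues = []
    for subject, visits in by_subject.items():
        ordered = sorted(visits, key=lambda r: r["VISITNUM"])
        for earlier, later in zip(ordered, ordered[1:]):
            if date.fromisoformat(earlier["SVSTDTC"]) > date.fromisoformat(later["SVSTDTC"]):
                issues.append((subject, earlier["VISITNUM"], later["VISITNUM"]))
    return issues


def duplicate_keys(records, keys):
    """Return key combinations that occur on more than one record."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return [key for key, n in counts.items() if n > 1]


order_issues = visits_out_of_order(sv)
duplicates = duplicate_keys(sv, ["USUBJID", "VISITNUM"])
print(order_issues)
print(duplicates)
```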
Solution 2 (SDC): Real Life Example
✓ Non-missing --ORRES and --ORRESU, but --STRESC is blank
✓ Incorrect order of variables: USUBJID before DOMAIN
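Both findings on this slide can be expressed as short checks. The sketch below is in Python for illustration (not the SAS implementation), with hypothetical sample data and function names; the domain prefix is passed in so that `--ORRES` resolves to, for example, `LBORRES`.

```python
# Hypothetical column order and records for an LB dataset.
lb_columns = ["STUDYID", "USUBJID", "DOMAIN", "LBSEQ",
              "LBORRES", "LBORRESU", "LBSTRESC"]
lb = [
    {"LBSEQ": 1, "LBORRES": "5.1", "LBORRESU": "mmol/L", "LBSTRESC": "5.1"},
    {"LBSEQ": 2, "LBORRES": "98", "LBORRESU": "mg/dL", "LBSTRESC": ""},
]


def missing_stresc(records, prefix):
    """Sequence numbers where --ORRES and --ORRESU are populated but --STRESC is blank."""
    return [
        r[f"{prefix}SEQ"]
        for r in records
        if r[f"{prefix}ORRES"] and r[f"{prefix}ORRESU"] and not r[f"{prefix}STRESC"]
    ]


def order_violations(columns, expected_order):
    """Pairs (a, b) where a should precede b but appears after it in the dataset."""
    present = [v for v in expected_order if v in columns]
    return [
        (a, b)
        for a, b in zip(present, present[1:])
        if columns.index(a) > columns.index(b)
    ]


blank_results = missing_stresc(lb, "LB")
misordered = order_violations(lb_columns, ["STUDYID", "DOMAIN", "USUBJID"])
print(blank_results)
print(misordered)
```

Here `("DOMAIN", "USUBJID")` in the result means DOMAIN was expected ahead of USUBJID but was found after it.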
Solution 2 (SDC): With & Without
Average time taken to review a SAS dataset for 1 domain:
Without SDC: 2-3 hrs
With SDC: < 25 min
Solution 3
Solution 3: Rework Area
✓ Repeated runs of the OpenCDISC Validator for the same study
✓ New issues can be missed each time we run the validator
Solution 3: Solution
OpenCDISC Report Reviewer (OCR)
✓ It saves time on multiple report reviews for the same study by importing and aligning comments from the previous/first report and by identifying new errors/warnings
Solution 3 (OCR): Framework
Technology: MS Excel-VBA Macro
Inputs: Reviewed OpenCDISC Report
Execution Time: < 2 min
[Diagram labels: Reviewed Report, OCR, Importing Comments, New Summarized Report]
Solution 3 (OCR): Execution Step
✓ First reviewed OpenCDISC report
✓ New report
Solution 3 (OCR): Execution Step
✓ OCR imports the comments from the previous report and aligns them with the new report
✓ Filter out the new errors and address them, instead of reviewing the entire report
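The comment import and the new-error filter can be sketched as a keyed lookup between two reports. This is a Python illustration, not the Excel-VBA macro itself; the matching key (domain, rule, variable), the placeholder rule IDs and the `carry_over_comments` function are assumptions made for the example.

```python
# Hypothetical rows from a previously reviewed report, with reviewer comments.
previous_report = [
    {"domain": "DM", "rule": "RULE_A", "variable": "RACE",
     "comment": "Confirmed with data management"},
    {"domain": "LB", "rule": "RULE_B", "variable": "LBSTRESC",
     "comment": "To be fixed in the next transfer"},
]

# Hypothetical rows from a fresh validator run on the same study.
new_report = [
    {"domain": "DM", "rule": "RULE_A", "variable": "RACE"},
    {"domain": "AE", "rule": "RULE_C", "variable": "AESEV"},
]


def carry_over_comments(previous, new, key_fields=("domain", "rule", "variable")):
    """Copy comments onto matching rows of the new report and flag new issues."""
    comments = {tuple(r[f] for f in key_fields): r["comment"] for r in previous}
    aligned = []
    for row in new:
        key = tuple(row[f] for f in key_fields)
        aligned.append({
            **row,
            "comment": comments.get(key, ""),
            "is_new": key not in comments,
        })
    return aligned


aligned_report = carry_over_comments(previous_report, new_report)
new_issues = [r for r in aligned_report if r["is_new"]]
for row in new_issues:
    print(row["domain"], row["rule"], row["variable"])
```

Only the rows flagged `is_new` need a fresh review; the rest already carry the earlier comment.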
Solution 3 (OCR): Final Report
✓ The OCR utility also generates a summary report
Solution 3 (OCR): With & Without
Average time taken to review new errors in OpenCDISC reports:
Without OCR: 2-3 hrs
With OCR: < 2 min
Summary
[The workflow diagram with REWORK, with the three utilities SSC, SDC and OCR placed on it]
Questions?
Thank you!!
bharat.chaudhary@cytel.com
padamsimh.balekundri@cytel.com