Data Consistency and Quality Issues in SEND Datasets

PointCross has reviewed numerous SEND datasets prepared for test submissions to the FDA and has worked with the FDA on its KickStart program to ensure that received datasets are suitable for pharm/tox review. What we are observing is that creating a dataset that is technically compliant (i.e., one that passes the validation rules) is a low bar: it is necessary, but not sufficient, to meet FDA review requirements. In our experience, sponsors and CROs are having a much harder time ensuring that SEND datasets are free of data quality issues that affect fitness for review. We have routinely observed SEND datasets that are inconsistent with the accompanying study report and that contain insufficient data for pharm/tox review, even though these data were often collected and tabulated in the study report. These issues are serious and can delay the review and approval process.

Technical Compliance Issues

Violations of the CDISC SEND standard or of the FDA-specific SEND validation rules prevent a SEND dataset from being loaded into the FDA's NIMS. For example, a critical piece of missing information can disrupt data loading. We believe that simple formatting issues can be resolved with a little experience and self-run validations. Examples of technical compliance issues beyond simple formatting that we have observed include:

- Syntax errors in datasets or in the Define.xml that prevent validation or loading into the FDA's NIMS
- Coded values that are not defined within the SEND dataset (a self-run check for this issue is sketched at the end of this section)

Data Consistency, Quality and Sufficiency Issues

Technical conformance to the SEND standards should not be a sponsor's only concern when creating SEND datasets. Frequently, the greatest challenge for companies is ensuring that the SEND dataset is consistent and sufficient for FDA review. These issues often result from relying on two separate processes to generate the study report and the SEND dataset (see the figure below). Because the data typically originate in multiple LIMS systems with proprietary data models, the sponsors or CROs creating SEND datasets must reconcile disparate terminologies, units, groupings, and coded data sources. As a result, inconsistencies can arise between the study report and the SEND datasets. If FDA reviewers see a signal of interest or calculate a group summary that does not match what is reported in the study report, the discrepancy will lead them to question the trustworthiness of the dataset. The review process may be disrupted until such data issues are resolved; in cases of extreme discrepancies, a submission could be put at risk.
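Group-summary discrepancies of this kind can be screened for before submission by recomputing the summaries directly from the SEND data. Below is a minimal sketch in Python, assuming pandas is available; the file names, the terminal body weight (TERMBW) selection, and the report_means values are illustrative, the latter standing in for numbers transcribed from the study report's summary tables:

    import pandas as pd

    bw = pd.read_sas("bw.xpt", format="xport", encoding="ascii")
    dm = pd.read_sas("dm.xpt", format="xport", encoding="ascii")

    # Recompute terminal body weight group means from the SEND dataset...
    merged = bw.merge(dm[["USUBJID", "SETCD"]], on="USUBJID")
    terminal = merged[merged["BWTESTCD"] == "TERMBW"]
    send_means = terminal.groupby("SETCD")["BWSTRESN"].mean().round(1)

    # ...and compare them to the group means tabulated in the study report.
    report_means = {"1": 412.3, "2": 398.7, "3": 371.2}  # hypothetical report values
    for setcd, reported in report_means.items():
        recomputed = send_means.get(setcd)
        if recomputed is None or abs(recomputed - reported) > 0.05:
            print(f"Trial set {setcd}: SEND mean {recomputed} vs report mean {reported}")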
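The coded-values issue listed under Technical Compliance Issues above lends itself to the same kind of self-run check. A minimal sketch, assuming a Define-XML 2.0 file (which is built on the CDISC ODM namespace) and a codelist named LBTESTCD; the file and codelist names are assumptions:

    import xml.etree.ElementTree as ET
    import pandas as pd

    ODM = "{http://www.cdisc.org/ns/odm/v1.3}"  # Define-XML is built on ODM

    def codelist_terms(define_path, codelist_name):
        """Collect the coded values of one named CodeList in a Define.xml."""
        terms = set()
        for cl in ET.parse(define_path).getroot().iter(ODM + "CodeList"):
            if cl.get("Name") == codelist_name:
                for item in cl.iter(ODM + "CodeListItem"):
                    terms.add(item.get("CodedValue"))
                for item in cl.iter(ODM + "EnumeratedItem"):
                    terms.add(item.get("CodedValue"))
        return terms

    lb = pd.read_sas("lb.xpt", format="xport", encoding="ascii")
    defined = codelist_terms("define.xml", "LBTESTCD")  # codelist name is an assumption
    undefined = sorted(set(lb["LBTESTCD"].dropna().unique()) - defined)
    if undefined:
        print("Coded values used in LB but not defined in Define.xml:", undefined)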

Examples of data quality and consistency issues we have observed include:

- Inconsistencies between trial set labels and study report groups, which can confuse reviewers.
- Incorrectly parsed modifiers for qualitative data, which can result in the misinterpretation of a finding.
- SEND datasets that lack data which are present in the study report and could have been reported in SEND. This causes reviewers to question why the data were not included, and it delays the review process.
- Individual data elements in the SEND dataset that are missing or misaligned relative to the matching data in the study report, which may cause the reviewer to question the quality of the SEND dataset or of the report.
- Specimen information missing from SEND domains such as Laboratory Test Results (LB) and Macroscopic Findings (MA), which makes the data difficult to review and interpret (a simple completeness check is sketched below this list).
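The last issue in this list, missing specimen information, is easy to screen for mechanically. A minimal sketch for the LB domain; the file name is illustrative, and the same check applies to MASPEC in the MA domain:

    import pandas as pd

    # Flag LB records with no specimen recorded.
    lb = pd.read_sas("lb.xpt", format="xport", encoding="ascii")
    no_spec = lb[lb["LBSPEC"].isna() | (lb["LBSPEC"].str.strip() == "")]
    if len(no_spec):
        print(len(no_spec), "LB records are missing an LBSPEC specimen value")
        print(no_spec[["USUBJID", "LBTESTCD", "LBDY"]].head())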

Common Causes of Data Inconsistency in SEND Datasets

There are several points in the process of generating SEND datasets where issues related to consistency, quality, and sufficiency can occur. Examples include:

- Establishing the SEND trial design domains (TS, TA, TE, TX, EX) from the study protocol.
- Transforming data from LIMS terminology to SEND Controlled Terminology.
- Standardizing qualitative findings into SEND.
- Merging laboratory findings data from multiple providers into a single domain (LB, for example).
- Harmonizing naming conventions, data, and terminologies from multiple data providers.
- Re-packing the required reportable data into the SEND tabulation format.
- Generating the Define.xml and the Study Data Reviewer's Guide (SDRG).

Except for the extraction from the LIMS systems, all of these processes differ from those used to tabulate data for the study report. Even LIMS extraction is fraught with risk: data managers may not extract all of the data required to create a SEND dataset that is equivalent to the data in the study report. Transforming LIMS terminology to SEND controlled terminology (CT) may introduce errors because the study director and the other toxicologists or pathologists are unfamiliar with the SEND CT. Finally, because many sponsors are unaware that the same data can be modeled differently by their different CROs, they may not question the decisions their CROs make until it is too late in the process. Ultimately, accountability for the SEND dataset rests with sponsors, as they are the ones signing off on the submission to the FDA.

Trial Design Variations

The study design, sponsor-defined groups, and group summary data in the study report are designed for readability by a toxicologist or reviewer. The trial design of a SEND dataset is designed to be machine-readable, and it is very granular: it accounts for every variation in the trial arm of each dose group, leading to more trial sets of fewer and more narrowly similar subjects. This granularity makes reconciling the subject IDs used across the various LIMS systems even more challenging, and it introduces further difficulties when trying to ensure that the SEND dataset is equivalent to the study report and when conducting data quality checks.

Terminology Differences

CROs and labs are familiar with the terminology in their study reports and in their LIMS systems. SEND controlled terminology (CT), however, can be unfamiliar, and converting LIMS data to SEND can inject new errors or cause difficulties when comparing the SEND data to the study report. This increases the quality assurance burden both for the CROs delivering the standardized SEND data and for the sponsors receiving it.
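One way to contain this class of error is to keep a single curated LIMS-to-SEND terminology map and to fail loudly on unmapped terms instead of guessing. A minimal sketch; the LIMS terms and their mappings are illustrative, and a real map must be reviewed against the published SEND CT:

    # A curated LIMS-to-SEND map; entries shown are illustrative only.
    LIMS_TO_SEND_CT = {
        "ALT (SGPT)": "ALT",
        "Total Bilirubin": "BILI",
        "White Cells": "WBC",
    }

    def to_send_ct(lims_term):
        """Translate one LIMS test name to its SEND LBTESTCD value."""
        try:
            return LIMS_TO_SEND_CT[lims_term]
        except KeyError:
            # Fail loudly on unmapped terms so terminology gaps are caught
            # and reviewed by a person, not silently guessed at.
            raise ValueError("No SEND CT mapping for LIMS term: " + repr(lims_term))

Failing on the first unmapped term forces every terminology gap through human review before the dataset is assembled, rather than after a reviewer finds the inconsistency.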

Standardizing Qualitative Findings Domains

Histopathology domains such as MA and MI are not adequately covered by CDISC CT, and the modifiers and qualifiers must be parsed into a structured form that does not exist in the LIMS data (a parsing sketch appears at the end of this section). Without a semantically enabled tool, accompanied by expert human oversight, this process is prone to generating incidence summaries that are inconsistent between the SEND dataset and the study report. This, in turn, can alarm reviewers and delay the review process.

Ensuring that Required Reportable Data is Reported in SEND

If the business decisions about which SEND domains and variables are or are not reported are not made clear in the SDRG, the data prepared for submission will likely not meet reviewer expectations.

Computing Derived Values Included in SEND 3.1

SEND IG v3.1 imposes calculations for certain derived values to support machine readability and the presentation of data. These values are not available in LIMS systems and must be calculated from the collected data. An example is Nominal Day (NOMDY), which groups measurements that may have been taken over a range of study days into a single nominal day that can be used for group summary calculation, graphing, and tabulation. NOMDY is calculated from the date/time stamps in the LIMS system. Deriving such values manually is a source of error that is best avoided through automation. Although these are not yet required variables, most sponsors will insist that they be provided by their CROs or internal data providers. It is also important that these values be aligned with the presentation of data in the study report.
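As an illustration of why this derivation is best automated, here is a minimal sketch of a NOMDY lookup; the collection windows are hypothetical and would in practice be taken from the protocol's schedule of assessments:

    # Protocol-scheduled collection windows (hypothetical boundaries).
    NOMINAL_WINDOWS = [  # (first study day, last study day, nominal day)
        (1, 1, 1),
        (26, 30, 28),
        (89, 93, 91),
    ]

    def nominal_day(study_day):
        """Map an actual study day (--DY) onto the scheduled nominal day (--NOMDY)."""
        for first, last, nomdy in NOMINAL_WINDOWS:
            if first <= study_day <= last:
                return nomdy
        return None  # outside every scheduled window; flag for human review

    assert nominal_day(27) == 28  # a day-27 collection reports to nominal day 28

Encoding the windows once, alongside the protocol schedule, helps keep the SEND derivation aligned with the presentation of the data in the study report.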
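The modifier-parsing step described under Standardizing Qualitative Findings Domains above can be supported by the same kind of small, transparent automation, with expert pathology review of the output. A minimal sketch, assuming composite LIMS strings of the form "finding, modifier, severity"; the split rule and the severity vocabulary are illustrative:

    SEVERITIES = {"MINIMAL", "MILD", "MODERATE", "MARKED", "SEVERE"}

    def parse_mi_finding(raw):
        """Split a composite LIMS finding such as 'Necrosis, focal, minimal'
        into a base finding, a severity, and any remaining modifiers."""
        parts = [p.strip().upper() for p in raw.split(",")]
        finding, modifiers = parts[0], parts[1:]
        severity = next((m for m in modifiers if m in SEVERITIES), None)
        others = [m for m in modifiers if m != severity]
        return {"MISTRESC": finding, "MISEV": severity, "MODIFIERS": others}

    parse_mi_finding("Necrosis, focal, minimal")
    # -> {'MISTRESC': 'NECROSIS', 'MISEV': 'MINIMAL', 'MODIFIERS': ['FOCAL']}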

Approaches for Resolving Data Quality in SEND Datasets at CROs

Consistency, quality, and sufficiency in SEND datasets can be enforced in two basic ways: by inspecting the finished dataset, or by ensuring quality through the process that generates it. Controlling the process is less expensive over time and produces better-quality SEND datasets. Since consistency with the study report is a major concern, the process should be built around a single data repository, or source of truth, with a familiar terminology that is easily accessible and that can be repurposed for SEND datasets, tabulations for study reports, or interim study data reporting and monitoring. This is a function that simply cannot be satisfied by a GLP LIMS system.

By Inspection

The gold standard of a study is the study report generated by the Principal Investigator toxicologist and the study team. To compare SEND datasets against this standard, the datasets must be read by a qualified toxicologist, who can then re-constitute the trial sets into the same sponsor-defined groups with exclusions and derive the summary data (a sketch of this re-constitution step appears after the By Process subsection below). In practical terms, this is still a very time- and labor-intensive task (requiring additional FTEs) because of the number of groups, visit days, and findings in a typical study. However, this is precisely what a reviewer will be able to do using ToxVision, the FDA NIMS data visualization and analysis tool. This inspection can be performed by the pharma sponsor independently of the CRO. Using ToxVision, available from PointCross, for data QA by sponsors or their CROs will reduce the time and effort needed for such inspections.

By Process

CROs and labs have a well-established process for extracting collected data and preparing them in a form that allows the individual subject data to be tabulated and printed into the tables that become part of the study report. In addition, they can pivot and generate datasets for any specified sponsor-defined group, for any set of visit days, for a specific lab test or finding, so that group summaries can be calculated for the summary report sections. These data can be imported directly into a single source-of-truth reportable data repository, with a universally accessible data model (UDM) and a standard global terminology, and then automatically transformed into the preferred terminology of the toxicologist or into the SEND CT. The required tabulations and the CDISC SEND datasets can then be generated automatically from this single reportable data store. This approach is summarized in the figure below, which also includes a preceding step, discussed in the next section of this paper, to specify the study data packages required from each provider at the outset.
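As a concrete illustration of the re-constitution step described under By Inspection, here is a minimal sketch that collapses the granular SEND trial sets back onto sponsor-defined groups using the TX domain; it assumes the sponsor populated the standard SPGRPCD (sponsor-defined group code) trial set parameter, and the file names are illustrative:

    import pandas as pd

    tx = pd.read_sas("tx.xpt", format="xport", encoding="ascii")
    dm = pd.read_sas("dm.xpt", format="xport", encoding="ascii")

    # Build SETCD -> sponsor group code from TX rows where TXPARMCD is "SPGRPCD".
    set_to_group = (tx[tx["TXPARMCD"] == "SPGRPCD"]
                    .set_index("SETCD")["TXVAL"].to_dict())

    # Collapse subjects back onto the sponsor groups used in the study report,
    # ready for re-deriving the report's group summaries.
    dm["SPONSOR_GROUP"] = dm["SETCD"].map(set_to_group)
    print(dm.groupby("SPONSOR_GROUP")["USUBJID"].nunique())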

By Planning and Specifying the Study Data Plan

CROs and other data providers begin data collection after the study planning process is complete and the protocol has been accepted. During this early stage of the study, sponsors and CROs should decide what portions of the study will be included in the SEND dataset and use this information to make better-informed decisions about how the protocols are applied in their data collection systems. By doing this, sponsors and CROs can ensure that all contracted labs will collect all of the data needed for a complete SEND dataset. We recommend that a Nonclinical Study Data Specification (NSDS), consistent with the protocol, be generated by the Study Director's team at the start of the study. This ensures that all data collected during the study, including data added by protocol amendments, will be present in the final SEND dataset. PointCross provides a software solution, NSDS, to generate master study data specifications that define all components of the study, including DMPK/TK, histopathology, and other types of data. Using NSDS provides the lowest cost, lowest risk, and maximum quality assurance while making periodic interim monitoring of studies easier. With this approach, SEND datasets are completed contemporaneously with the study reports.

© 2015 PointCross Life Sciences, Inc.