Dataset-XML - A New CDISC Standard


Lex Jansen, Principal Software Developer @ SAS, CDISC XML Technologies Team
Single Day Event: CDISC Tools and Optimization, September 29, 2014, Cary, NC

Agenda
- Introduction
- What is Dataset-XML
- Dataset-XML and ODM
- Dataset-XML and Define-XML
- Dataset-XML more detail
- SAS Tools for Dataset-XML
- FDA Pilot

Introduction

Nov 5, 2012 FDA Study Data Exchange Standards Meeting Regulatory New Drug Review: Solutions for Study Data Exchange Standards http://www.fda.gov/drugs/developmentapprovalprocess/formssubmissionrequirements/electronicsubmissions/ucm332003.htm

Nov 5, 2012 FDA Study Data Exchange Standards Meeting
- Solicit input from industry, technology vendors, and other members of the public
- What are the advantages and disadvantages of current and emerging open, consensus-based standards for the exchange of regulated study data?
- Agenda based on a Federal Register Notice (FRN) with pre-meeting questions

Nov 5, 2012 FDA Study Data Exchange Standards Meeting: Background
The current study data exchange format supported by FDA is the ASCII-based SAS Transport (XPORT) version 5 file format. Although XPORT has been an exchange format for many years, it is not an extensible modern technology. Moreover, it is not supported and maintained by an open, consensus-based standards development organization. FDA would like to discuss the current and emerging open study data exchange standards that will support interoperability.

Nov 5, 2012 FDA Study Data Exchange Standards Meeting: Limitations of SAS Version 5 Transport (XPT) were discussed
Technical
- Data set and variable name length limitation (8)
- Data set and variable label length limitation (40)
- Character variable data length limitation (200)
- Limited data types (character, numeric)
- Very limited international character support (only ASCII)
Structural
- Two-dimensional flat data structure for hierarchical/multi-relational round data
- Lack of a robust information model
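
The technical limits listed above can be turned into a simple pre-flight check. The following is a minimal sketch, not part of the original presentation: the `Variable` class and the `xpt_v5_violations` function are hypothetical names introduced for illustration, and lengths are counted in characters, which equals bytes only for ASCII text.

```python
from dataclasses import dataclass

# Limits as listed on the slide for SAS Version 5 Transport (XPT).
MAX_NAME_LEN = 8
MAX_LABEL_LEN = 40
MAX_CHAR_LEN = 200


@dataclass
class Variable:
    """Hypothetical, minimal description of one data set variable."""
    name: str
    label: str
    values: list


def xpt_v5_violations(dataset_name, variables):
    """Return human-readable violations of the XPT v5 limits."""
    problems = []
    if len(dataset_name) > MAX_NAME_LEN:
        problems.append(f"data set name {dataset_name!r} exceeds {MAX_NAME_LEN} characters")
    for var in variables:
        if len(var.name) > MAX_NAME_LEN:
            problems.append(f"variable name {var.name!r} exceeds {MAX_NAME_LEN} characters")
        if len(var.label) > MAX_LABEL_LEN:
            problems.append(f"label of {var.name!r} exceeds {MAX_LABEL_LEN} characters")
        # Only character values are checked; numeric values are skipped.
        texts = [v for v in var.values if isinstance(v, str)]
        if any(len(v) > MAX_CHAR_LEN for v in texts):
            problems.append(f"variable {var.name!r} has a value longer than {MAX_CHAR_LEN} characters")
        if not all(v.isascii() for v in texts):
            problems.append(f"variable {var.name!r} contains non-ASCII characters")
    return problems


ok_var = Variable("AETERM", "Reported Term", ["Headache"])
bad_var = Variable("AELONGNAME", "x" * 41, ["caf\u00e9", "y" * 201])
for problem in xpt_v5_violations("ADVERSEEVENTS", [ok_var, bad_var]):
    print(problem)
```

A data set that passes this check can be written as XPT v5 without truncation; one that fails illustrates the kind of content the alternative formats discussed at the meeting were meant to accommodate.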

Nov 5, 2012 FDA Study Data Exchange Standards Meeting: Five options were presented at the meeting
1. SAS Transport v5 extensions (SAS Version 8 Transport format, available in SAS 9.3), which address the character size issues
2. CDISC Operational Data Model (ODM)
3. HL7 Version 3, including Clinical Document Architecture (CDA)
4. Semantic Web technologies: Resource Description Framework (RDF), Web Ontology Language (OWL)
5. Analytic Information Markup Language (AnIML)

What is Dataset-XML

What is Dataset-XML
- Alternative to the SAS Version 5 Transport (XPT) format for data sets
- Based on CDISC ODM and Define-XML for representation of SDTM, SEND, ADaM or legacy (non-CDISC) tabular data set structures
- Capability to support CDISC data submissions to the FDA
- Based on or aligned with Define-XML metadata
- Easy to transform to a data set for analysis (SAS, R, ...)

What is Dataset-XML: Benefits
- Open, non-proprietary standard without the field width or data set and variable naming restrictions of SAS V5 Transport files
- Supports representation of data relationships, metadata versions and audit trails (note: not all of these will be available in the first release)
- Harmonized with BRIDG and CDISC Controlled Terminology
- Data elements include references to metadata in Define-XML
- Straightforward implementation starting from SDTM data in SAS
- Supports the FDA goal of encouraging open source reviewer tool development
- Facilitates validation, since both data and metadata share the underlying technology
- Enables re-thinking some of the length restrictions in standards

What is Dataset-XML: Status
- The final specification for version 1.0 was released in April 2014
- Includes sample Dataset-XML files with associated Define-XML file and XML schema

Dataset-XML Tools
Various tools are under development to support:
- Validation
- Data browsing (similar to SAS Viewer)
- Conversion of SAS XPT files to Dataset-XML
- Conversion of SAS data sets to Dataset-XML
- Conversion of Dataset-XML to SAS data sets
- Conversion of Dataset-XML to R

Dataset-XML Tools http://wiki.cdisc.org/display/pub/cdisc+dataset-xml+resources

What is Dataset-XML: Data and Metadata
Data and Metadata in Submissions Today
- Data: SAS V5 XPT
- Metadata: Define-XML

What is Dataset-XML: Data and Metadata
Data and Metadata in Submissions Today
- Data: Dataset-XML
- Metadata: Define-XML
- Both are ODM-based standards

What is Dataset-XML: Data and Metadata
[Diagram: Relationship of Dataset-XML to other CDISC Standards. ODM is extended by Define-XML, which represents metadata, and by Dataset-XML, which represents data. The data is defined by the metadata and follows the SEND, SDTM and ADaM models and their implementation guides (SEND-IG, SDTM-IG, ADaM-IG).]

What is Dataset-XML: Data Transport
1. Convert SAS data sets to Dataset-XML
2. Send Dataset-XML
3. Receive Dataset-XML
4. Convert to SAS data sets or load into a data warehouse
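
The receiving side of this transport flow can be sketched in a few lines of Python. This is an illustrative sketch rather than one of the tools mentioned in the presentation: the `read_records` function and the sample OIDs are hypothetical, the element and attribute names (`ItemGroupData`, `ItemData`, `ItemOID`, `Value`) follow the Dataset-XML specification as recalled here, and Dataset-XML-specific attributes are omitted for brevity.

```python
import xml.etree.ElementTree as ET

ODM_NS = "http://www.cdisc.org/ns/odm/v1.3"  # ODM 1.3 namespace

# Hypothetical, heavily abbreviated Dataset-XML document with two DM records.
SAMPLE = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <ClinicalData StudyOID="STUDY1" MetaDataVersionOID="MDV1">
    <ItemGroupData ItemGroupOID="IG.DM">
      <ItemData ItemOID="IT.DM.USUBJID" Value="STUDY1-001"/>
      <ItemData ItemOID="IT.DM.SEX" Value="F"/>
    </ItemGroupData>
    <ItemGroupData ItemGroupOID="IG.DM">
      <ItemData ItemOID="IT.DM.USUBJID" Value="STUDY1-002"/>
    </ItemGroupData>
  </ClinicalData>
</ODM>"""


def read_records(xml_text):
    """Return a list of (ItemGroupOID, {ItemOID: Value}) tuples, one per record."""
    root = ET.fromstring(xml_text)
    records = []
    for group in root.iter(f"{{{ODM_NS}}}ItemGroupData"):
        row = {
            item.get("ItemOID"): item.get("Value")
            for item in group.findall(f"{{{ODM_NS}}}ItemData")
        }
        records.append((group.get("ItemGroupOID"), row))
    return records


for oid, row in read_records(SAMPLE):
    print(oid, row)
```

The resulting rows are keyed by OID rather than by variable name; the names and data types come from the accompanying Define-XML file, which is why the two files travel together.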

Dataset-XML and ODM

Dataset-XML and ODM
- Vendor-neutral XML Schema for exchange and archive of clinical trials metadata and data: snapshots, updates, archives
- In global production use since 2000, currently at v1.3.2
- Supports Part 11 compliance and FDA Guidance on Computerized Systems
- Includes vendor extension capability
- Human and machine readable

Dataset-XML and ODM
- Hierarchical metadata structure: study, protocol, events, forms, item groups, items
- Represents an entire clinical study: study metadata, administrative metadata, reference data, subject data, audit information
- Basis for Define-XML, the metadata description document used in submissions
- CDASH-ODM form metadata available
- SDM-XML represents the BRIDG protocol/study design model (structure, workflow, timing)
- CT-XML delivers NCI-EVS controlled terminology

Dataset-XML and ODM - extensions
[Diagram: ODM and its extensions: CRT-DDS v1, Define-XML v2, Study Design Model, Analysis Results, CT-XML, Dataset-XML]

Dataset-XML and ODM: Unique Object Identifiers
- In ODM, there are many instances where one object needs to reference another, both within the same file and across files within a series of ODM documents
- To accomplish this, the target element is given a unique identifier (its OID)
- All elements that need to reference that target element just use its OID
- The values used for OIDs can follow any convention, or can even be randomly generated
- The only allowed use of OIDs is to define an unambiguous link between a definition of an object and references to it
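
The OID mechanism described above can be illustrated with a short lookup. This is a sketch under stated assumptions: the Define-XML fragment is hypothetical and abbreviated, `build_oid_index` is a name invented for this example, and only the `OID` and `Name` attributes of `ItemGroupDef` and `ItemDef` are used.

```python
import xml.etree.ElementTree as ET

NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}  # ODM 1.3 namespace

# Hypothetical, abbreviated metadata: definitions carry OIDs, references use them.
DEFINE = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <Study OID="STUDY1">
    <MetaDataVersion OID="MDV1" Name="Example">
      <ItemGroupDef OID="IG.DM" Name="DM" Repeating="No">
        <ItemRef ItemOID="IT.DM.USUBJID" Mandatory="Yes"/>
        <ItemRef ItemOID="IT.DM.SEX" Mandatory="No"/>
      </ItemGroupDef>
      <ItemDef OID="IT.DM.USUBJID" Name="USUBJID" DataType="text"/>
      <ItemDef OID="IT.DM.SEX" Name="SEX" DataType="text"/>
    </MetaDataVersion>
  </Study>
</ODM>"""


def build_oid_index(define_xml):
    """Map every ItemGroupDef and ItemDef OID to its Name attribute."""
    root = ET.fromstring(define_xml)
    index = {}
    for tag in ("ItemGroupDef", "ItemDef"):
        for element in root.iterfind(f".//odm:{tag}", NS):
            index[element.get("OID")] = element.get("Name")
    return index


index = build_oid_index(DEFINE)
# A data record only carries OIDs; the index turns them back into names.
print(index["IG.DM"], index["IT.DM.USUBJID"], index["IT.DM.SEX"])
```

Because the OID values are arbitrary, a reader should never parse meaning out of a string such as `IT.DM.SEX`; the lookup is the only reliable way to get from a reference to the data set or variable name.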

Dataset-XML and Define-XML

Dataset-XML and Define-XML (data and metadata) SAS Data

Dataset-XML and Define-XML Data set name? Variable names?

Dataset-XML More Detail

Dataset-XML Subject Data Example

Dataset-XML: Fields not Populated
- Fields that are not populated do not have any <ItemData> elements
- The following examples are incorrect in Dataset-XML: [examples shown on the slide are not included in this transcript]
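
The rule above, that an unpopulated field gets no `<ItemData>` element at all, can be sketched as follows. This is an illustration, not the SAS implementation: `item_group_data` and the OIDs are hypothetical, and the ODM namespace and any Dataset-XML-specific attributes are left out to keep the example short.

```python
import xml.etree.ElementTree as ET


def item_group_data(item_group_oid, row):
    """Build one ItemGroupData element, skipping fields that are not populated."""
    group = ET.Element("ItemGroupData", {"ItemGroupOID": item_group_oid})
    for item_oid, value in row.items():
        if value is None or value == "":
            continue  # not populated: emit no ItemData element at all
        ET.SubElement(group, "ItemData", {"ItemOID": item_oid, "Value": str(value)})
    return group


# Hypothetical DM record in which RFENDTC is missing.
row = {"IT.DM.USUBJID": "STUDY1-001", "IT.DM.AGE": 34, "IT.DM.RFENDTC": None}
xml_text = ET.tostring(item_group_data("IG.DM", row), encoding="unicode")
print(xml_text)
```

Note that the check tests for `None` and the empty string only, so a legitimate numeric value of 0 is still written.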

Dataset-XML Non-Subject Data Example

Dataset-XML Supplemental Qualifiers

SAS Tools for Dataset-XML

SAS Tools for Dataset-XML: Available Now
- SAS Clinical Standards Toolkit (CST) 1.7
- SAS Clinical Data Integration (CDI) 2.6

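Dataset-XML SAS Tools
[Diagram: round trip between SAS data and Dataset-XML, with define.xml supplying the metadata]
- %datasetxml_write(): SAS data to Dataset-XML
- %xml_validate(): validation of the XML files
- %datasetxml_read(): Dataset-XML back to SAS data
- %cstutilcomparedatasets(): comparison of the original and the round-tripped SAS data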

Dataset-XML SAS Tools: %cstutilcomparedatasets()
Expected differences between the original and the round-tripped SAS data:
- Date- and time-related columns may get a different length, since they do not have a length defined in the Define-XML metadata
- Small differences in precision, around the machine precision, can be expected for numeric variables that represent real numbers
- Character data that contains leading or trailing spaces may lose them
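
A comparison that tolerates these expected differences might look like the sketch below. It is written in Python for illustration and is not the logic of the SAS macro: `values_match` and its default tolerance are assumptions chosen for this example.

```python
import math


def values_match(original, round_tripped, rel_tol=1e-12):
    """Compare two cell values while tolerating the expected round-trip differences."""
    if isinstance(original, float) and isinstance(round_tripped, float):
        # Real numbers may differ around the machine precision.
        return math.isclose(original, round_tripped, rel_tol=rel_tol)
    if isinstance(original, str) and isinstance(round_tripped, str):
        # Leading and trailing spaces may be lost in the round trip.
        return original.strip(" ") == round_tripped.strip(" ")
    return original == round_tripped


print(values_match(0.1 + 0.2, 0.3))     # tiny floating-point difference
print(values_match("  abc ", "abc"))    # only surrounding spaces differ
print(values_match("abc", "abd"))       # a real difference
```

Differences in column length for date and time variables are a metadata matter and would be checked separately from the cell values.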

FDA Pilot

Dataset-XML FDA Pilot https://www.federalregister.gov/articles/2013/11/27/2013-28391/transport-format-for-the-submission-of-regulatory-study-data-notice-of-pilot-project

Dataset-XML FDA Pilot
Objectives:
- Conduct an evaluation of the CDISC Dataset-XML standard as a solution to the challenges of SAS XPORT V5 transport
- Assess the technical capability of Dataset-XML to exchange and archive regulatory study data
- Assess the capability of Dataset-XML to transport the FDA-supported study data standards (SDTM, SEND, ADaM) specified in the Data Standards Catalog
High-level timetable:
- Dataset-XML submission: May/June 2014
- Conduct testing: July/August 2014
- Evaluate results, communicate findings: 4th quarter 2014

Dataset-XML FDA Pilot some challenges File sizes Character encoding

Dataset-XML FDA Pilot some challenges - File size
Data set   SAS (compress)   XPT         XML         ZIP
LB         301.51 MB        636.07 MB   1.75 GB     52.91 MB
QS         432.08 MB        776.73 MB   2.04 GB     53.68 MB
SUPPLB     338.98 MB        717.81 MB   1.79 GB     29.25 MB
SUPPQS     39.23 MB         37.28 MB    214.05 MB   3.73 MB
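
The size figures above are easier to compare as ratios. The short calculation below is an illustrative sketch: it assumes 1 GB = 1024 MB, which the slide does not state, and it relates the ZIP size to the XPT size because the slide does not say which file was zipped.

```python
MB_PER_GB = 1024  # assumption: the slide does not say whether a GB is 1000 or 1024 MB

# File sizes in MB, copied from the slide.
SIZES_MB = {
    "LB":     {"sas": 301.51, "xpt": 636.07, "xml": 1.75 * MB_PER_GB, "zip": 52.91},
    "QS":     {"sas": 432.08, "xpt": 776.73, "xml": 2.04 * MB_PER_GB, "zip": 53.68},
    "SUPPLB": {"sas": 338.98, "xpt": 717.81, "xml": 1.79 * MB_PER_GB, "zip": 29.25},
    "SUPPQS": {"sas": 39.23,  "xpt": 37.28,  "xml": 214.05,           "zip": 3.73},
}


def size_ratios(sizes):
    """Return, per data set, the XML and ZIP sizes relative to the XPT size."""
    return {
        name: {
            "xml_vs_xpt": row["xml"] / row["xpt"],
            "zip_vs_xpt": row["zip"] / row["xpt"],
        }
        for name, row in sizes.items()
    }


for name, r in size_ratios(SIZES_MB).items():
    print(f"{name}: XML is {r['xml_vs_xpt']:.1f}x the XPT size; "
          f"ZIP is {r['zip_vs_xpt']:.0%} of the XPT size")
```

In every row the uncompressed XML is several times larger than the XPT file, while the ZIP file is a small fraction of it.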

Dataset-XML FDA Pilot some challenges - Encoding
- An XML document starts with an optional XML declaration: <?xml version="1.0" encoding="utf-8"?>
- If present, the XML declaration is the very first statement of the XML document
- Leaving out the encoding means UTF-8 (unless a byte order mark indicates UTF-16)
- UTF-8 is a superset of ASCII; the first 128 characters of UTF-8 are identical to (7-bit) ASCII
- The first 256 Unicode code points are identical to ISO 8859-1, but UTF-8 encodes code points 128-255 as two bytes, so those byte values differ from ISO 8859-1

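Dataset-XML FDA Pilot some challenges - Encoding: ISO 8859-1
Windows Latin-1 (code page 1252) is not the same as ISO 8859-1 (ISO Latin-1). Code page 1252 assigns printable characters such as curly quotes to bytes in the range 0x80-0x9F, which are control characters in ISO 8859-1.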

Dataset-XML FDA Pilot some challenges - Encoding: Windows Latin-1
- Windows Latin-1 characters cause issues when the text gets encoded as UTF-8
- Be careful when copying from Excel or Word
- You may want to review the "AutoCorrect" options in Word, Excel and PowerPoint
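
The encoding pitfall described above can be reproduced with Python's standard codecs. This is a small demonstration added for illustration; the byte string is a made-up example of text containing the curly quotes that Word's AutoCorrect typically inserts.

```python
# Bytes as a Windows application might save them: curly quotes in code page 1252.
raw = b"\x93quoted\x94"

as_cp1252 = raw.decode("cp1252")   # correct: U+201C and U+201D curly quotes
as_latin1 = raw.decode("latin-1")  # wrong guess: U+0093 and U+0094 control characters

# The same bytes end up as different UTF-8 depending on the assumed source encoding.
good_utf8 = as_cp1252.encode("utf-8")
bad_utf8 = as_latin1.encode("utf-8")
print(good_utf8)
print(bad_utf8)

# Characters above 127 are one byte in ISO 8859-1 but two bytes in UTF-8.
e_acute = "\u00e9"
print(e_acute.encode("latin-1"), e_acute.encode("utf-8"))
```

Treating Windows text as ISO 8859-1 does not raise an error; it silently produces control characters in the XML, which is why the problem tends to surface only later, during validation or review.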

THANK YOU! QUESTIONS?