Introduction to Define.xml
Bay Area CDISC Implementation Network, 4 April 2008
John Brega, PharmaStat LLC

Presentation Objectives
1. Introduce the concept and purpose of define.xml
2. Introduce the published standard
3. Provide an overview of the structure, content and presentation of Define

Background
The official name of the standard is Case Report Tabulation Data Definition Specification (CRT-DDS), or define.xml for short.
The current version is v1.0.0, released as final in February 2005.
It is based on ODM v1.2 with extensions.

Background
The Metadata Submission Guidelines, an appendix to the Study Data Tabulation Model Implementation Guide (Metadata IG Appendix for short), was released for public comment in July 2007 along with SDTM IG v3.1.2.
Since March 2005, the Study Data Specifications guidance for eCTD submissions has specified define.xml as the method for documenting SDTM datasets.

Purpose and Uses
Designed to document SDTM datasets in an eCTD submission.
Also documents ADaM analysis data, but is not specified for that purpose by FDA.
Takes the place of Define.pdf from the original guidance for electronic submissions.
Has broader applications for transmitting study metadata.
Will become CDISC's method for publishing machine-readable metadata.

But what is it?
Well, there's this file called define.xml in a directory with a bunch of other stuff, and when you double-click on it, your web browser opens up and displays a bunch of tables that look like spreadsheets with links. When you click on links it shows other tables or opens up documents or SAS transport files. Sometimes things don't work quite right or they look funky...

The Define Cocktail
Your Define experience is the product of a combination of elements:
1. The file called define.xml is an XML document. This is just structured data and contains no formatting for presentation. The structure and content of this file are specified by the CRT-DDS standard.
2. The file called define1-0-0.xsl is an XML style sheet with several associated files. Together, this group of components is a program that define.xml references. When you open the XML document, your browser executes this program, which reads the XML document and displays it on your screen for convenient viewing.
3. Documents and datasets referenced by the XML data and placed in the same directory as define.xml and the style sheet. These may include an annotated CRF, SAP, SAS transport datasets and other supplemental documents. If these referenced items are not present, the links on your screen will go nowhere when you click them.
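
The third ingredient can be checked before anyone clicks a dead link. The following is a minimal Python sketch, not part of the standard or of any CDISC tool: it collects every xlink:href value in the XML and reports the ones with no matching file in the directory. The namespace URIs and the def:leaf element in the sample reflect my understanding of define.xml v1.0.0 and should be confirmed against the published schema; the function name and the trimmed sample document are hypothetical.

```python
import os
import xml.etree.ElementTree as ET

# Namespace URI assumed for XLink attributes in define.xml v1.0.0.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def missing_link_targets(define_text, directory):
    """Return the xlink:href values that have no matching file in `directory`."""
    root = ET.fromstring(define_text)
    missing = []
    for elem in root.iter():
        href = elem.get(XLINK_HREF)
        if href and not os.path.exists(os.path.join(directory, href)):
            missing.append(href)
    return missing


# Hypothetical, heavily trimmed sample; a real define.xml carries far more.
SAMPLE = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.2"
     xmlns:def="http://www.cdisc.org/ns/def/v1.0"
     xmlns:xlink="http://www.w3.org/1999/xlink">
  <Study OID="STUDY1">
    <MetaDataVersion OID="MDV1" Name="Example">
      <def:leaf ID="blankcrf" xlink:href="blankcrf.pdf">
        <def:title>Annotated CRF</def:title>
      </def:leaf>
      <def:leaf ID="dm" xlink:href="dm.xpt">
        <def:title>dm.xpt</def:title>
      </def:leaf>
    </MetaDataVersion>
  </Study>
</ODM>"""

if __name__ == "__main__":
    # Lists whichever sample targets are absent from the current directory.
    print(missing_link_targets(SAMPLE, "."))
```

The sketch treats every href as a plain relative file name, which matches the "same directory" arrangement described above; hrefs that carry page fragments or full URLs would need extra handling.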

Here's what it looks like
This is the example define.xml, style sheet, data and documents provided by CDISC in the Metadata IG Appendix download.

What we saw
1. Links to external documents and datasets, including blankcrf.pdf and SAS xpt files.
2. Index of datasets, each with a description, a link to its variable list, and a link to its .xpt file.
3. Index of variables for each dataset. For each variable, a description and links to its codelist, CRF page and computational method (if any). If the variable is xxTESTCD or QNAM, a link to its value list.
4. Index of values for xxTESTCD in Findings datasets and QNAM in supplemental qualifiers. These are presented as if they were variables.
5. Computational algorithms associated with variables or values.
6. Codelists for controlled terminologies.
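
The dataset, variable and codelist indexes are all views of the same underlying XML. As a rough illustration, the Python sketch below walks a tiny hypothetical document from datasets to variables to codelist values. The element and attribute names (ItemGroupDef, ItemRef, ItemDef, CodeListRef, CodeList, CodeListItem) and the namespace URI are what I understand ODM v1.2 to use and should be verified against the schema; the function name and sample content are made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for ODM v1.2, on which define.xml v1.0.0 is based.
ODM = "{http://www.cdisc.org/ns/odm/v1.2}"


def outline(define_text):
    """Map each dataset name to a list of (variable name, coded values)."""
    root = ET.fromstring(define_text)
    # Index variables and codelists by OID so references can be followed.
    items = {i.get("OID"): i for i in root.iter(ODM + "ItemDef")}
    codelists = {
        c.get("OID"): [v.get("CodedValue") for v in c.findall(ODM + "CodeListItem")]
        for c in root.iter(ODM + "CodeList")
    }
    result = {}
    for group in root.iter(ODM + "ItemGroupDef"):
        rows = []
        for ref in group.findall(ODM + "ItemRef"):
            item = items[ref.get("ItemOID")]
            cl_ref = item.find(ODM + "CodeListRef")
            values = codelists.get(cl_ref.get("CodeListOID"), []) if cl_ref is not None else []
            rows.append((item.get("Name"), values))
        result[group.get("Name")] = rows
    return result


# Hypothetical, heavily trimmed sample document.
SAMPLE = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.2">
  <Study OID="STUDY1">
    <MetaDataVersion OID="MDV1" Name="Example">
      <ItemGroupDef OID="IG.DM" Name="DM" Repeating="No">
        <ItemRef ItemOID="IT.DM.USUBJID" Mandatory="Yes"/>
        <ItemRef ItemOID="IT.DM.SEX" Mandatory="No"/>
      </ItemGroupDef>
      <ItemDef OID="IT.DM.USUBJID" Name="USUBJID" DataType="text" Length="20"/>
      <ItemDef OID="IT.DM.SEX" Name="SEX" DataType="text" Length="1">
        <CodeListRef CodeListOID="CL.SEX"/>
      </ItemDef>
      <CodeList OID="CL.SEX" Name="SEX" DataType="text">
        <CodeListItem CodedValue="F"/>
        <CodeListItem CodedValue="M"/>
      </CodeList>
    </MetaDataVersion>
  </Study>
</ODM>"""

if __name__ == "__main__":
    print(outline(SAMPLE))
```

The style sheet does essentially the same traversal, following OID references from one table to the next, which is why the on-screen links mirror these relationships.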

When things go sideways
In this demo the XML did not display correctly when I first opened it. My back button stopped working after clicking on a value list, and when I clicked on a page number it opened the aCRF but didn't go to the right page. Some columns and cells were empty that I thought should be filled.
How do I know where to fix things that don't work?

When things go sideways
Remember, the XML document is simply structured content. If it is conforming, then all presentation problems are a result of the style sheet interacting with your browser, directory content and OS environment.
Style sheets are fragile and easily broken by browser settings, missing pieces, old software versions, etc.
It's helpful to understand the interaction between structure, content, and presentation elements.
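
One quick way to separate a content problem from a presentation problem is to look at which style sheet the document actually asks for. The Python sketch below is illustrative only: it reads the xml-stylesheet processing instruction at the top of a hypothetical document and returns its href, which can then be compared with the files in the directory. The function name and the sample are assumptions made for this example.

```python
import re
from xml.dom import minidom


def stylesheet_hrefs(define_text):
    """Return the href of every xml-stylesheet processing instruction."""
    doc = minidom.parseString(define_text)
    hrefs = []
    for node in doc.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            # The instruction's data looks like: type="text/xsl" href="..."
            match = re.search(r'href\s*=\s*["\']([^"\']+)["\']', node.data)
            if match:
                hrefs.append(match.group(1))
    return hrefs


# Hypothetical sample: only the style sheet reference and an empty root element.
SAMPLE = """<?xml-stylesheet type="text/xsl" href="define1-0-0.xsl"?>
<ODM xmlns="http://www.cdisc.org/ns/odm/v1.2"/>"""

if __name__ == "__main__":
    print(stylesheet_hrefs(SAMPLE))
```

If the referenced style sheet is missing or misnamed, the browser has nothing to execute, and the XML itself may be perfectly fine.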

The content's the thing
We don't all have to know how to create XML or style sheets, but we do need to know what content is needed and how to assemble it.
Define should describe actual, not ideal, datasets.
Much of the content can be assembled from PROC CONTENTS and PROC FREQ outputs.
Some content is culled from the protocol or SAP.
Referential integrity among the elements is key. Style sheets have a low tolerance for things being almost right.
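
Referential integrity here means that every reference by OID points at something that is actually defined. A minimal Python sketch of such a check follows; it is not a validator and covers only three kinds of reference. The element and attribute names, including def:ValueListRef and def:ValueListDef, and the namespace URIs are my understanding of define.xml v1.0.0 and should be checked against the schema; the function name and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespaces assumed for ODM v1.2 and the define.xml v1.0 extension.
ODM = "{http://www.cdisc.org/ns/odm/v1.2}"
DEF = "{http://www.cdisc.org/ns/def/v1.0}"

# (referencing element, attribute holding the OID, element that must define it)
CHECKS = [
    (ODM + "ItemRef", "ItemOID", ODM + "ItemDef"),
    (ODM + "CodeListRef", "CodeListOID", ODM + "CodeList"),
    (DEF + "ValueListRef", "ValueListOID", DEF + "ValueListDef"),
]


def dangling_references(define_text):
    """Return (attribute, OID) pairs that point at nothing defined."""
    root = ET.fromstring(define_text)
    problems = []
    for ref_tag, ref_attr, target_tag in CHECKS:
        defined = {t.get("OID") for t in root.iter(target_tag)}
        for ref in root.iter(ref_tag):
            oid = ref.get(ref_attr)
            if oid not in defined:
                problems.append((ref_attr, oid))
    return problems


# Hypothetical sample with two deliberate mistakes: no ItemDef for AGE,
# and no CodeList for CL.SEX.
SAMPLE = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.2"
     xmlns:def="http://www.cdisc.org/ns/def/v1.0">
  <Study OID="STUDY1">
    <MetaDataVersion OID="MDV1" Name="Example">
      <ItemGroupDef OID="IG.DM" Name="DM" Repeating="No">
        <ItemRef ItemOID="IT.DM.SEX" Mandatory="No"/>
        <ItemRef ItemOID="IT.DM.AGE" Mandatory="No"/>
      </ItemGroupDef>
      <ItemDef OID="IT.DM.SEX" Name="SEX" DataType="text" Length="1">
        <CodeListRef CodeListOID="CL.SEX"/>
      </ItemDef>
    </MetaDataVersion>
  </Study>
</ODM>"""

if __name__ == "__main__":
    for attr, oid in dangling_references(SAMPLE):
        print(attr, oid)
```

A check like this catches the "almost right" cases, such as a variable listed in a dataset but never defined, before the style sheet renders them as empty cells or dead links.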

The future of Define
The standard is 3 years old and needs updating to support CDISC's expanding needs, such as Results metadata in ADaM 2.0.
ODM has moved to v1.3, which affects the future CRT-DDS standard, perhaps in unintended ways.
Caution: the define.xml and style sheet from CDISC Pilot 1, now downloadable, are highly experimental and have not been vetted by any committee. The Pilot Team cautions against following their example until CDISC deliberates.

The future of Define
Though the XML schema standard is very clear, there is no official standard style sheet and CDISC says it doesn't want to develop software.
There are examples, and FDA may be developing narrow expectations, but for now you can write your own style sheet to do whatever you want, and some companies do. This may become a problem for the FDA.

The future of Define
We'll get this all worked out over the next couple of years. In the meantime, the standard works and is part of the eCTD specification.
Let's stop eyeing the exits and deal with it!

Thank you!
Questions?
John Brega: JBrega@PharmaStat.com

Links to CDISC and FDA Resources
CDISC website: www.cdisc.org
Download Define.xml v1.0.0 standard: www.cdisc.org/models/def/v1.0/index.html
Download Metadata IG Appendix: www.cdisc.org/models/sdtm/v1.1/index.html
Download ODM v1.2.1: www.cdisc.org/models/odm/v1.2.1/index.html
eCTD specification of Define.xml as documentation of SDTM datasets: www.fda.gov/cder/regulatory/ersr/studydata.pdf