Introduction to define.xml
Dave Iberson-Hurst
27th May 2010
ESUG Webinar
Outline
- Introduction
- Purpose of define.xml
- XML
- How define works
- FAQ
- Q&A
Introduction
Purpose
- Describes
  - What is included within the data
  - Where the data came from
  - Derivations, code lists, annotated PDF, etc., to aid understanding
- Machine readable
- Human readable (after processing)
- To aid and inform the reviewer: unambiguous communication
Submission & eCTD
Revision 2, June 2008
http://www.fda.gov/forindustry/datastandards/studydatastandards/default.htm
XML

Dark Side of the Moon
<CDCollection>
  <CD TotalTime="45.02">
    <Artist>Pink Floyd</Artist>
    <Title>Dark Side of the Moon</Title>
    <Track Label="1a">Speak To Me</Track>
    <Track Label="1b">Breathe</Track>
    <Track Label="2">On the Run</Track>
    <Track Label="3">Time</Track>
    <Track Label="4">The Great Gig in the Sky</Track>
    <Track Label="5">Money</Track>
    <Track Label="6">Us and Them</Track>
    <Track Label="7">Any Colour You Like</Track>
    <Track Label="8">Brain Damage</Track>
    <Track Label="9">Eclipse</Track>
  </CD>
</CDCollection>
Dark Side of the Moon Structure
The same document, with its two building blocks labelled:
- Element: for example <Track Label="1a">Speak To Me</Track>
- Attribute: for example Label="3" on <Track Label="3">Time</Track>
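The element/attribute distinction can be seen by reading the CD example with a parser. This is a minimal sketch using Python's standard library; the variable names are illustrative and only the first three tracks are included to keep it short.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the CD example from the slide.
CD_XML = """<CDCollection>
  <CD TotalTime="45.02">
    <Artist>Pink Floyd</Artist>
    <Title>Dark Side of the Moon</Title>
    <Track Label="1a">Speak To Me</Track>
    <Track Label="1b">Breathe</Track>
    <Track Label="2">On the Run</Track>
  </CD>
</CDCollection>"""

root = ET.fromstring(CD_XML)
cd = root.find("CD")

# Element content is read through .text, attributes through .get().
title = cd.find("Title").text
total_time = cd.get("TotalTime")
tracks = [(track.get("Label"), track.text) for track in cd.findall("Track")]

print(title, total_time)
for label, name in tracks:
    print(label, name)
```

Elements carry the content (the track name); attributes qualify the element they sit on (the track label).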
XML Schemas in Simple Terms
- Defines elements, attributes, data types, etc., and their relationships
- Provides the specification for an XML document
- Enables validation of XML documents
Transformations
XSL: Extensible Stylesheet Language
- Used to transform an XML document
- Requires a tool known as an XSLT processor
- Focuses on presentation, while XML focuses on content and structure
[Diagram: XML Document + XSL Document (<?xml version="1.0"?> <xsl:stylesheet version="1.0"...) -> XSLT Processor -> New Document]
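The idea of "content in, presentation out" can be illustrated without an XSLT processor. The sketch below is not XSLT: it is a hand-written Python stand-in (the function name transform is hypothetical) that plays the role of the stylesheet plus processor for the CD example, assuming only the standard library.

```python
import xml.etree.ElementTree as ET
from html import escape

# Shortened copy of the CD example; this is the "XML Document" input.
SOURCE = """<CDCollection>
  <CD TotalTime="45.02">
    <Artist>Pink Floyd</Artist>
    <Title>Dark Side of the Moon</Title>
    <Track Label="1a">Speak To Me</Track>
    <Track Label="1b">Breathe</Track>
  </CD>
</CDCollection>"""


def transform(xml_text: str) -> str:
    """Hand-written stand-in for a stylesheet: XML content in, HTML presentation out."""
    cd = ET.fromstring(xml_text).find("CD")
    items = "".join(
        f"<li>{escape(track.get('Label'))}: {escape(track.text)}</li>"
        for track in cd.findall("Track")
    )
    return f"<h1>{escape(cd.find('Title').text)}</h1><ul>{items}</ul>"


# This is the "New Document" output.
page = transform(SOURCE)
print(page)
```

A real XSLT processor does the same job declaratively: the presentation rules live in the XSL document rather than in program code.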
How define.xml Works

Define Specification
http://www.cdisc.org/define-xml
Overall Structure
ODM
  Study
    GlobalVariables
    MetaDataVersion
      Links and Variable Level
      ItemGroupDef - Domains
      ItemDef - Variables
      CodeList - Code lists
Overall Structure
<ODM xmlns="http://www.cdisc.org/ns/odm/v1.2"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xmlns:xlink="http://www.w3.org/1999/xlink"
     xmlns:def="http://www.cdisc.org/ns/def/v1.0"
     xsi:schemaLocation="http://www.cdisc.org/ns/odm/v1.2 define1-0-0.xsd"
     FileOID="Study1234" ODMVersion="1.2" FileType="Snapshot"
     CreationDateTime="2004-07-28T12:34:13-06:00">
  <Study OID="1234">
    <GlobalVariables>
      <StudyName>1234</StudyName>
      <StudyDescription>1234 Data Definition</StudyDescription>
      <ProtocolName>1234</ProtocolName>
    </GlobalVariables>
    <MetaDataVersion OID="CDISC.SDTM.3.1.0"
                     Name="Study 1234, Data Definitions"
                     Description="Study 1234, Data Definitions"
                     def:DefineVersion="1.0.0"
                     def:StandardName="CDISC SDTM"
                     def:StandardVersion="3.1.0">
      ... All the content is here ...
    </MetaDataVersion>
  </Study>
</ODM>
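Because define.xml mixes the ODM namespace with the def: extension namespace, a program reading it has to be namespace-aware. A minimal sketch, assuming Python's standard library and a cut-down copy of the skeleton above; the helper def_attr is hypothetical, and the attribute casing (def:StandardName and so on) is my reading of Define-XML 1.0 and should be checked against the specification.

```python
import xml.etree.ElementTree as ET

NS = {
    "odm": "http://www.cdisc.org/ns/odm/v1.2",
    "def": "http://www.cdisc.org/ns/def/v1.0",
}

# Cut-down skeleton of the slide's example (attribute casing is an assumption).
DEFINE_XML = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.2"
     xmlns:def="http://www.cdisc.org/ns/def/v1.0"
     FileOID="Study1234" ODMVersion="1.2" FileType="Snapshot">
  <Study OID="1234">
    <GlobalVariables>
      <StudyName>1234</StudyName>
      <StudyDescription>1234 Data Definition</StudyDescription>
      <ProtocolName>1234</ProtocolName>
    </GlobalVariables>
    <MetaDataVersion OID="CDISC.SDTM.3.1.0" Name="Study 1234, Data Definitions"
                     def:DefineVersion="1.0.0" def:StandardName="CDISC SDTM"
                     def:StandardVersion="3.1.0"/>
  </Study>
</ODM>"""


def def_attr(element, name):
    """Read an attribute that lives in the def: extension namespace."""
    return element.get(f"{{{NS['def']}}}{name}")


root = ET.fromstring(DEFINE_XML)
study_name = root.find("odm:Study/odm:GlobalVariables/odm:StudyName", NS).text
mdv = root.find("odm:Study/odm:MetaDataVersion", NS)
standard = (def_attr(mdv, "StandardName"), def_attr(mdv, "StandardVersion"))

print(study_name, standard)
```

Plain ODM attributes such as OID are read with .get("OID"); only the def: attributes need the namespace-qualified form.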
Domain Meta Data
- Dataset Name: 2 character prefix
- Description: The description for the domain
- Location: Folder and filename
- Structure: Level of detail provided
- Purpose: Purpose
- Key Fields: Used to identify and index records
Domain Meta Data
<ItemGroupDef OID="DM" Name="DM" Repeating="No" IsReferenceData="No"
              Purpose="Tabulation"
              def:Label="Demographics"
              def:Structure="One record per event per subject"
              def:DomainKeys="STUDYID, USUBJID"
              def:Class="Special Purpose"
              def:ArchiveLocationID="Location.DM">
  <ItemRef ItemOID="STUDYID" OrderNumber="1" Mandatory="Yes" Role="Identifier"/>
  <ItemRef ItemOID="DOMAIN" OrderNumber="2" Mandatory="Yes" Role="Identifier"/>
  <ItemRef ItemOID="USUBJID" OrderNumber="3" Mandatory="Yes" Role="Identifier"/>
  ... More ItemRefs here ...
</ItemGroupDef>
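One use of the domain metadata is to list a domain's variables in order and confirm that every key field is actually referenced. A minimal sketch, assuming Python's standard library and a cut-down copy of the DM example; the def: attribute casing is assumed from Define-XML 1.0.

```python
import xml.etree.ElementTree as ET

ODM_NS = "http://www.cdisc.org/ns/odm/v1.2"
DEF_NS = "http://www.cdisc.org/ns/def/v1.0"

# Cut-down copy of the DM ItemGroupDef (attribute casing is an assumption).
FRAGMENT = f"""<ItemGroupDef xmlns="{ODM_NS}" xmlns:def="{DEF_NS}"
    OID="DM" Name="DM" Repeating="No" Purpose="Tabulation"
    def:Label="Demographics" def:DomainKeys="STUDYID, USUBJID">
  <ItemRef ItemOID="STUDYID" OrderNumber="1" Mandatory="Yes" Role="Identifier"/>
  <ItemRef ItemOID="DOMAIN" OrderNumber="2" Mandatory="Yes" Role="Identifier"/>
  <ItemRef ItemOID="USUBJID" OrderNumber="3" Mandatory="Yes" Role="Identifier"/>
</ItemGroupDef>"""

group = ET.fromstring(FRAGMENT)

# Key fields are a comma-separated list in the def:DomainKeys attribute.
keys = [key.strip() for key in group.get(f"{{{DEF_NS}}}DomainKeys").split(",")]

# Variables are the ItemRef children, ordered by OrderNumber.
refs = sorted(
    group.findall(f"{{{ODM_NS}}}ItemRef"),
    key=lambda ref: int(ref.get("OrderNumber")),
)
variables = [ref.get("ItemOID") for ref in refs]

# Any key that is not a referenced variable would be an inconsistency.
missing_keys = [key for key in keys if key not in variables]

print(variables, keys, missing_keys)
```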
Variable Meta Data
- Variable Name: 8 character name
- Variable Description: The description
- Type: Character String or Numeric
- Format: Identifies controlled terminology or presentation
- Origin: Indicator of variable origin (CRF or Derived)
- Role: How variable is used within a dataset (ID, Topic, Timing, Qualifier)
- Comments: Used by sponsor to assist reviewer in interpreting the data
- Label: Variable Label
- References: Computational Method, Code Lists & Value Lists
Variable Meta Data
<ItemDef OID="DOMAIN" Name="DOMAIN" DataType="text" Length="2"
         Origin="CRF Page" Comment="DOMAIN ABBREVIATION"
         def:Label="DOMAIN ABBREVIATION">
</ItemDef>
<ItemDef OID="STUDYID" Name="STUDYID" DataType="text" Length="8"
         Origin="CRF Page" Comment="Demographics CRF Page 4"
         def:Label="STUDY IDENTIFIER">
</ItemDef>
<ItemDef OID="SUBJID" Name="SUBJID" DataType="text" Length="60"
         Origin="CRF Page" Comment="Demographics CRF Page 4"
         def:Label="SUBJECT IDENTIFIER">
</ItemDef>
Variable Meta Data
<ItemDef OID="VS.VSTESTCD.FRAME" Name="FRAME" DataType="float" Length="8"
         SignificantDigits="1" Origin="CRF Page"
         Comment="Vital Signs CRF Page 4" def:Label="FRAME">
  <CodeListRef CodeListOID="FRAME"/>
</ItemDef>
<CodeList OID="FRAME" Name="FRAME" DataType="text">
  <CodeListItem CodedValue="S">
    <Decode><TranslatedText xml:lang="en">small</TranslatedText></Decode>
  </CodeListItem>
  <CodeListItem CodedValue="M">
    <Decode><TranslatedText xml:lang="en">medium</TranslatedText></Decode>
  </CodeListItem>
  <CodeListItem CodedValue="L">
    <Decode><TranslatedText xml:lang="en">large</TranslatedText></Decode>
  </CodeListItem>
  <CodeListItem CodedValue="XL">
    <Decode><TranslatedText xml:lang="en">extra large</TranslatedText></Decode>
  </CodeListItem>
</CodeList>
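The link from a variable to its code list is by OID: the ItemDef holds a CodeListRef, and the CodeList with the matching OID supplies the decodes. A minimal sketch of following that reference, assuming Python's standard library; the wrapper MetaDataVersion attributes and the helper q are illustrative only.

```python
import xml.etree.ElementTree as ET

ODM = "http://www.cdisc.org/ns/odm/v1.2"

# Cut-down copy of the FRAME example, wrapped so it parses as one document.
FRAGMENT = f"""<MetaDataVersion xmlns="{ODM}" OID="MDV" Name="Sketch">
  <ItemDef OID="VS.VSTESTCD.FRAME" Name="FRAME" DataType="float" Length="8">
    <CodeListRef CodeListOID="FRAME"/>
  </ItemDef>
  <CodeList OID="FRAME" Name="FRAME" DataType="text">
    <CodeListItem CodedValue="S">
      <Decode><TranslatedText xml:lang="en">small</TranslatedText></Decode>
    </CodeListItem>
    <CodeListItem CodedValue="M">
      <Decode><TranslatedText xml:lang="en">medium</TranslatedText></Decode>
    </CodeListItem>
    <CodeListItem CodedValue="L">
      <Decode><TranslatedText xml:lang="en">large</TranslatedText></Decode>
    </CodeListItem>
    <CodeListItem CodedValue="XL">
      <Decode><TranslatedText xml:lang="en">extra large</TranslatedText></Decode>
    </CodeListItem>
  </CodeList>
</MetaDataVersion>"""


def q(tag: str) -> str:
    """Qualify a tag name with the ODM namespace."""
    return f"{{{ODM}}}{tag}"


mdv = ET.fromstring(FRAGMENT)

# Index every code list by OID: {OID: {coded value: decode}}.
code_lists = {
    code_list.get("OID"): {
        item.get("CodedValue"): item.find(f"{q('Decode')}/{q('TranslatedText')}").text
        for item in code_list.findall(q("CodeListItem"))
    }
    for code_list in mdv.findall(q("CodeList"))
}

# Follow the reference from the variable to its code list.
item_def = mdv.find(q("ItemDef"))
code_list_oid = item_def.find(q("CodeListRef")).get("CodeListOID")
decodes = code_lists[code_list_oid]

print(item_def.get("Name"), decodes)
```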
Value Level Meta Data
- SDS Version 3 makes use of a "Tall Skinny" structure
- Findings domains consist of Test/Result pairs (xxTESTCD/xxORRES)
- Interpretation of the information in the results depends on the value of xxTESTCD
- Results for different tests may have different data types, formats, labels, etc.
Value Level Meta Data
<def:ValueListDef OID="ValueList.VS.VSTESTCD">
  <ItemRef ItemOID="VS.VSTESTCD.FRAME" OrderNumber="10" Mandatory="No"/>
  <ItemRef ItemOID="VS.VSTESTCD.HTRAW" OrderNumber="11" Mandatory="No"/>
  <ItemRef ItemOID="VS.VSTESTCD.WTRAW" OrderNumber="12" Mandatory="No"/>
  <ItemRef ItemOID="VS.VSTESTCD.MEANBP" OrderNumber="13" Mandatory="No"/>
</def:ValueListDef>
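Each ItemRef in a value list points, by OID, at an ItemDef that describes the result for one test code. A minimal sketch of resolving those references and spotting dangling ones, assuming Python's standard library. Only the FRAME ItemDef appears in this deck, so the other three are deliberately left unresolved here; the element casing def:ValueListDef is assumed from Define-XML 1.0.

```python
import xml.etree.ElementTree as ET

ODM = "http://www.cdisc.org/ns/odm/v1.2"
DEF = "http://www.cdisc.org/ns/def/v1.0"

# Value list from the slide plus the one ItemDef the deck shows (FRAME).
FRAGMENT = f"""<MetaDataVersion xmlns="{ODM}" xmlns:def="{DEF}" OID="MDV" Name="Sketch">
  <def:ValueListDef OID="ValueList.VS.VSTESTCD">
    <ItemRef ItemOID="VS.VSTESTCD.FRAME" OrderNumber="10" Mandatory="No"/>
    <ItemRef ItemOID="VS.VSTESTCD.HTRAW" OrderNumber="11" Mandatory="No"/>
    <ItemRef ItemOID="VS.VSTESTCD.WTRAW" OrderNumber="12" Mandatory="No"/>
    <ItemRef ItemOID="VS.VSTESTCD.MEANBP" OrderNumber="13" Mandatory="No"/>
  </def:ValueListDef>
  <ItemDef OID="VS.VSTESTCD.FRAME" Name="FRAME" DataType="float" Length="8"/>
</MetaDataVersion>"""

mdv = ET.fromstring(FRAGMENT)

# Index the ItemDefs by OID so references can be looked up.
item_defs = {item.get("OID"): item for item in mdv.findall(f"{{{ODM}}}ItemDef")}

value_list = mdv.find(f"{{{DEF}}}ValueListDef")
ref_oids = [ref.get("ItemOID") for ref in value_list.findall(f"{{{ODM}}}ItemRef")]

# Split references into those with a matching ItemDef and those without.
resolved = {oid: item_defs[oid].get("DataType") for oid in ref_oids if oid in item_defs}
unresolved = [oid for oid in ref_oids if oid not in item_defs]

print(resolved)
print(unresolved)
```

In a complete define.xml the unresolved list would be empty; a non-empty list is the kind of problem a define.xml checker reports.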
Additional Information
- Annotated CRF
  - Link to file containing annotated CRF
- See the draft Meta Data Guidelines at http://www.cdisc.org/msg-draft
Annotated CRF
<def:AnnotatedCRF>
  <def:DocumentRef leafID="blankcrf"/>
</def:AnnotatedCRF>
<def:leaf ID="blankcrf" xlink:href="blankcrf.pdf">
  <def:title>Annotated Case Report Form</def:title>
</def:leaf>
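The annotated CRF is located in two steps: def:DocumentRef names a leaf by ID, and the matching def:leaf carries the file link and title. A minimal sketch of that lookup, assuming Python's standard library; the wrapper element and the def: element casing are assumptions based on Define-XML 1.0.

```python
import xml.etree.ElementTree as ET

DEF = "http://www.cdisc.org/ns/def/v1.0"
XLINK = "http://www.w3.org/1999/xlink"

# The slide's fragment, wrapped so it parses as one document.
FRAGMENT = f"""<MetaDataVersion xmlns="http://www.cdisc.org/ns/odm/v1.2"
    xmlns:def="{DEF}" xmlns:xlink="{XLINK}" OID="MDV" Name="Sketch">
  <def:AnnotatedCRF>
    <def:DocumentRef leafID="blankcrf"/>
  </def:AnnotatedCRF>
  <def:leaf ID="blankcrf" xlink:href="blankcrf.pdf">
    <def:title>Annotated Case Report Form</def:title>
  </def:leaf>
</MetaDataVersion>"""

mdv = ET.fromstring(FRAGMENT)

# Index leaves by ID, then follow the DocumentRef to the matching leaf.
leaves = {leaf.get("ID"): leaf for leaf in mdv.findall(f"{{{DEF}}}leaf")}
doc_ref = mdv.find(f"{{{DEF}}}AnnotatedCRF/{{{DEF}}}DocumentRef")
leaf = leaves[doc_ref.get("leafID")]

href = leaf.get(f"{{{XLINK}}}href")
title = leaf.find(f"{{{DEF}}}title").text

print(title, href)
```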
Examples
http://www.cdisc.org/define-xml

FAQ
Define is an ODM Extension?
- Define.xml is built from the components used by CDISC to build the Operational Data Model (ODM)
- The ODM is used to transport Case Report Form (CRF) data
- Define.xml is used to transport tabulation metadata
- They are quite different use cases
Same Components, Different Use
Define is Machine Readable?
- Define.xml is built using XML technology
- A computer can consume and process (and understand) the information within the define.xml file
Define is Human Readable?
- As we said, define.xml is built using XML technology
- A computer can consume and process (and understand) the information within the define.xml file
- But using style sheet technology we can also transform the XML into a form that humans can understand
What tools do I use with define.xml?
http://www.cdisc.org/define-xml
Slide courtesy of Formedix Limited
Tools
- OpenCDISC Validator: http://www.opencdisc.org/
- XML4Pharma CDISC Define.xml Checker: http://www.xml4pharma.com/cdisc_define_checker/index.html
- SAS tool set: http://www.sas.com/industry/pharma/cdisc/
- Formedix Origin Submission Modeller: http://www.formedix.com/cms/index.php?option=com_content&task=view&id=28&itemid=53
- Entimo entimice DARE: http://www.entimo.com/solution/entimice_dare.html
- Octagon Checkpoint: http://www.octagonresearch.com/checkpoint-data-validation.html
CDISC Plans
- New release of define in Q3/Q4 2010
- Support ADaM metadata
- Support SDTM V3.1.2
Summary

Purpose
- Describes
  - What is included within the data
  - Where the data came from
  - Derivations, code lists, annotated PDF, etc., to aid understanding
- Machine readable
- Human readable (after processing)
- To aid and inform the reviewer: unambiguous communication
Q&A
dave.iberson-hurst@assero.co.uk
dibersonhurst@cdisc.org