Mapping and Terminology. English Speaking CDISC User Group Meeting on 13-Mar-08


Statement of the Problem
- GSK has a large drug portfolio, therefore there are many drug project teams
- GSK has standards:
  - 8,200 variables for core data and validated rating scales, with 860 codelists
  - 12,000 other variables (therapeutic standards and study-specific variables), with another 1,500 codelists
- For SDTM submissions in the near term, all data will need to be mapped
- Later submissions will be a combination of mapped studies and SDTM-designed studies

What Do We Mean by Standards?
- A combination of definition and implementation
- Definition:
  - The data items and associated codelists (the variable catalogue)
  - The associations between data items, i.e. what set of items constitutes a single piece of information, e.g. "severe headache for subject 1 on 13-Mar-08 with action = drug stopped" (the data records)
  - The data checks we want
  - The summaries/displays we want to produce
- Implementation:
  - The paper CRFs/eCRFs
  - The dataset specifications
  - The programming to create checks, datasets, displays

Impact of CDISC Implementation
- Let's pick a point in time
- Implementation doesn't mean changing what we want to collect or what we want to report
- But it does mean changing our datasets, our software, our processes and possibly our paper CRFs/eCRFs
- And we may need to change some of our codelists (more on these later)

Our Mapping
- Our expectation is that GSK will always have to do some mapping
- We expect the experience gained from mapping our standards to help us design studies with SDTM in mind
- This will be true for most companies with in-house standards

One Hand Grenade
- We have 2,300 codelists
- CDISC has developed around 50 in two years
- Even if this rate is accelerated by orders of magnitude, all companies will have to determine how to handle this new terminology as it comes on-stream over the years (particularly mandatory terminology)

General Rules for Starting Mapping
- Find a place for all our variables
- If we don't capture data, don't generate it
- Map from GSK standard to SDTM, not the other direction
- Don't map anything purely derived for analysis purposes
- If there is no logical place for a variable, assume SUPPQUAL
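
The rules above amount to a small decision procedure. The sketch below is one hedged way to express it in Python; the `SourceVariable` class, its field names and the `route` function are hypothetical illustrations, not part of the GSK template.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceVariable:
    """Hypothetical description of one in-house variable."""
    name: str
    collected: bool                     # was the value actually captured?
    analysis_only: bool                 # purely derived for analysis?
    sdtm_target: Optional[str] = None   # agreed SDTM variable, if any


def route(var: SourceVariable) -> str:
    """Return where a source variable should go under the rules above."""
    if not var.collected:
        return "EXCLUDE"        # if we don't capture data, don't generate it
    if var.analysis_only:
        return "EXCLUDE"        # analysis derivations are not mapped
    if var.sdtm_target:
        return var.sdtm_target  # a logical home exists in SDTM
    return "SUPPQUAL"           # no logical place: assume SUPPQUAL


# Example variables (names are illustrative only).
for v in (
    SourceVariable("SYSBP", collected=True, analysis_only=False, sdtm_target="VSORRES"),
    SourceVariable("VSQUAL", collected=True, analysis_only=False),
    SourceVariable("CHGBASE", collected=False, analysis_only=True),
):
    print(v.name, "->", route(v))
```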

Our Mapping Experiences
- When our variables have a simple match to SDTM variables (which might include a transposition):
  - Easy, just apply algorithms
  - All about how to do these mappings efficiently and with no errors
  - Can be done by anyone
- When our variables don't have a simple match, particularly when a choice is involved:
  - Hard, risk of multiple and inconsistent approaches
  - Dependent on guidelines to eliminate errors and inconsistencies
  - Need the right people to interrogate the data

Our Approach
- Excel based (everyone has access, familiar software)
- Spreadsheet automatically pulls in our standards metadata, which is used in dropdowns to avoid typos
- Minimise the amount of manual effort
- Make it as easy as possible
- Staged approach: a structured template guiding the mapper through the process
- Don't over-complicate the process to fit every eventuality
- Team effort rather than purely individual
- Committee to address issues that crop up

Tabs in the Spreadsheet
- GSK variable catalogue
- The SDTM domain
- Our mapping sheet
- SUPPQUAL variables
- An easy way of looking at our GSK codelists
- Value level metadata
Tabs have been placed in the order in which they are to be completed

Key Feature of the Mapping Sheet

SDTM Variable Name | SDTM Variable | SDTM+ Var Desc | Data Type | Include in SDTM+ dataset | Source Type | Source Variable Name | Source Variable Description | Default Value | Req'd
STUDYID | Y | Unique identifier for the study | Char | Y | | | | | Req
STUDYID | N | Unique identifier for the study | Text | Y | SourceVariable | STUDYID | Unique identifier for the study | | Y
DOMAIN | Y | Domain abbreviation | Char | Y | | | | | Req
DOMAIN | N | Domain abbreviation | Text | Y | DefaultValue | | | DS | Y
USUBJID | Y | Unique subject identifier | Char | Y | | | | | Req
DOMAIN | N | Unique subject identifier | Text | Y | SourceVariable | USUBJID | Unique subject ID | | N
USUBJID | N | Unique identifier for the study | Text | Y | SourceVariable | STUDYID | Unique identifier for the study | | N
USUBJID | N | Subject identifier | Fixed | Y | SourceVariable | SUBJID | Subject ID | | N
DSSEQ | Y | Sequence number | Num | | | | | | Req
DSSEQ | N | Start value for generated sequence number | Fixed | | SourceVariable | | | | Y
DSDTC | Y | Date/time of assessment | Char | | | | | | Exp
DSDTC | N | Datetime of assessment | Datetime | | SourceVariable | | | | N
DSDTC | N | Actual date of assessment | Date | | SourceVariable | | | | N
DSDTC | N | Actual time of assessment | Time | | SourceVariable | | | | N
DSDTC | N | Actual date of assessment (char dup) | Date | | SourceVariable | | | | N
DSDUR | Y | Duration | Char | | | | | |
DSDUR | N | Duration in years | Float | | SourceVariable | | | | N
DSDUR | N | Duration in months | Fixed | | SourceVariable | | | | N
DSDUR | N | Duration in weeks | Fixed | | SourceVariable | | | | N
DSDUR | N | Duration in days | Fixed | | SourceVariable | | | | N
DSDUR | N | Duration in hours | Fixed | | SourceVariable | | | | N
DSDUR | N | Duration in minutes | Fixed | | SourceVariable | | | | N
DSDUR | N | Duration in seconds | Fixed | | SourceVariable | | | | N
DSDUR | N | Duration | Fixed | | SourceVariable | | | | N
DSDUR | N | Duration units | Text | | SourceVariable | | | | N
DSDUR | N | Duration code | Text | | SourceVariable | | | | N
DSDUR | N | Duration decode | Text | | SourceVariable | | | | N

- A home for all the variables that map easily, even if there are multiple variables that go to make just one SDTM variable
- No need for mappers to add rows!
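
The mapping sheet shows several source variables feeding one SDTM variable (for example, separate date and time values feeding DSDTC, or duration components feeding DSDUR). The following Python sketch illustrates that many-to-one idea under stated assumptions: the function names, the hyphen-joined USUBJID and the source column names `DSDT`, `DSTM`, `DURD` and `DURH` are all hypothetical, and only a subset of the duration components is handled.

```python
from datetime import date, time
from typing import Optional


def make_usubjid(studyid: str, subjid: int) -> str:
    # Assumption: study and subject identifiers are joined with a hyphen.
    return f"{studyid}-{subjid}"


def make_dtc(d: Optional[date], t: Optional[time] = None) -> str:
    """Combine separate date and time values into one ISO 8601 string."""
    if d is None:
        return ""
    if t is None:
        return d.isoformat()
    return f"{d.isoformat()}T{t.isoformat(timespec='minutes')}"


def make_dur(years: int = 0, months: int = 0, days: int = 0,
             hours: int = 0, minutes: int = 0) -> str:
    """Build an ISO 8601 duration from whichever components are non-zero."""
    date_part = "".join(
        f"{n}{unit}" for n, unit in ((years, "Y"), (months, "M"), (days, "D")) if n
    )
    time_part = "".join(
        f"{n}{unit}" for n, unit in ((hours, "H"), (minutes, "M")) if n
    )
    if not date_part and not time_part:
        return ""
    return "P" + date_part + ("T" + time_part if time_part else "")


# Hypothetical source record; the column names are illustrative only.
source = {"STUDYID": "ABC123", "SUBJID": 1,
          "DSDT": date(2008, 3, 13), "DSTM": time(9, 30),
          "DURD": 3, "DURH": 4}

sdtm_row = {
    "STUDYID": source["STUDYID"],
    "DOMAIN": "DS",  # default value, as on the sheet
    "USUBJID": make_usubjid(source["STUDYID"], source["SUBJID"]),
    "DSDTC": make_dtc(source["DSDT"], source["DSTM"]),
    "DSDUR": make_dur(days=source["DURD"], hours=source["DURH"]),
}
print(sdtm_row)
```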

Key Feature of the Value Level Metadata Sheet

SDTM Value | TESTCD | SDTM+ Description | Data Type | Include in SDTM+ dataset | Source Type | Source Variable Name | Variable Description | Default Value
Y | SYSBP | Systolic blood pressure | | Y | | | |
N | SYSBP | Original numeric result | Float | Y | SourceVariable | SYSBP | Systolic blood pressure (mmHg) |
N | SYSBP | Original units decode | Text | | | | |
N | SYSBP | Original units code | Text | | | | |
N | SYSBP | Original units (intuitive code or free text) | Text | Y | DefaultValue | | | MMHG
N | SYSBP | Original result score code | Fixed | | | | |
N | SYSBP | Original result code | Text | | | | |
N | SYSBP | Original result (decode) | Text | | | | |
N | SYSBP | Original result specify | Text | | | | |
N | SYSBP | Original result text | Text | | | | |
N | SYSBP | Original result date | Date | | | | |
N | SYSBP | Original result date (char dup) | Text | | | | |
N | SYSBP | Original result time | Time | | | | |
N | SYSBP | Original result datetime | Datetime | | | | |
N | SYSBP | Assessor code | Text | | | | |
N | SYSBP | Assessor decode | Text | | | | |
N | SYSBP | Location used for the measurement decode | Text | | | | |
N | SYSBP | Location used for the measurement code | Text | | | | |
N | SYSBP | Method of test or examination code | Text | | | | |
N | SYSBP | Method of test or examination decode | Text | | | | |
N | SYSBP | Severity decode | Text | | | | |
N | SYSBP | Toxicity grade decode | Text | | | | |
N | SYSBP | Subject position code | Text | Y | SourceVariable | VSPOSCD | Subject position code |
N | SYSBP | Subject position decode | Text | Y | SourceVariable | VSPOS | Subject position |
N | SYSBP | Baseline flag | Text | | | | |

- This sheet handles the mapping of topic variables (parameters, questions etc.) and results when the source dataset is horizontal/non-normalised
- Also need to handle provision of value level metadata when the source dataset is normalised
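
To make the horizontal-to-vertical step concrete, here is a minimal Python sketch that turns one wide vitals record into one SDTM-style row per test, driven by a metadata dictionary loosely modelled on the sheet. The dictionary layout and the `to_vertical` function are assumptions for illustration, and the `DIABP` entry is a hypothetical second test not taken from the slides.

```python
from typing import Any, Dict, List

# One entry per TESTCD; keys are hypothetical, values echo the sheet.
VALUE_LEVEL_METADATA: Dict[str, Dict[str, str]] = {
    "SYSBP": {
        "test": "Systolic blood pressure",
        "result_var": "SYSBP",    # source column holding the numeric result
        "default_unit": "MMHG",   # default value for original units
        "position_var": "VSPOS",  # source column for subject position
    },
    # Hypothetical second test, added only to show the transposition.
    "DIABP": {
        "test": "Diastolic blood pressure",
        "result_var": "DIABP",
        "default_unit": "MMHG",
        "position_var": "VSPOS",
    },
}


def to_vertical(record: Dict[str, Any],
                metadata: Dict[str, Dict[str, str]]) -> List[Dict[str, Any]]:
    """Emit one row per test that has a collected result in the wide record."""
    rows = []
    for testcd, meta in metadata.items():
        result = record.get(meta["result_var"])
        if result is None:
            continue  # nothing collected, so nothing is generated
        rows.append({
            "USUBJID": record["USUBJID"],
            "VSTESTCD": testcd,
            "VSTEST": meta["test"],
            "VSORRES": str(result),
            "VSORRESU": meta["default_unit"],
            "VSPOS": record.get(meta["position_var"], ""),
        })
    return rows


wide = {"USUBJID": "ABC123-1", "SYSBP": 120, "DIABP": 80, "VSPOS": "SITTING"}
for row in to_vertical(wide, VALUE_LEVEL_METADATA):
    print(row)
```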

Key Feature of the SUPPQUAL Sheet

SDTM Variable Name | SDTM Variable | SDTM+ Var Desc | Data Type | Include in SDTM+ dataset | Source Type | Source Variable Name | Source Variable Description | Default Value | Req'd
STUDYID | Y | Unique identifier for the study | Char | Y | | | | | Req
STUDYID | N | Unique identifier for the study | Text | Y | SourceVariable | STUDYID | Unique identifier for the study | | Y
RDOMAIN | Y | Related domain abbreviation | Char | Y | | | | | Exp
RDOMAIN | N | Related domain abbreviation | Text | Y | DefaultValue | | | VS | Y
USUBJID | Y | Unique subject identifier | Char | Y | | | | | Req
RDOMAIN | N | Unique subject identifier | Text | Y | SourceVariable | USUBJID | Unique subject ID | | N
USUBJID | N | Unique identifier for the study | Text | Y | SourceVariable | STUDYID | Unique identifier for the study | | N
USUBJID | N | Subject identifier | Fixed | Y | SourceVariable | SUBJID | Subject ID | | N
IDVAR | Y | Identifying variable | Char | Y | | | | | Exp
IDVAR | N | Identifying variable | Text | Y | ValueLevel | | | | Y
IDVARVAL | Y | Identifying variable value | Char | Y | | | | | Exp
IDVARVAL | N | Identifying variable value | Text | Y | ValueLevel | | | | Y
QNAM | Y | Qualifier variable name | Char | Y | | | | | Req
QNAM | N | Qualifier variable name | Text | Y | ValueLevel | | | | Y
QLABEL | Y | Qualifier variable label | Char | Y | | | | | Req
QLABEL | N | Qualifier variable label | Text | Y | ValueLevel | | | | Y
QVAL | Y | Data value | Char | Y | | | | | Req
QVAL | N | Numeric data value | Float | | ValueLevel | | | | N
QVAL | N | Text data value | Text | | ValueLevel | | | | N
QVAL | N | Date data value | Date | | ValueLevel | | | | N
QVAL | N | Date data value (char dup) | Text | | ValueLevel | | | | N
QVAL | N | Time data value | Time | | ValueLevel | | | | N
QVAL | N | Datetime data value | Datetime | | ValueLevel | | | | N
QVAL | N | Data value code | Text | | ValueLevel | | | | N
QVAL | N | Data value score code | Fixed | | ValueLevel | | | | N
QVAL | N | Data value decode | Text | | ValueLevel | | | | N
QVAL | N | Data value specify | Text | | ValueLevel | | | | N
QORIG | Y | Origin | Char | | | | | | Req
QORIG | N | Origin | Text | | ValueLevel | | | | Y
QEVAL | Y | Evaluator | Char | | | | | | Exp
QEVAL | N | Evaluator text | Text | | ValueLevel | | | | N
QEVAL | N | Evaluator code | Text | | ValueLevel | | | | N
QEVAL | N | Evaluator decode | Text | | ValueLevel | | | | N
QEVAL | N | Evaluator specify | Text | | ValueLevel | | | | N

This sheet contains all the SUPPQUAL variables and is populated according to the information entered on the SUPPQUAL Metadata sheet

The easy interface to enter SUPPQUAL information

SDTM Qualifier | QNAM | SDTM+ Description | Data Type | Include in SDTM+ | Source Type | Source Variable Name | Variable Description | Default Value
Y | VSQUAL | Reading qualifier | | Y | | | |
N | VSQUAL | Identifying variable | Text | Y | DefaultValue | | | VSSEQ
N | VSQUAL | Identifying variable value | Text | Y | SDTMVariable | VSSEQ | Sequence number |
N | VSQUAL | Numeric data value | Float | | | | |
N | VSQUAL | Text data value | Text | | | | |
N | VSQUAL | Date data value | Date | | | | |
N | VSQUAL | Date data value (char dup) | Text | | | | |
N | VSQUAL | Time data value | Time | | | | |
N | VSQUAL | Datetime data value | Datetime | | | | |
N | VSQUAL | Data value code | Text | Y | SourceVariable | VSQUALCD | Reading qualifier code |
N | VSQUAL | Data value score code | Fixed | | | | |
N | VSQUAL | Data value decode | Text | Y | SourceVariable | VSQUAL | Reading qualifier |
N | VSQUAL | Data value specify | Text | | | | |
N | VSQUAL | Evaluator text | Text | | | | |
N | VSQUAL | Evaluator code | Text | | | | |
N | VSQUAL | Evaluator decode | Text | | | | |
N | VSQUAL | Evaluator specify | Text | | | | |
N | VSQUAL | Origin | Text | Y | DefaultValue | | | CRF

All the information needed for each SUPPQUAL variable is entered through this sheet (no limitation on the number of SUPPQUAL records)
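
As a rough illustration of how one entry on this sheet could drive record creation, the Python sketch below builds a SUPPQUAL-style record for the VSQUAL qualifier from a parent vitals row. The `spec` dictionary layout and the `build_suppqual` function are hypothetical; only the values (RDOMAIN of VS, IDVAR of VSSEQ, origin CRF, source variable VSQUAL) come from the sheets.

```python
from typing import Any, Dict, Optional

# Hypothetical machine-readable form of one entry on the sheet.
VSQUAL_SPEC = {
    "rdomain": "VS",
    "idvar": "VSSEQ",           # default value for the identifying variable
    "qnam": "VSQUAL",
    "qlabel": "Reading qualifier",
    "source_var": "VSQUAL",     # decode taken from the source dataset
    "qorig": "CRF",
}


def build_suppqual(parent: Dict[str, Any],
                   spec: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Return one SUPPQUAL record, or None if the qualifier was not collected."""
    value = parent.get(spec["source_var"])
    if value is None or value == "":
        return None  # no collected value, so no record is generated
    return {
        "STUDYID": parent["STUDYID"],
        "RDOMAIN": spec["rdomain"],
        "USUBJID": parent["USUBJID"],
        "IDVAR": spec["idvar"],
        "IDVARVAL": str(parent[spec["idvar"]]),  # links back to the parent row
        "QNAM": spec["qnam"],
        "QLABEL": spec["qlabel"],
        "QVAL": str(value),
        "QORIG": spec["qorig"],
    }


# Illustrative parent row; the values are made up.
parent_row = {"STUDYID": "ABC123", "USUBJID": "ABC123-1",
              "VSSEQ": 7, "VSQUAL": "Repeat reading"}
print(build_suppqual(parent_row, VSQUAL_SPEC))
```

Returning None for an empty qualifier keeps the sketch in line with the rule of not generating data that was not captured.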

Learnings
- Knowledge and understanding of the context of the data is critical for successful mapping:
  - at the dataset level, e.g. vitals
  - at the study level (it is not always obvious what is in a variable)
- It is possible to define algorithms to automate the physical mapping
- Multiple options for mapping a variable (wiggle-room)
- Not everything goes into SUPPQUAL (SUPPQUAL should be a last resort!)
- Pre-processing is sometimes needed to make things mappable
- Some of our codelists contain multiple sets of information; in other cases we have multiple codelists covering a single set of information. These take extra effort.
- We've created principles and adapted the template to reflect our learnings from mapping the 50 or so core standards
- On rare occasions, it isn't possible to use the template, e.g. our genetics sample information (all the data needs to be pre-processed)
- It takes time to agree on a home for tricky variables