How Macro Design and Program Structure Impacts GPP (Good Programming Practice) in TLF Coding


Galyna Repetatska, Kyiv, Ukraine
PhUSE 2016, Barcelona

Agenda

- Number of operations for SAS processor: between multiplicative and additive
- Tools and factors helpful to minimize programming and data dependency
- Keys to universal open-code programming
- TLF-conventional variables #1: groups, categories and analysis data
- Alignment with GPP
- TLF-conventional variables #2: control decimal alignment
- One-Proc calculation with BY and OUTPUT for Adverse Events by Severity
- Different types of analysis for Demographics and Baseline Characteristics
- Useful tricks of PROC SQL to generalize study-specific programming
- From open code to macro design

Proprietary & Confidential. 2016 Chiltern

Number of operations for SAS processor: between multiplicative and additive

- Calculating each block individually gives the maximum number of program steps: $N_{operations} \sim N_a \cdot N_{var} \cdot N_{grp} \cdot N_{par} \cdot N_{tpt}$
- The BDS structure helps to reduce the program (but not yet for pooled categories): $N_{operations} \sim N_a \cdot N_{var} \cdot N_{grp}$
- A reasonable minimum of operations (DATA/PROC steps used to produce the result) is the number of statements in the specification or shell used to describe the task: $N_{operations} \approx N_a + N_{var} + N_{grp} + N_{par} + N_{tpt}$
- The only non-vanishing component is the type of analysis: $N_{operations} \sim N_a$

Annotated example shell:

    Table 14.3.x.x
    Summary of Change from Baseline in Vital Sign Results
    Safety Population

    Parameter: xxx (units)

                                  TRT (N=xx)                                 PBO (N=xx)
    Timepoint                     Baseline    At Timepoint  Change           Baseline    At Timepoint  Change
    Baseline
      n                           xxx                                        xxx
      Mean                        xxx.x                                      xxx.x
      SD                          xxx.xx                                     xxx.xx
      Median                      xxx.x                                      xxx.x
      Min, Max                    xxx.x, xxx                                 xxx.x, xxx
      Q1, Q3                      xxx.x, xxx.x                               xxx.x, xxx.x
    Post-Treatment Assessment 1
      n                           xxx         xxx           xxx              xxx         xxx           xxx
      Mean                        xxx.x       xxx.x         xxx.x            xxx.x       xxx.x         xxx.x
      SD                          xxx.xx      xxx.xx        xxx.xx           xxx.xx      xxx.xx        xxx.xx
      Median                      xxx.x       xxx.x         xxx.x            xxx.x       xxx.x         xxx.x
      Min, Max                    xxx.x, xxx  xxx.x, xxx    xxx.x, xxx       xxx.x, xxx  xxx.x, xxx    xxx.x, xxx
      Q1, Q3                      xxx.x, xxx.x  xxx.x, xxx.x  xxx.x, xxx.x   xxx.x, xxx.x  xxx.x, xxx.x  xxx.x, xxx.x

    Note: Only subjects with both baseline and timepoint values are summarized at a given timepoint.

Shell annotations:

- Types of analysis: $N_a = 1$
- Analysis variables: $N_{var} = 3$ (ADVS.base, ADVS.aval, ADVS.chg)
- Treatment groups: $N_{grp} = 2$
- Number of parameters: $N_{par}$ (ADVS.param)
- Time points: $N_{tpt}$ (ADVS.avisit, atpt)
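As a rough worked illustration of the gap between the multiplicative and additive counts, take the values visible in the shell ($N_a = 1$, $N_{var} = 3$, $N_{grp} = 2$) together with two hypothetical values that are not from the slides, $N_{par} = 4$ parameters and $N_{tpt} = 5$ time points:

```latex
\begin{align*}
N_a \cdot N_{var} \cdot N_{grp} \cdot N_{par} \cdot N_{tpt} &= 1 \cdot 3 \cdot 2 \cdot 4 \cdot 5 = 120, \\
N_a + N_{var} + N_{grp} + N_{par} + N_{tpt} &= 1 + 3 + 2 + 4 + 5 = 15.
\end{align*}
```

Under these assumed values, block-by-block programming scales with the product of the dimensions, while a specification-driven program scales with their sum.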

Tools and Factors Helpful to Minimize Programming and Data Dependency

Reducing the number of operations directly affects:

- the LOG file and debugging;
- how many separate WORK datasets have to be kept, reviewed and joined together;
- adaptability to another task.

Basic elements helpful for TLF programming:

- The BY statement repeats an analysis for each category defined by a list of variables;
- The SDTM structure for Interventions and the ADaM BDS standard variables match the use of the BY statement well and provide traceability of results;
- BY can be reinforced with OUTPUT to create categories for TLF analysis;
- Refer to variables, BY-variable lists and other common settings (such as formatting) through macro variables to enable flexibility;
- Organize code following GPP principles in order to optimize both the work and the result:
  - Do not derive anything in more than one place;
  - Perform only one task per module or macro.
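A minimal sketch of the BY-plus-macro-variable idea above, run against the SASHELP.CLASS sample dataset that ships with SAS rather than study data. The macro variable names BYGRP and ANAVAR and the dataset names CLASS and STATS are illustrative assumptions, not names from the slides.

```sas
/* Hypothetical settings: change these two lines to re-point the analysis */
%let bygrp  = sex;      /* BY-variable list (assumed name)  */
%let anavar = height;   /* analysis variable (assumed name) */

/* BY-group processing needs the data sorted by the BY variables */
proc sort data=sashelp.class out=class;
  by &bygrp;
run;

/* One PROC step computes the statistics for every BY group */
proc means data=class noprint;
  by &bygrp;
  var &anavar;
  output out=stats n=n mean=mean std=sd median=median min=min max=max;
run;
```

Because the grouping and analysis variables are referenced only through macro variables, the same two steps could be reused for another grouping by changing the `%let` lines.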

Keys to Universal Open-Code

- Use of TLF-conventional variables
- Traceability of data
- Flexibility due to macro variables
- Alignment with GPP principles

TLF-conventional variables #1: groups, categories and analysis data

Variables in Dataset

- Subject-level groups:
  o TRT(N), GRP(N): treatment/subject groups
  o Example: GRP = AGEGR1; GRPN = AGEGR1N;
- Data-level categories:
  o CAT1(N), CAT2(N): grouping categories
  o Subject to be counted once per category
  o "Gender", "BMI(kg/m2)", "BMI group", AVISIT(N), PARAM(N), AEBODSYS
- Variables for analysis and output:
  o COL1(N), COL2(N): columns to display
  o Example 1: "n", "Mean (SD)", "Any AE"
  o Example 2: RACE, AVALCAT1(N), CRITxx
  o AVALUE(N): basic variables for analysis
  o PVALUE(N), LOGVALUE, ...

Macro Variables

- &BYTRT, &BYGRP
  o BYTRT = TRTAN TRTA;
- &BYCAT, &BYVIS, &BYPARM
  o BYVIS = AVISITN AVISIT;
  o BYPARM = PARCAT1 PARAMN PARAMCD PARAM;
  o BYCAT = PARCAT1N cat1;
- &BYMOCK
  o BYMOCK = PARAMN PARAM CAT1N CAT1 COL1N COL1;
- &BYVAL
  o BYVAL = ASEVN ASEV;
- Names to be the same or similar

Alignment with GPP

Not recommended:

    data adsl;
      set adsl;
      output;                          /* record under the actual treatment  */
      TRT01AN = 0; TRT01A = "Total";
      output;                          /* second record for the Total column */
    run;

    ANRIND   = "Overall";
    AEBODSYS = propcase(aebodsys, ".");
    AEDECOD  = "  " || strip(aedecod);

(+) The treatment variable is shown explicitly.
(-) Switching to another variable is not flexible: many changes throughout the code.
(-) WORK.ADSL is no longer subject-level.
(-) "Total" assigned to TRT01A(N) is outside controlled terminology.

Recommended:

    data subj_trt;
      length TRTN 8 TRT $40;
      set adsl;
      trtn = trt01an; trt = trt01a;
      output;                          /* record under the actual treatment  */
      if not missing(trtn) then do;
        trtn = 0; trt = "Total";
        call missing(trt01an, trt01a);
        output;                        /* second record for the Total column */
      end;
    run;
    %let bytrt = trtn trt;

    col1 = "Overall";
    cat1 = propcase(aebodsys, ".");
    cat2 = "  " || strip(aedecod);
    %let bycat = aebodsys cat1 AEDECOD cat2;

- A new TLF-conventional variable is created.
- TRT01A(N) can be easily replaced; alternatively, a global variable can be used.

TLF-conventional variables #2: control decimal alignment

Decimal formats as macro variables

&Dec0 - &DecN: global variables to maintain consistent decimal alignment

    %let dec0=3.;
    %let dec1=5.1;
    %let dec2=6.2;
    %let dec3=7.3;
    %let dec4=&dec3;
    %let dec5=&dec3;

    length col1n 8 col1 $200 rez $20;
    col1n = 1; col1 = "n";    rez = put(n,    &dec0.);
    col1n = 2; col1 = "Mean"; rez = put(mean, &dec1.);
    col1n = 3; col1 = "SD";   rez = put(sd,   &dec2.);

NDEC/&NDEC [=0,1,2,3 ...]: number of decimals for the MIN and MAX univariate statistics

  o Refer to the variable, not to its eventual instances
  o Local formatting for macro calls

Utilize a local dataset to track macro variables:

    %local decv decm decs;
    %let byvar = avalcat1n avalcat1;

    data _localvars_;
      /* pick the formats for value, mean and SD from the number of decimals */
      DecV = symget("dec" || put(&ndec.,   1.));
      DecM = symget("dec" || put(&ndec.+1, 1.));
      DecS = symget("dec" || put(&ndec.+2, 1.));
      _byvar_frq = tranwrd("&byvar", ' ', '*');
      /* publish every variable of this dataset as a macro variable */
      array lvars _ALL_;
      do over lvars;
        call symputx(vname(lvars), lvars);
      end;
    run;
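The sketch below illustrates the same idea of deriving the display format from a number-of-decimals setting, under stated assumptions: it swaps in the PUTN function to apply the format at run time instead of first copying the format into local macro variables as the slide does, and the dataset FMT_DEMO, the variable STAT and the value 12.3456 are invented for illustration.

```sas
/* Global decimal formats, as defined on the slide */
%let dec0=3.;
%let dec1=5.1;
%let dec2=6.2;

/* Number of decimals in the collected data (assumed setting) */
%let ndec=0;

data fmt_demo;
  length stat $8 rez $20;
  value = 12.3456;                     /* illustrative value only */
  /* Min/Max use NDEC decimals, Mean one more, SD two more */
  stat = "Min";  rez = putn(value, strip(symget(cats("dec", &ndec))));     output;
  stat = "Mean"; rez = putn(value, strip(symget(cats("dec", &ndec + 1)))); output;
  stat = "SD";   rez = putn(value, strip(symget(cats("dec", &ndec + 2)))); output;
run;
```

Changing only `%let ndec=1;` would shift all three statistics by one decimal, which is the point of referring to the variable rather than to each formatted instance.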

One-Proc Calculation with BY and OUTPUT: Adverse Events by Severity

- Each event has to be analyzed at 3 levels of categorization.
- At each level, one record per subject has to be selected.
  o LVL (level of categorization) is a supplementary variable for data-driven ordering based on frequency.
  o CAT1 could be created after processing, but initializing it early as a non-missing variable is appropriate.
  o The ADAE severity variables can be replaced with relationship to study drug, etc.

OUTPUT

    data aecat;
      length lvl 8 cat1 $200;
      label cat1 = "SOC Preferred Term";
      set adae;
      /* level 2: preferred term within system organ class */
      lvl = 2; cat1 = "  " || strip(aedecod);
      output;
      /* level 1: system organ class */
      call missing(aedecod);
      lvl = 1; cat1 = propcase(aebodsys, '.');
      output;
      /* level 0: any event */
      lvl = 0; cat1 = "Subjects with at least one TEAE";
      call missing(aedecod, aebodsys);
      output;
    run;

    %let bycat = AEBODSYS AEDECOD lvl cat1;
    %let byvar = ASEVN ASEV;

Slide callouts: any dataset variables; # of levels.

One-Proc Calculation with BY and OUTPUT: Adverse Events by Severity

BY

    %let bycat = AEBODSYS AEDECOD lvl cat1;
    %let byvar = ASEVN ASEV;
    %let bytrt = trtn trt;

All treatment counts in one step:

    proc means data=subj&rnum;
      by &bytrt;
      var flag;
      output out=totals&rnum n=nsub;
    run;
    *Add column labels, macro vars...;

Traceability: counts and labels for treatment groups are accessible from the dataset.

Merge subject groups with AE categories:

    proc sql noprint;
      create table data&rnum as
        select *
        from subj&rnum s, indata&rnum d
        where s.usubjid = d.usubjid;
    quit;

Get the AE with maximum severity at 3 levels:

    /* input must be sorted by the BY variables listed below */
    data datasubj&rnum;
      set data&rnum;
      by &bytrt &bycat usubjid &byvar;
      if last.usubjid;
    run;

One-Proc calculation:

    proc freq data=datasubj&rnum;
      by &bytrt &bycat &byvar;
      tables flag / out=count_subj&rnum (drop=percent);
    run;

Format table cells:

- Use TOTALSxx.Nsub for %;
- Format cells prior to any transpose;
- Set up columns other than the default [treatments].

    %let dec0 = 3.;

    data res_all&rnum;
      merge count_subj&rnum totals&rnum;
      by &bytrt;
      length rez $20 column $20 collbl $40;
      percent = 100*count/Nsub;
      length _perc $8;
      _perc = cats("(", put(percent, 5.1), "%)");
      rez = put(count, &dec0.) || " " || right(_perc);
      *~Create columns to transpose~*;
      column = cats(10*trtn + asevn);
      collbl = ASEV;
    run;

One-Proc Calculation with BY and OUTPUT: Adverse Events by Severity

Standard layout:

    /* input must be sorted by &bycat &byvar */
    proc transpose data=res_all&rnum out=result&rnum prefix=trt;
      by &bycat &byvar;
      var rez;
      id trtn;
      idlabel trt;
    run;

Customized (spanning) layout:

    /* input must be sorted by &bycat */
    proc transpose data=res_all&rnum out=result&rnum prefix=trt;
      by &bycat;
      var rez;
      id column;
      idlabel collbl;
    run;
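The following self-contained sketch walks through the OUTPUT and BY steps end to end on a tiny invented ADAE-like dataset. It is a simplification under stated assumptions: there are no treatment groups, the five input records are made up for illustration, and the dataset names MAXSEV and COUNTS are hypothetical rather than taken from the slides.

```sas
/* Invented example data: two subjects, five events */
data adae;
  length usubjid $8 aebodsys $40 aedecod $40 asev $10;
  infile datalines dsd;
  input usubjid $ aebodsys $ aedecod $ asevn asev $;
  datalines;
S01,CARDIAC DISORDERS,Palpitations,1,Mild
S01,CARDIAC DISORDERS,Tachycardia,3,Severe
S01,NERVOUS SYSTEM DISORDERS,Headache,2,Moderate
S02,NERVOUS SYSTEM DISORDERS,Headache,1,Mild
S02,NERVOUS SYSTEM DISORDERS,Headache,2,Moderate
;
run;

%let bycat = aebodsys aedecod lvl cat1;
%let byvar = asevn asev;

/* OUTPUT: write each event at three levels of categorization */
data aecat;
  length lvl 8 cat1 $200;
  set adae;
  lvl = 2; cat1 = "  " || strip(aedecod);            output;
  call missing(aedecod);
  lvl = 1; cat1 = propcase(aebodsys, ".");           output;
  call missing(aebodsys);
  lvl = 0; cat1 = "Subjects with at least one TEAE"; output;
run;

/* Sort so that the most severe record is last within subject and category */
proc sort data=aecat;
  by &bycat usubjid &byvar;
run;

/* Keep one record per subject per category: the maximum severity */
data maxsev;
  set aecat;
  by &bycat usubjid;
  if last.usubjid;
run;

/* BY: a single PROC FREQ counts subjects for every category and severity */
proc freq data=maxsev noprint;
  by &bycat;
  tables asevn*asev / out=counts(drop=percent);
run;
```

Replacing `%let byvar` with a relationship-to-study-drug pair of variables would, in the same spirit as the slide, reuse the steps unchanged.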

Different Types of Analysis for Demographic and Baseline Characteristics

Qualitative data:

    data data_qual;
      length group $4 cat1n 8 cat1 $200 col1n 8 col1 $200 pcat $200;
      set adsl;
      group = "QUAL";
      cat1n = 1; cat1 = "Gender";
      col1n = ifn(sex="M", 1, 2, .); col1 = put(sex, $genderf.);
      pcat = sex;
      output;
      cat1n = 3; cat1 = vlabel(race);
      col1n = aracen; col1 = arace;
      pcat = ifc(race='WHITE', race, 'Other', '');
      output;
    run;

    proc freq data=data_qual;
      by trtn trt cat1n cat1;
      tables col1n*col1 / out=freqs;
    run;

Quantitative data:

    data data_quan;
      length group $4 cat1n 8 cat1 $200 avalue ndec 8;
      set adsl;
      group = "QUAN";
      cat1n = 2; cat1 = "Age";
      avalue = age; ndec = 0;
      output;
      cat1n = 4; cat1 = "Duration at Study(weeks)";
      avalue = DURSTUDY; ndec = 1;
      output;
    run;

    proc means data=data_quan;
      by trtn trt cat1n cat1 ndec;
      var avalue;
      output out=means &means_out;
    run;

Useful Tricks of PROC SQL to Generalize Study-Specific Programming

With variable lists as BY-parameters, any data-driven shell can be built. This works well if the full set of &BYVAL values appears at least once in the dataset.

    %let byparm = paramcd PARAM;
    %let byvis  = AVISITN AVISIT;
    %let byval  = avalc;

    proc sort data=data&oid nodupkey out=byparm&oid(keep=&byparm); by &byparm; run;
    proc sort data=data&oid nodupkey out=byvis&oid(keep=&byvis);   by &byvis;  run;
    proc sort data=data&oid nodupkey out=byval&oid(keep=&byval);   by &byval;  run;

    /* Cartesian product of all parameters, visits and values */
    proc sql;
      create table shell&oid as
        select * from byparm&oid, byvis&oid, byval&oid;
    quit;

Lists of parameters, data-driven formats etc. can be created and printed:

    proc sql;
      select distinct cats(avisitn, "='", avisit, "'")
        into :_visfmt separated by ' '
        from data&oid;
      select distinct strip(paramcd) as ParamLst
        into :_paramlst separated by ' '
        from data&oid;
    quit;

    proc format;
      value avisfmt &_visfmt;
    run;

Printed output:

    0='Baseline' 12='Week 12' 24='Week 24' 52='Week 52/Open-Label' 100='End of Study'

    --ParamLst--
    BMI HEIGHT PULSE WEIGHT
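One way such a data-driven shell might be used is to guarantee a row for every combination, including those with no observations. The sketch below is an assumption about that use, not code from the slides; it relies on the SASHELP.CLASS sample dataset, and the table names SHELL and COUNTS are hypothetical.

```sas
proc sql;
  /* Shell: Cartesian product of every sex with every age seen in the data */
  create table shell as
    select s.sex, a.age
    from (select distinct sex from sashelp.class) as s,
         (select distinct age from sashelp.class) as a;

  /* Left-join the observed counts onto the shell; absent cells become 0 */
  create table counts as
    select sh.sex, sh.age, coalesce(c.n, 0) as n
    from shell as sh
    left join
         (select sex, age, count(*) as n
            from sashelp.class
            group by sex, age) as c
      on sh.sex = c.sex and sh.age = c.age
    order by sh.sex, sh.age;
quit;
```

The resulting table has exactly one row per shell cell, so a later transpose or report step does not have to handle missing combinations.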

From Open Code to Macro Design

A: Prepare data and make subset
   - Subject groups: [1] subset subjects
   - Data categories: [2] subset data
B: Perform calculations with standard procedures
   - Total numbers, default headers and labels
   - Calculate results with standard procedures
C: Format output cells and arrange to table structure
   - Get final dataset(s) with original and/or TLF variables for output
D: Create and save TLF outputs
   - Output paths and settings; pagination; procedures for output data to files

Macro packaging shown on the slide:
   - Result macro
   - Report macro (one or series)
   - Join for series of outputs (global macro / variables)

Appendix: Macro calls for Result and Report

    *=== Create Table for % of Responders ===*;
    %result_resp_yn(oid = 01,                               /* Result/Output ID */
                    insubj  = adsl,
                    selsubj = %str(where fasfl='Y'),        /* Subject-level    */
                    bytrt   = trtseqan trtseqa,
                    indata  = adeff,
                    seldata = %str(where anl01fl='Y'),      /* Data-level       */
                    byval   = parcat1 avisitn avisit paramcd param,
                    avalue  = avalc,
                    percents = TOTAL);                      /* Other settings   */

    * 4-column output by treatment sequence TRTSEQA *;
    %report_4trt(oid=01, vispage=2);

    < Macro call with the same parameters (or global settings), except for: oid= 02, bytrt= trt01pn trt01p >

    * 2-column output by planned treatments TRT01P *;
    %report_2trt(oid=02, vispage=3);

Conclusions

- The number of DATA steps and procedure calls can be reduced to a minimum: one procedure for each type of analysis.
- The GPP recommendations "do not derive anything in more than one place" and "perform only one task per module or macro" are achievable at the SAS compiler level (not only through repeated macro calls).
- Optimizing open code enables us to develop powerful macros with a high level of generalization.

*~~~~~ T H A N K   Y O U! ~~~~~*

References
http://www.phusewiki.org/wiki/index.php?title=good_programming_practice
http://www.phusewiki.org/wiki/index.php?title=good_programming_practice_guidance

Acknowledgements
The author would like to thank Roman Ganzha for his careful review and comments.

Contact Information
Galyna Repetatska, PhD
Chiltern
51B Bohdana Khmelnytskogo str.
Kyiv / 01030, Ukraine
Email: Galyna.Repetatska@Chiltern.com
LinkedIn: https://www.linkedin.com/in/halyna-repetatska