Missing Pages Report. David Gray, PPD, Austin, TX Zhuo Chen, PPD, Austin, TX

PharmaSUG 2010 - Paper DM05

ABSTRACT

In a clinical study it is important for data management teams to receive CRF pages from investigative sites in accordance with an agreed upon schedule so that the data may be entered and cleaned in a timely manner. It is therefore important to have a process in place that can project when CRF pages for a patient will be due and that can identify overdue expected CRF pages. This paper describes a macro developed in PC SAS 9.2 to generate a missing pages report, a pages due report, and a summary report. Data management teams can utilize these reports to trace missing pages, predict data entry volume, and review the status of each patient. The reports are generated in EXCEL with one .xls file per investigator site and one for all sites combined. Each EXCEL file contains separate tabs for each report.

INTRODUCTION

This paper presents a macro, %MPR (Missing Pages Report), which generates one EXCEL output file per investigator site and one for all patients combined. Each EXCEL file has the following tabs:

- missing pages report (5 tabs): all outstanding, outstanding 0-29 days, 30-59 days, 60-89 days, and 90 days or more
- pages due report
- pages due forecast (only for all patients combined)
- summary report

The macro requires two input sources: the data sets extracted from the clinical database, and an EXCEL file which defines the pages expected for each visit, the visit's scheduled day, and the allowable +day window for the visit.

SET UP

The image below shows all the files you will need to run the MPR for a study.

Figure 1. All Files Needed to Run MPR

Data folder: This folder contains the data sets for all collected CRF data in the study. Copy the data sets into this folder or change the pointer (libname data) in MPR.SAS to the location of the data sets. It is assumed that most of these data sets have a page number associated with each record and therefore provide the set of pages that have been received and entered. Data sets not containing a page number variable will be ignored by the MPR macro.

Input.xls: EXCEL file which defines the pages expected for each visit, the visit's scheduled day, and the allowable +day window for the visit. A detailed example is provided below. It is assumed that this file is located in the same folder as MPR.sas.

MPR.SAS: SAS program containing the MPR macro and project specific macro variables. The program and its usage are described below.

An example of Input.xls is shown below. This file provides the Sorting Order, Visit Number, Visit Name, Page Number, Schedule Day and Visit Window for the study. It may be necessary to obtain input from clinical and/or data management personnel in order to complete this table. MPR.sas requires the format of this file to be as shown. Sorting Order is required because in some studies the visit numbers themselves are not in chronological order. Sorting Order keeps the output reports in chronological order so that they are easy to review.

Sorting  Visit   Visit                                          Schedule  Visit
Order    Number  Name          Page Number                      Day       Window
1        1       SCREENING     1 2 3 4 5 6 7 8 9 10 11 12 13    0         0
2        2       VISIT 2       14 15 16 17 18 19                28        7
3        3       VISIT 3       20 21 22 23 24 25                56        7
4        4       VISIT 4       26 27 28 29 30 31                84        7
5        5       VISIT 5       32 33 34 35 36 37 38             112       14
6        6       VISIT 6       39 40                            140       7
7        7       VISIT 7       41 42                            168       7
8        8       VISIT 8       43 44 45 46 47 48 49             196       7
9        9       VISIT 9       50 51                            224       7
10       10      VISIT 10      52 53                            252       14
11       11      VISIT 11      54 55 56 57 58 59 60             280       7
12       12      VISIT 12      61 62                            308       7
13       13      VISIT 13      63 64                            336       7
14       14      VISIT 14      65 66                            364       7
15       777     END OF STUDY  67 68 69 70 71 72 73 74          392       14
16       888     SUMMARY       75 76 77 78 79                   392       14

SAS MACRO %MPR

The MPR.sas macro is shown below. Nine project specific macro variables are in the first section and must be appropriately assigned before execution. It is necessary to identify the baseline visit and date in order to calculate the relative expected dates for subsequent visits.

    %Macro MPR;

    * Setup your study: you only need to set up 9 macro variables below based on your study.
      None of the other steps need to be changed;
    %let BaseDataSet=vis ;        *Data set name which has baseline visit date;
    %let BaseLineVisit=1 ;        *Baseline visit number in BaseDataSet;
    %let BaseVisitDate=visdt;     *Baseline visit date in BaseDataSet;
    %let Patient=pt;              *Patient variable in data set;
    %let Visit=visit;             *Visit variable in data set;
    %let PageNum=PAG1A ;          *Page number variable in data set;
    %let DataFolder=".\data";     *The location of all data sets;
    %let OutputName=Study_MPR;    *The name of output EXCEL file;
    %let SiteDigits=4;            *Define how many digits from Patient contains site;

    * Step1: readin INPUT.XLS which defines scheduled CRF pages, scheduled day and
      visit window for each visit, save it under data set ALL;
    proc import datafile="input.xls" out=all dbms=xls replace;
      getnames=yes;
    run;

    data ALL;
      set ALL (rename=(visit_number=visit));
      if order=. and visit=. and Visit_Name='' then delete;
      sch_days = schedule_day + visit_window;
      * Write one record per expected page;
      do i=1 to count(trim(left(page_number)),' ')+1;
        page = input(scan(page_number,i),?? 8.);
        output;
      end;
    run;

    * Step2: readin all CRF data sets under &DataFolder, calculate all received pages,
      save it under data set RECEIVE;
    libname data &DataFolder;
    proc contents data=data._all_ noprint
                  out=cont (keep=memname name where=(name="&pagenum"));
    run;
    %global numsets;
    proc sql noprint;
      select count(distinct memname) into : numsets from cont;
    quit;
    data _null_;
      set cont;
      call symput("dsn"||trim(left(put(_n_,8.))),memname);
    run;
    %macro GetPage;
      data receive;
        set data.&dsn1 (keep=&patient &Visit &PageNum);
      run;
      %do i=2 %to &numsets;
        data receive;
          set receive data.&&dsn&i (keep=&patient &Visit &PageNum);
          * Keep only the integer part of page numbers such as 12.1;
          if index(&pagenum,".") then page = input(scan(&pagenum,1,"."),?? 8.);
          else page = input(&pagenum,?? 8.);
          rename &Patient=pt &Visit=visit;
        run;
      %end;
      proc sort data = receive nodupkey;
        by pt visit page;
      run;
    %mend GetPage;
    %GetPage;

    * Step3: calculate missing pages, save it under data set MISSING;
    proc sort data = receive (keep=pt) out=ptlist nodupkey;
      by pt;
    run;
    proc sql;
      create table schedule as
        select all.*, PTLIST.*
        from all, PTLIST
        order by PTLIST.pt, all.visit, all.page;
    quit;
    data missing;
      merge receive (in=x) schedule (in=y);
      by pt visit page;
      if y * ^x;
    run;
    data missing(keep=pt visit Visit_Name misspg t_misspg order schedule_day sch_days);
      set missing;
      by pt visit;
      length misspg $2000;
      retain misspg t_misspg;
      if first.visit then do;
        misspg=trim(left(put(page,8.)));

        t_misspg=1;
      end;
      else do;
        misspg=trim(left(misspg))||' '||trim(left(put(page,8.)));
        t_misspg=t_misspg+1;
      end;
      if last.visit then output;
    run;
    proc sort data = missing;
      by pt order;
    run;

    * Step4: calculate outstanding days in OUT_DAYS and DUE_DAYS. If OUT_DAYS>=0, then page
      is missing. Otherwise it is due in a future date on DUE_DAYS;
    proc sort data = data.&basedataset (where=(visit=&baselinevisit))
              out = vis (keep=pt &BaseVisitDate rename=(&basevisitdate=randdt)) nodupkey;
      by pt;
    run;
    data missing;
      merge missing (in=x) vis (in=y);
      by pt;
      if x;
      vis_day  = randdt + schedule_day;
      vis_day2 = randdt + sch_days;
      out_days = today() - (randdt + sch_days);
      due_days = vis_day - today();
      label pt="Patient"
            Visit_Name="Visit_Name"
            misspg="Missing Pages"
            vis_day="Expected Visit Date"
            vis_day2="Expected Visit Date + Window"
            out_days="Days Outstanding"
            due_days="Days Until Expected Visit Date";
      format vis_day vis_day2 date9.;
    run;
    data report1 (keep=pt Visit_Name misspg out_days)
         report2 (keep=pt Visit_Name misspg t_misspg vis_day vis_day2 due_days);
      set missing;
      if out_days >= 0 then output report1;
      else output report2;
    run;
    data report2;
      set report2;
      label misspg="Expected Pages" t_misspg="# of Expected Pages";
    run;

    * Step5: calculate page due forecast based on 30 days period;
    %macro forecast;
      proc sql;
        * The first label is padded so that longer labels appended later are not truncated;
        create table report4 as
          select "30 Days     " as period, sum(t_misspg) as totdue
          from report2
          where due_days <= 30;
      quit;
      %do i=2 %to 12;
        %let temp1=%eval(30*(&i.-1)+1);
        %let temp2=%eval(30*&i.);
        proc sql;
          create table temp as
            select "&temp1-&temp2 Days" as period, sum(t_misspg) as totdue
            from report2
            where 30*(&i-1) < due_days <= 30*&i;
        quit;
        data report4;
          set report4 temp;
          label period="Days Out" totdue="Total Pages Due";
        run;
      %end;

    %mend forecast;
    %forecast;

    * Step6: calculate total present pages as totprtpg, total missing pages as totmispg,
      total pages due as totduepg;
    proc sql;
      create table present as
        select pt, count(pt) as totprtpg
        from receive
        group by pt order by pt;
      create table due as
        select pt, sum(t_misspg) as totduepg
        from missing where out_days < 0
        group by pt order by pt;
      create table miss as
        select pt, sum(t_misspg) as totmispg
        from missing where out_days >= 0
        group by pt order by pt;
    quit;
    data report3;
      merge present miss due;
      by pt;
      if totmispg=. then totmispg=0;
      if totduepg=. then totduepg=0;
      if totprtpg=. then totprtpg=0;
      label pt="Patient"
            totmispg="Total Missing Pages"
            totduepg="Total Pages Due"
            totprtpg="Total Pages Present";
    run;

    * Step7: output all data sets into one EXCEL file for all sites, then for each
      individual site, there is a single EXCEL output as well;
    data site (keep=site);
      set PTLIST;
      site=substr(pt,1,&sitedigits);
    run;
    proc sort data = site nodupkey;
      by site;
    run;
    proc sql noprint;
      select count(distinct site) into : numsites from site;
    quit;
    data _null_;
      set site;
      call symput("site"||trim(left(put(_n_,8.))),site);
    run;
    %Macro output(dsn, label, newfile=no, site=all);
      proc export data=&dsn outfile="&&outputname._&site..xls" dbms=xls replace label;
        sheet = &label;
        NEWFILE=&newfile;
      run;
    %mend output;
    %output(report1,'Missing Pages Report', newfile=yes);
    %output(report1 (where=( 0<=out_days<=29)),'Missing 0-29 days');
    %output(report1 (where=(30<=out_days<=59)),'Missing 30-59 days');
    %output(report1 (where=(60<=out_days<=89)),'Missing 60-89 days');

    %output(report1 (where=(90<=out_days)),'Missing GE 90 days');
    %output(report2,'Pages Due Report');
    %output(report4,'Pages Due Forecast');
    %output(report3,'Summary Report');
    %macro outputsites;
      %do i=1 %to &numsites;
        %output(report1 (where=(substr(pt,1,&sitedigits)="&&site&i")),'Missing Pages Report', newfile=yes, site=&&site&i);
        %output(report1 (where=( 0<=out_days<=29 and substr(pt,1,&sitedigits)="&&site&i")),'Missing 0-29 days',site=&&site&i);
        %output(report1 (where=(30<=out_days<=59 and substr(pt,1,&sitedigits)="&&site&i")),'Missing 30-59 days',site=&&site&i);
        %output(report1 (where=(60<=out_days<=89 and substr(pt,1,&sitedigits)="&&site&i")),'Missing 60-89 days',site=&&site&i);
        %output(report1 (where=(90<=out_days and substr(pt,1,&sitedigits)="&&site&i")),'Missing GE 90 days',site=&&site&i);
        %output(report2 (where=(substr(pt,1,&sitedigits)="&&site&i")),'Pages Due Report',site=&&site&i);
        %output(report3 (where=(substr(pt,1,&sitedigits)="&&site&i")),'Summary Report',site=&&site&i);
      %end;
    %mend outputsites;
    %outputsites;
    x 'del "*.xls.bak" ';
    %Mend MPR;
    %MPR;

OUTPUT EXAMPLES

The MPR macro generates one EXCEL file for all sites, &OutputName._all.xls (Study_MPR_all.xls with the settings shown in the macro), and one EXCEL file for each site using the convention &OutputName._nnnn.xls, where nnnn is the site number. A screenshot following program execution is shown below.

Figure 2. One Example of Execution Result
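The core of Steps 1 to 3 of the macro is a set difference: every page expected for every patient, minus the pages actually received. The following is a minimal Python sketch of that idea, written for illustration and not a translation of the SAS code. The function names (`expand_pages`, `missing_by_visit`) and the toy data are assumptions that do not appear in the paper.

```python
from collections import defaultdict


def expand_pages(page_list):
    """Split a space-separated page list such as "14 15 16" into integers."""
    return [int(tok) for tok in page_list.split()]


def missing_by_visit(schedule, received, patients):
    """Return {(patient, visit): [missing pages]}.

    schedule: {visit: "space separated page numbers"}, as in Input.xls
    received: set of (patient, visit, page) tuples found in the CRF data
    patients: iterable of patient identifiers
    """
    missing = defaultdict(list)
    for pt in patients:
        for visit, page_list in schedule.items():
            for page in expand_pages(page_list):
                # Expected but not received: the page is outstanding.
                if (pt, visit, page) not in received:
                    missing[(pt, visit)].append(page)
    return dict(missing)


# Hypothetical toy data: two visits, two patients.
schedule = {1: "1 2 3", 2: "14 15"}
received = {("0001", 1, 1), ("0001", 1, 2), ("0001", 1, 3), ("0002", 1, 1)}
result = missing_by_visit(schedule, received, ["0001", "0002"])
print(result)
```

As in the macro, a patient only appears in the result if they are in the patient list, which the macro builds from patients with at least one received page.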

MISSING PAGES REPORT

The first tab (Missing Pages Report) in each EXCEL file is a summary report for all patients who had at least one missing page. As you can see below, it displays the patient ID, visit name, pages missing in that visit, and how many days outstanding the pages are for that visit at the date the report was run. Pages listed in this report are for visits that have passed the expected visit date and the allowable visit date window. A screenshot of the first tab is shown below.

Figure 3. One Example of the First Tab Missing Pages Report

MISSING PAGES REPORT SUBSET BASED ON OUTSTANDING PAGES FOR A SPECIFIED NUMBER OF DAYS

The 2nd to the 5th tabs in each EXCEL file are subsets of the first tab. Each includes only patient visits with missing pages outstanding for a specified range of days. For example, the 2nd tab lists missing pages that have been outstanding between 0 and 29 days. A screenshot of an example 2nd tab is shown below.

Figure 4. One Example of the 2nd Tab Missing 0-29 days
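To make the aging logic concrete, the sketch below computes the days outstanding for a visit and maps that number to the tab on which the visit would appear. It is an illustrative Python rendering of the rule described above, under the assumption that days outstanding are counted from the end of the visit window. The function names and the sample dates are hypothetical; the tab names are the sheet labels used by the macro.

```python
from datetime import date, timedelta


def days_outstanding(baseline, schedule_day, visit_window, run_date):
    """Days elapsed since the end of the visit window (negative if not yet reached)."""
    window_end = baseline + timedelta(days=schedule_day + visit_window)
    return (run_date - window_end).days


def aging_tab(out_days):
    """Name of the subset tab for a given number of days outstanding."""
    if out_days < 0:
        return None  # not missing yet; the visit belongs on the Pages Due Report
    if out_days <= 29:
        return "Missing 0-29 days"
    if out_days <= 59:
        return "Missing 30-59 days"
    if out_days <= 89:
        return "Missing 60-89 days"
    return "Missing GE 90 days"


# Hypothetical patient: baseline 01-JAN-2010, visit scheduled on day 28 with a 7-day window,
# report run on 05-MAR-2010.
out = days_outstanding(date(2010, 1, 1), 28, 7, run_date=date(2010, 3, 5))
print(out, aging_tab(out))
```

Every visit with a non-negative value also appears on the first tab, which is the union of the four subsets.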

PAGES DUE REPORT

The 6th tab (Pages Due Report) in each EXCEL file displays page due information for each patient. It lists the patient ID, the visit, the pages expected for the visit, the number of pages expected for the visit, the date the visit is due based on the scheduled visit date, the date the visit is due based on the scheduled visit date plus the visit window, and the number of days until the visit is due based on the scheduled visit date.

Please note that the MPR macro considers an allowable visit window for each visit. This may result in a negative number in the days until due column. A negative number means that the visit has passed the scheduled visit date, but is still within the upper bound of the visit window. In this case the pages listed are expected but not yet received. An example screenshot is below.

For example, take patient 053300002 at visit 1. Suppose the MPR is run on 05-MAR-2010 and the expected visit date is 04-MAR-2010. With a 7-day visit window, this visit should occur between 04-MAR-2010 and 11-MAR-2010. Although the scheduled visit date passed 1 day ago (which is why the last column, Days Until Expected Visit Date, shows -1), we cannot say that this visit and its associated CRF pages are missing, because the upper bound of the visit window (11-MAR-2010 in this case) has not yet been reached.

Figure 5. One Example of the 6th Tab Pages Due Report

PAGES DUE FORECAST

The 7th tab (Pages Due Forecast), included only in the &OutputName._all.xls file, displays page due forecast information. It lists the Days Out and the Total Pages Due within that period. The purpose of this tab is to provide a forecast of how many pages are expected to be received in the next 30 days, the next 31-60 days, and so forth. An example screenshot is below. The chart below can be easily generated via the EXCEL Chart Wizard by choosing Insert->Chart.
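The arithmetic behind the patient 053300002 example and the 30-day forecast periods can be checked with a short Python sketch. It is an illustration only: `due_info` and `forecast_period` are hypothetical helper names, and the period labels follow the pattern used in Step 5 of the macro (twelve periods covering 360 days).

```python
from datetime import date, timedelta


def due_info(expected_visit, visit_window, run_date):
    """Return (window end date, days until expected visit, days outstanding)."""
    window_end = expected_visit + timedelta(days=visit_window)
    due_days = (expected_visit - run_date).days
    out_days = (run_date - window_end).days
    return window_end, due_days, out_days


def forecast_period(due_days):
    """Label of the 30-day forecast period that a pending visit falls into."""
    if due_days <= 30:
        return "30 Days"
    k = (due_days - 1) // 30 + 1  # period index: 31-60 -> 2, 61-90 -> 3, ...
    if k > 12:
        return None  # beyond the 12 periods covered by the forecast
    return f"{30 * (k - 1) + 1}-{30 * k} Days"


# The example from the text: expected visit 04-MAR-2010, 7-day window, run on 05-MAR-2010.
window_end, due_days, out_days = due_info(date(2010, 3, 4), 7, date(2010, 3, 5))
print(window_end, due_days, out_days)
print(forecast_period(due_days), forecast_period(31), forecast_period(61))
```

Because the days outstanding value is still negative in this example, the visit stays on the Pages Due Report and is counted in the first forecast period.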

Figure 6. One Example of the 7th Tab in &OutputName._all.xls Pages Due Forecast

SUMMARY REPORT

The last tab in each EXCEL file displays the total pages present, total missing pages, and total pages due for each patient in the study. An example screenshot is below.

Figure 7. One Example of the Last Tab Summary Report
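The three totals on the summary tab can be sketched as simple per-patient counts. The Python below is a hedged illustration of that bookkeeping rather than the macro's implementation: the `summarize` function, its input layout, and the sample records are assumptions chosen for the example.

```python
from collections import Counter


def summarize(received, pending):
    """Per-patient totals of pages present, missing, and due.

    received: iterable of (patient, visit, page) tuples, one per page on file
    pending:  iterable of (patient, n_pages, out_days) tuples, one per patient
              visit that still has expected pages not yet received
    """
    present = Counter(pt for pt, _visit, _page in received)
    missing, due = Counter(), Counter()
    for pt, n_pages, out_days in pending:
        if out_days >= 0:
            missing[pt] += n_pages  # visit window has closed: pages are missing
        else:
            due[pt] += n_pages      # visit window still open: pages are due
    patients = sorted(set(present) | set(missing) | set(due))
    # A Counter returns 0 for absent keys, so every patient gets all three totals.
    return {pt: (present[pt], missing[pt], due[pt]) for pt in patients}


# Hypothetical records for two patients.
received = [("0001", 1, 1), ("0001", 1, 2), ("0002", 1, 1)]
pending = [("0001", 2, 12), ("0002", 2, -3), ("0002", 6, 45)]
summary = summarize(received, pending)
print(summary)
```

Each value is a (present, missing, due) triple, which corresponds to one row of the summary tab.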

CONCLUSION

The MPR macro is a simple tool which can produce CRF page tracking reports for use by data management teams to trace missing CRF pages and manage workflow. The use of such a tool is especially valuable on large clinical trials and can help to ensure that data are collected and cleaned as the study progresses and that project timelines are achieved.

ACKNOWLEDGMENTS

We would like to thank our colleague John Gorden in the Biostatistics Department at PPD who provided comments on drafts of this paper.

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the authors at:

David Gray, Zhuo Chen
PPD
7551 Metro Center Drive, Suite 300
Austin, TX 78744
E-mail: david.gray@ppdi.com, zhuo.chen@ppdi.com

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.