A SAS and Java Application for Reporting Clinical Trial Data
Kevin Kane MSc
Infoworks (Data Handling) Limited
Reporting Clinical Trials Is Resource Intensive!
- Reporting a clinical trial program for a new drug costs on average $4.5M [1]
- Improving the efficiency of this process could save money and free up data exploitation resources in a company
[1] Tufts Centre for Disease Control, Nov 2001
PHArmaceutical STAtistical Reporting
- A software application where the user can generate reports at the press of a button
- Compatible with UNIX and MS Windows
Architecture
[Diagram: Java-based user interface (menus: File, Report, Run, Tools, Help) with Data Interrogator and Report Wizards; SAS program; SAS macros; output file]
Minimal Set-up Before Analysis
- User selects the directory where the data resides
- User chooses the names of a few key variables
SAS Code Generation
- A major problem with attempts to standardise reporting is that many reports have minor, unusual formatting requirements
- phastar is also a SAS code generator
- Macro-free SAS code can be generated for all reports
Auto-Summarise Feature
- phastar can automatically summarise a dataset or a directory of datasets
- The data interrogator attempts to identify datasets with keys and datasets that have counts of events
- Adds the appropriate summary table
- For datasets that can't be automatically summarised, simple summary statistics are generated
Some other features
- All reports can be subject to a where clause to subset the original dataset
- All reports can have a "Page XX of XX" footer added
- Table numbers can be incremented automatically
- Table titles can be edited across tables without entering each report definition
Types of Reports
- Listings
- Summary Tables
- Tables of Event Counts
- General Linear Models
- Non-Parametric Analysis
- Survival Analysis
Example 1. Producing a simple data listing
Formats and Labels
- Select Attributes from the Tools menu
Preview Report
- Preview is one of the options of the Run menu
Example 2. Producing an analysis table
Summary Tables
- Continuous and categorical variables summarised in one table
- Choice of statistics based on those available in PROC UNIVARIATE
- Specify column and BY variables
Event Count Tables
- Specify multi-level variables which hold the event descriptions to be counted
- Counts the number of events, the number of patients or both
- Uses a population dataset
- A where clause can be applied to the event dataset, the population dataset or both
Non-Parametric Analyses
- Similar to the GLM wizard except that the statistics are non-parametric
- Hodges-Lehmann estimators
- Wilcoxon test
- van Elteren test
Survival Analyses
- Log-rank testing
- Estimates of the proportion surviving at specific time points
- Median and other percentile times to event
- Proportional hazards results
- Hazard ratios and confidence intervals
- Test of the proportional hazards assumption
Conclusions
- 60% of reports can be created directly using phastar's wizards
- An additional 15% of reports can be created by generating SAS code and carrying out minor amendments
- Clinicians, health economists and other non-technical staff could report clinical data directly
- phastar can substantially increase the efficiency of reporting clinical trials
For more information: info@phastar.co.uk
A SAS and Java application for analysis and reporting of clinical trial data

Kevin Kane, Infoworks (Data Handling) Limited

Introduction

Reporting the results of a drug trial is normally a labour-intensive process. Each individual report usually requires a separate SAS program to be written. Although the SAS Macro Language can be used to help reduce the programming effort, this approach can be too complex for many statisticians and can still require a significant amount of programming. This paper gives details of an application that allows reports to be generated through a simple point-and-click interface.

The PHASTAR system

The aim of the PHASTAR project (Pharmaceutical Statistical Reporting) was to develop an easy-to-use computer application in which a statistician could create typical tables and listings using a simple point-and-click interface. It was desirable that the system be compatible with at least the Microsoft Windows and UNIX operating systems. A design that combines powerful SAS macros for the data manipulation, analysis and reporting with a Java user interface was chosen as the most likely to achieve these goals.

PHASTAR is designed to work with minimal overhead in terms of study set-up. When a user starts work on a study by selecting the directory where the data resides, a SAS program is launched to interrogate the directory and send information back to the Java environment, such as the contents of the SAS datasets in the directory, the names of important variables, and the levels and labels of the treatment variable. An intelligent algorithm is used to predetermine which variables are most likely to be the important ones, such as the variables that hold the subject, treatment and visit identifiers.

Once the Java environment has been passed all the information required, the user can add, edit and manage the production of tables and listings. All reports for a study are listed in the main view of the application.
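
The paper does not say how the key variables are identified during data interrogation. Purely as an illustration, one plausible approach is to match variable names against common naming fragments. The sketch below assumes this name-matching heuristic; the class `KeyVariableGuesser`, its `guess` method and the fragment lists are all hypothetical and are not taken from PHASTAR.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: guess which dataset variables hold the subject,
// treatment and visit identifiers by matching common naming fragments.
// The fragments are illustrative assumptions, not PHASTAR's actual rules.
class KeyVariableGuesser {

    // Role -> name fragments, listed in priority order.
    private static final Map<String, List<String>> HINTS = new LinkedHashMap<>();
    static {
        HINTS.put("subject", List.of("subj", "patient", "pat"));
        HINTS.put("treatment", List.of("trt", "treat", "arm"));
        HINTS.put("visit", List.of("visit", "vis"));
    }

    // Returns the first variable whose lower-cased name contains a fragment
    // for the given role, trying higher-priority fragments first.
    static Optional<String> guess(String role, List<String> variableNames) {
        for (String hint : HINTS.getOrDefault(role, List.of())) {
            for (String name : variableNames) {
                if (name.toLowerCase(Locale.ROOT).contains(hint)) {
                    return Optional.of(name);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Variable names as they might be returned by the SAS interrogation step.
        List<String> vars = List.of("STUDYID", "SUBJID", "TRTGRP", "VISITNUM", "AGE");
        for (String role : HINTS.keySet()) {
            System.out.println(role + " -> " + guess(role, vars).orElse("(none found)"));
        }
    }
}
```

A substring match of this kind can produce false positives, which is consistent with the slides' note that the user still chooses the names of a few key variables during set-up.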
Selecting an option to add a report from the menu brings up the appropriate wizard to guide the user through the process of producing the report type they have selected. The SAS macros that work in the background to carry out the report production have been written to be as generic as possible. Currently, there are macros for listings, summary tables, tables of event counts (such as adverse events or concomitant medication use), analysis of normally distributed data, non-parametric analysis and survival analysis. Addition of further macros is ongoing.
As an example of the flexibility of the macros, consider one of the simplest: the macro that is used to produce data listings. The macro takes as parameters a list of the key (or order) variables and a list of the ordinary display variables. With no other parameters, the macro produces a data listing using all the variables, ordered according to the key variable list.

A common problem in the production of data listings is how to fit all the data onto one page (especially since the FDA has requested that all reports be in 12-point type). The macro has two features to help with this problem: long text variables can be wrapped within a report column, and variables can be stacked one on top of another. Other formatting options are available, such as the ability to create grouped headings, insert a blank line between groups of observations, and adjust the space between columns. There is also a facility to add a "Page XX of XX" footer to each page of the report. This macro should be able to produce approximately 90% of data listings.

If a user wishes to add a new listing definition to the study, they select an option from the main menu. The New Listing Wizard appears to guide the user through the listing creation process. First, they enter basic information, such as the table number and title. The dataset to be used for the listing is selected from a pull-down list, and the name of the output file is entered. After the user clicks the Next button, the next two screens allow the listing's key and display variables to be selected from a list. These screens also let the user add, delete and reorder the variables. The next screen allows the user to wrap variables or create stacked variable groups. The last screen allows the creation of grouped headings and the choice of whether to add the "Page XX of XX" footer. Pressing the Finish button then completes the whole listing definition.
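
A listing definition of this kind can be thought of as a small data structure: a dataset, the key variables, the display variables and the variables to wrap. The following is a minimal sketch of how such a definition might be turned into plain SAS code, assuming PROC REPORT is used for the listing. The class `ListingDefinition` and its `toSas` method are hypothetical names introduced for illustration; the paper does not show the SAS code that PHASTAR actually produces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a listing definition and the plain SAS code it
// could be rendered to. Not PHASTAR's implementation.
class ListingDefinition {
    private final String dataset;
    private final List<String> keyVariables;      // ordering variables
    private final List<String> displayVariables;  // ordinary display variables
    private final Set<String> wrappedVariables;   // long text variables to wrap
    private final int wrapWidth;                  // column width for wrapped text

    ListingDefinition(String dataset, List<String> keyVariables,
                      List<String> displayVariables,
                      Set<String> wrappedVariables, int wrapWidth) {
        this.dataset = dataset;
        this.keyVariables = List.copyOf(keyVariables);
        this.displayVariables = List.copyOf(displayVariables);
        this.wrappedVariables = Set.copyOf(wrappedVariables);
        this.wrapWidth = wrapWidth;
    }

    // Renders the definition as a PROC REPORT step: key variables become
    // ORDER columns, the rest DISPLAY columns, with FLOW to wrap long text.
    String toSas() {
        List<String> columns = new ArrayList<>(keyVariables);
        columns.addAll(displayVariables);

        StringBuilder sas = new StringBuilder();
        sas.append("proc report data=").append(dataset).append(" nowd headline;\n");
        sas.append("  column ").append(String.join(" ", columns)).append(";\n");
        for (String key : keyVariables) {
            sas.append("  define ").append(key).append(" / order;\n");
        }
        for (String var : displayVariables) {
            sas.append("  define ").append(var).append(" / display");
            if (wrappedVariables.contains(var)) {
                sas.append(" width=").append(wrapWidth).append(" flow");
            }
            sas.append(";\n");
        }
        sas.append("run;\n");
        return sas.toString();
    }

    public static void main(String[] args) {
        // Illustrative dataset and variable names only.
        ListingDefinition listing = new ListingDefinition(
                "ae", List.of("subjid", "visit"), List.of("aeterm"),
                Set.of("aeterm"), 30);
        System.out.print(listing.toSas());
    }
}
```

Stacked variables, grouped headings and the page footer are left out to keep the sketch short; each would add further statements to the generated step.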
The listing can then be previewed or executed to create an output file.

One main criticism of many systems that use generic programs to create output is that the user often wants to make a simple change, usually to formatting, that is not catered for in the generic macro design. PHASTAR has a solution: all of the macros that are used for reporting can generate macro-free SAS code that produces the same output as running the macro directly. This automatically created SAS code can then be used as a basis for the report. This process does not require the user to have any advanced SAS macro knowledge.

PHASTAR also has a facility for managing the table numbers, titles and footnotes for the tables that are to be generated by the application. Table numbers in different formats can be incremented automatically, and table titles and footnotes can be edited across tables without having to edit every individual report definition.
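
Automatic incrementing of table numbers in different formats can be sketched as follows. This is an assumption about one way to do it, namely incrementing the last run of digits in the table number; the class `TableNumbers` and its `next` method are hypothetical and do not come from PHASTAR.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: derive the next table number by incrementing the
// last numeric component, so that "14.2.1" becomes "14.2.2".
class TableNumbers {

    // The last run of digits, followed by any trailing non-digit suffix.
    private static final Pattern LAST_NUMBER = Pattern.compile("(\\d+)(\\D*)$");

    static String next(String tableNumber) {
        Matcher m = LAST_NUMBER.matcher(tableNumber);
        if (!m.find()) {
            throw new IllegalArgumentException(
                    "No numeric component in: " + tableNumber);
        }
        // Note: leading zeros in the last component are not preserved.
        long incremented = Long.parseLong(m.group(1)) + 1;
        return tableNumber.substring(0, m.start(1)) + incremented + m.group(2);
    }

    public static void main(String[] args) {
        System.out.println(next("14.2.1"));
        System.out.println(next("Table 9"));
    }
}
```

Because only the final numeric component changes, the same method works for dotted section-style numbers and for simple sequential numbers.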
Although the listing example may seem relatively trivial (it can be simple to write a SAS program to produce a listing), the concept is powerful when extended. The ability to create complex tables of results from general linear modelling at the touch of a button is an efficiency increase of much greater than 100%. It is estimated that at present PHASTAR can directly produce 60% of all clinical trial data reports. In addition, another 15% of reports could be produced by generating the SAS code and making minimal changes to the generated program.

Another possibility raised by the introduction of PHASTAR is its use by clinicians, health economists and other non-statistical personnel involved in the clinical trial process. Because of PHASTAR's simplicity, the technical barrier to the non-statistician is removed, allowing them not only to generate hypotheses but also to test them.

In future releases, it is anticipated that PHASTAR will have a graphical component, producing not only graphs for clinical trial reports but also graphical output to help in the analysis, e.g. diagnostics when carrying out general linear modelling. It is also anticipated that greater advantage will be taken of SAS's Output Delivery System, allowing output in formats such as PDF and HTML to be produced directly.

In fact, data investigation in general is a much simpler process using PHASTAR. For example, analyses can usually be run on subgroups by choosing a BY variable from a list and pressing a button.

It is not anticipated that the production of all the tables, listings and graphics for a clinical trial report could be automated. It seems likely that a proportion of reports will be specialised and will require a significant amount of programming, either for data manipulation or for analysis.
However, freeing up time from routine tasks allows the statistician or programmer to spend more time on more complex and interesting tasks.

When considering the implications for cost and resources in a clinical trial program, the improvement in efficiency has the potential to have a dramatic impact. According to a SAS report [1], a typical clinical trial program for a new drug costs $4.5M in analysis and reporting activities. Even if only a small percentage of this can be saved, the savings to a medium-to-large pharmaceutical company could be substantial. The most important impact, however, may come from allowing a company's statistical resources to be reallocated from routine tasks, enabling statisticians to have a more strategic influence within the company.

For more information, please contact:
Kevin Kane
Infoworks (Data Handling) Limited
6 Arlington Gardens, Chiswick, London, W4 4EY
+44 (0) 20 8987 8708
k_kane@msn.com

References

1. Return on Investment Assessment for the Biomedical Knowledge Platform. SAS Website.