Statistics and Data Analysis. Common Pitfalls in SAS Statistical Analysis Macros in a Mass Production Environment

Huei-Ling Chen, Merck & Co., Inc., Rahway, NJ
Aiming Yang, Merck & Co., Inc., Rahway, NJ

ABSTRACT

Four pitfalls are commonly encountered in the statistical analysis of clinical trials. The first can occur in a cross-over study when McNemar's test is used to estimate the significance level of treatment group differences for a binary outcome. The second concerns the response ordering in PROC LOGISTIC. The third involves PROC GLM and its handling of character CLASS variables. The fourth relates to non-estimable least-squares means in PROC GLM. In a mass production environment for clinical trial programming, we need to watch for these pitfalls when recycling, reusing, or creating macros. The four pitfalls are described in detail, and recommendations and SAS coding solutions are provided.

KEYWORDS

Clinical Trial, Cross-Over Study, McNemar's Test, PROC LOGISTIC, PROC GLM

INTRODUCTION

In clinical trials, statistical programmers usually have to deliver a large volume of statistical analysis results on tight timelines, and very often a programmer reuses existing macros. In this type of mass production environment, it is important that the macros are robust and generate correct results. This paper lists several undesirable scenarios and provides methods to prevent them.

For example, in a cross-over study, McNemar's test is used to estimate the significance level between treatments for a binary outcome. The test is usually carried out on a 2 x 2 frequency table. For relatively rare events, the table can collapse to a single row or a single column, in which case a warning message appears in the log. To prevent this warning, a small screening step can be added within the macro to decide beforehand whether the method should be applied. The second example is the proper use of the DESCENDING option in PROC LOGISTIC. The third example is that a character treatment group variable can sometimes cause a wrong estimate in a CONTRAST or ESTIMATE statement in PROC GLM. The last example recommends using LSMeanDiffCL in PROC GLM to avoid a potential problem caused by multicollinear data. It is important that every programmer examine existing programs carefully and make an effort to enhance the robustness of new macros.

EXAMPLE 1: MCNEMAR'S TEST IN PROC FREQ

McNemar's test is often used to estimate the significance level between treatments for a binary outcome in a 2 x 2 crossover design. The 2-way adverse event (AE) table looks like the following:

                                   Treatment A (AE1)
                                   With AE    Without AE
    Treatment B (AE2)  With AE        a           b
                       Without AE     c           d

McNemar's test statistic uses only the two off-diagonal (discordant) cells:

    $$Q_M = \frac{(b - c)^2}{b + c}$$

Under the null hypothesis, Q_M approximately follows a chi-square distribution with 1 degree of freedom. The following code carries out the test.

    proc freq data=temp;
        weight count;
        table AE1*AE2 / agree;
        output out=mcnemar(keep=_MCNEM_ P_MCNEM) agree;
    run;

Pitfall

In clinical trials, safety analysis is a must. The clinician and statistician would like to know whether an adverse event is more likely to occur in one treatment arm than in another. Because the response variable is the occurrence of an adverse event, and such events are relatively rare, it is quite possible for the 2 x 2 table to contain cells with zero counts.

For example, in the following pseudo data, 100 patients are divided into two groups: 50 patients receive treatment A in the first period and switch to treatment B in the second period, and the other 50 patients receive treatment B in the first period and switch to treatment A in the second period. Assume that 98 patients do not have any adverse event in either period, and that two patients have an adverse event while on treatment A. There is therefore no adverse event on treatment B. The 2-way table looks like the following:

                                   Treatment A
                                   With AE    Without AE
    Treatment B        With AE        0           0
                       Without AE     2          98

PROC FREQ cannot carry out the analysis if either variable in the 2 x 2 table has fewer than 2 non-missing levels. A note in the log indicates that the data are not suitable for the test. In addition, if the PROC FREQ step contains an OUTPUT statement, SAS does not create the output data set and issues a warning message.

    NOTE: No statistics are computed for ae1 * ae2 since ae2 has less than 2 nonmissing levels.
    WARNING: No OUTPUT data set is produced because no statistics can be computed for this table, which has a row or column variable with less than 2 nonmissing levels.

As a second example, take the same design: 100 patients in two sequence groups of 50, of whom 98 do not have any adverse event in either period. This time, two patients have the adverse event in both the treatment A period and the treatment B period, so no patient has the adverse event in only one treatment period. The 2-way table looks like the following:

                                   Treatment A
                                   With AE    Without AE
    Treatment B        With AE        2           0
                       Without AE     0          98

McNemar's test cannot be carried out when both off-diagonal cells of the 2 x 2 table have zero frequencies.
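A short worked calculation may help show why the second table defeats the test. This is a sketch that assumes the usual uncorrected form of McNemar's statistic, with b and c denoting the two off-diagonal (discordant) counts in the a, b, c, d layout introduced earlier (rows for treatment B, columns for treatment A):

```latex
\begin{align*}
Q_M &= \frac{(b - c)^2}{b + c} \\[4pt]
\text{First table } (b = 0,\ c = 2):\quad
Q_M &= \frac{(0 - 2)^2}{0 + 2} = 2 \\[4pt]
\text{Second table } (b = 0,\ c = 0):\quad
Q_M &= \frac{(0 - 0)^2}{0 + 0} = \frac{0}{0} \quad \text{(undefined)}
\end{align*}
```

For the first table the formula can be evaluated by hand, but PROC FREQ never reaches it, because the table has collapsed to a single level of one variable. For the second table the table is a proper 2 x 2, yet there are no discordant pairs and the statistic itself is undefined.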
If the NOPRINT option is specified, a note in the log indicates that there are no discordant pairs with which to compute McNemar's test.

    NOTE: There are no discordant pairs when computing McNemar's test, for the table of ae1 by ae2.

In this case, however, SAS still creates the output data set requested by the OUTPUT statement, although the statistics in it are missing.

Suggested Solution

These two examples show that a warning message is not necessarily produced when a test cannot be carried out. If the data result in a single row or single column in the 2-way table, there is a warning message. If the data satisfy the 2 x 2 layout but have zero frequencies in both off-diagonal cells, there is no warning message. A common practice for presenting the statistics when a test cannot be carried out is simply to put 'N/A' in the table output. For the first example, this paper provides the following technique to avoid the warning message.

    proc sql noprint;
        /* Row and column totals; COALESCE returns 0 when a level is absent from TEMP */
        select coalesce(sum(count), 0) into :countr0 from temp where AE1=0;
        select coalesce(sum(count), 0) into :countr1 from temp where AE1=1;
        select coalesce(sum(count), 0) into :countc0 from temp where AE2=0;
        select coalesce(sum(count), 0) into :countc1 from temp where AE2=1;
    quit;

    %if (&countr0=0 or &countr1=0 or &countc0=0 or &countc1=0) %then %do;
        /* Single row or single column: skip PROC FREQ and return missing statistics */
        data McNemar;
            _MCNEM_ = .;
            P_MCNEM = .;
        run;
    %end;
    %else %do;
        proc freq data=temp noprint;
            weight count;
            table AE1*AE2 / agree;
            output out=mcnemar(keep=_MCNEM_ P_MCNEM) agree;
        run;
    %end;

For the second example, there is no warning message. Programmers and statisticians should carefully check the SAS output or, if NOPRINT is specified, check the log for the note about "no discordant pairs".

EXAMPLE 2: PROC LOGISTIC / DESCENDING

In clinical trials and health care research, a binary variable is often the key result of interest. In a clinical trial, for example, the binary result could be whether or not an adverse event occurred. In health care research, the binary result could be whether a patient enters a long-term care facility after being discharged from the hospital. PROC LOGISTIC is frequently used to estimate the impact of various factors on the probability of having this event.

Before using the procedure, it is important to know how the binary dependent variable is coded. By default, PROC LOGISTIC models the probability of the lowest ordered response value. The DESCENDING option should therefore be added to the PROC LOGISTIC statement if the binary dependent variable is coded 1 (YES) vs. 0 (NO). The code is as follows:

    proc logistic data=raw outest=_outest descending;
        class trt;
        model resp2 = trt age;
    run;

On the other hand, the DESCENDING option should not be added if the binary dependent variable is coded 1 (YES) vs. 2 (NO). If one does not know the data and uses the procedure blindly, the result might be interpreted in the wrong direction, because the sign of each coefficient means the opposite of what is assumed. Three examples compare the results with and without DESCENDING.

Pitfall

One pseudo data set is created to present the issue.

Example 1: Correct approach. Binary value 1 (YES) vs. 0 (NO), with the DESCENDING option

                                    Standard    Wald
    Parameter    DF    Estimate     Error       Chi-Square    Pr > ChiSq
    Intercept
    trt
    trt
    trt
    age

Example 2: Correct approach. Binary value 1 (YES) vs. 2 (NO), without the DESCENDING option

                                    Standard    Wald
    Parameter    DF    Estimate     Error       Chi-Square    Pr > ChiSq
    Intercept
    trt
    trt
    trt
    age

Example 3: Incorrect approach. Binary value 0 (NO) vs. 1 (YES), without the DESCENDING option

                                    Standard    Wald
    Parameter    DF    Estimate     Error       Chi-Square    Pr > ChiSq
    Intercept
    trt
    trt
    trt
    age

The first two outputs have exactly the same results because they estimate the same thing: the probability of a binary response of YES. In the third output, the signs of the estimates are the opposite of what is expected, because this incorrect approach estimates the probability of a binary response of NO.

Suggested Solution

Many SAS/STAT users probably know that the DESCENDING option should be added to PROC LOGISTIC when the binary response variable takes the values 1 and 0. However, people sometimes use a macro borrowed from another study without first checking whether the data are appropriate for it. To avoid this type of mistake, an ideal macro should alert the user to consider the DESCENDING option in PROC LOGISTIC and so prevent a wrong interpretation. One way to avoid misuse of the procedure is bulletproofing, that is, having the macro validate its input before running the model. See below:

    proc sql noprint;
        select distinct(&var) into :resp separated by ' ' from raw;
    quit;

    %if %scan(&resp,1) ne 0 or %scan(&resp,2) ne 1 or
        %length(%sysfunc(compress(&resp))) ne 2 %then %do;
        %put *** the response variable is not 0 / 1 ***;
    %end;

There are many possible ways to act on this check, and how to proceed is up to the macro developer's preference. One possible approach is to stop executing the macro. Another approach is to drop the DESCENDING option from the PROC LOGISTIC statement. If the binary response variable is coded 1 (YES) vs. 2 (NO), converting 2 to 0 is also an appropriate solution.
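The first of these approaches, stopping the macro, could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' production macro: the macro name run_logistic, its parameters, and the message texts are hypothetical, while the model (trt and age as covariates) is taken from the example above.

```sas
/* Sketch only: macro name, parameters, and message text are hypothetical. */
%macro run_logistic(data=raw, var=resp2);
    %local resp nlev;

    /* Distinct non-missing response values, in ascending order */
    proc sql noprint;
        select distinct &var into :resp separated by ' '
            from &data
            where not missing(&var)
            order by &var;
    quit;

    /* No non-missing response at all: nothing to model */
    %if %length(&resp) = 0 %then %do;
        %put WARNING: &var has no non-missing values. PROC LOGISTIC was not run.;
        %return;
    %end;

    /* Count the blank-separated values collected above */
    %let nlev = %sysfunc(countw(&resp, %str( )));

    /* Stop unless the response has exactly two levels */
    %if &nlev ne 2 %then %do;
        %put WARNING: &var has &nlev distinct values, expected 2. PROC LOGISTIC was not run.;
        %return;
    %end;

    /* Stop unless those two levels are 0 and 1 */
    %if %scan(&resp, 1, %str( )) ne 0 or %scan(&resp, 2, %str( )) ne 1 %then %do;
        %put WARNING: &var is coded &resp, not 0 / 1. PROC LOGISTIC was not run.;
        %return;
    %end;

    /* Response is coded 0 / 1: DESCENDING makes the procedure model P(&var = 1) */
    proc logistic data=&data descending;
        class trt;
        model &var = trt age;
    run;
%mend run_logistic;

%run_logistic(data=raw, var=resp2);
```

Writing the message with a WARNING: prefix is a design choice: it makes the skipped model visible to the same log checks that a mass production process already runs.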
EXAMPLE 3: PROC GLM (WITH CHARACTER TREATMENT GROUP)

In PROC GLM or PROC MIXED, the estimated coefficient vector β measures the impact of the independent variables on the dependent variable. To test a null hypothesis about the estimated coefficients, a CONTRAST statement can be added to perform a custom hypothesis test by specifying an L vector or matrix for testing the hypothesis L * β = 0. For example, say there are two treatment groups in a clinical trial, and the null hypothesis is that there is no difference between treatments in the percentage change from baseline in a lab test. In the CONTRAST statement, the L vector should be specified as follows:

    proc glm data=raw outstat=_outstat;
        class trt;
        model pchg = trt / ss3;
        contrast 'trtdiff' trt 1 -1;
    run;
    quit;

The vector specified in the CONTRAST statement must align with the ordering of the CLASS variable. Assuming the variable TRT has values 1 and 2, and the coefficients of these two treatments are β1 and β2, the CONTRAST statement above tests β1 - β2 = 0. Without knowing the correct ordering of the CLASS variable, the test statistic might be calculated for the wrong null hypothesis.

Pitfall

Dose-ranging studies are common in phase II clinical trials. If the study is looking for the right doses of a single primary therapy drug as well as of a co-administered drug, it is possible to have 10 or more treatment arms. Whether the treatment group variable is character or numeric then makes a difference. By default, the levels of CLASS variables are sorted with ORDER=FORMATTED. For unformatted numeric variables, the levels are ordered by their numeric value. For character variables, however, the sorting order follows the collating sequence (shown here for ASCII):

    blank ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @
    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ `
    a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~

Based on this rule, suppose the treatment group variable is a character variable of length 2 bytes with the following left-justified values: "1 ", "2 ", "3 ", "4 ", "5 ", "6 ", "7 ", "8 ", "9 ", "10". The ordering will not be that of a numeric variable:

    1, 2, 3, 4, 5, 6, 7, 8, 9, 10

Instead, it will be the following:

    "1 ", "10", "2 ", "3 ", "4 ", "5 ", "6 ", "7 ", "8 ", "9 "

On the other hand, if the treatment group variable is a character variable of length 2 bytes with the following right-justified values: " 1", " 2", " 3", " 4", " 5", " 6", " 7", " 8", " 9", "10", the ordering will still be what people usually expect:

    " 1", " 2", " 3", " 4", " 5", " 6", " 7", " 8", " 9", "10"

Suggested Solution

To see the ordering of the CLASS variable, add the E option to the CONTRAST statement. This displays the entire L vector and is therefore useful for confirming the ordering of the parameters when specifying L.
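The sorting behavior described under Pitfall can also be checked directly on the data before any contrast coefficients are written. The following is a minimal sketch on made-up data: the data set name demo is hypothetical, and it assumes ten arms stored left-justified in a 2-byte character variable, as in the example above.

```sas
/* Hypothetical demo data: ten treatment arms stored left-justified in a $2 variable */
data demo;
    length trta $2;
    do i = 1 to 10;
        trta = left(put(i, 2.));   /* "1 ", "2 ", ..., "9 ", "10" */
        output;
    end;
    drop i;
run;

/* PROC FREQ lists the levels in sorted (internal) order. For an unformatted  */
/* character variable this is the same collating-sequence order that a CLASS  */
/* statement uses by default.                                                 */
proc freq data=demo;
    tables trta / nocum nopercent;
run;
```

With these values the level "10" is expected to be listed directly after "1", which is the position it would also take in the L vector.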
Here is the example code and output with the E option.

    proc glm data=raw outstat=_outstat;
        class trta;
        model pchg = trta / ss3;
        contrast 'TRENDPB_5' trta -2 -1 0 1 2 0 0 0 0 0 / e;
    run;
    quit;

    The GLM Procedure

    Coefficients for Contrast TRENDPB_5

                      Row 1
    Intercept           0
    trta   1           -2
    trta   2           -1
    trta   3            0
    trta   4            1
    trta   5            2
    trta   6            0
    trta   7            0
    trta   8            0
    trta   9            0
    trta   10           0

    Dependent Variable: pchg

                              Sum of
    Source             DF     Squares    Mean Square    F Value    Pr > F
    Model
    Error
    Corrected Total

    R-Square    Coeff Var    Root MSE    pchg Mean

    Source             DF     Type III SS    Mean Square    F Value    Pr > F
    trta

    Contrast           DF     Contrast SS    Mean Square    F Value    Pr > F
    TRENDPB_5

We recommend storing treatment group variables as numeric, because this avoids the possible misuse of the CONTRAST statement described above.

EXAMPLE 4: PROC GLM (NON ESTIMABLE)

In a clinical trial, efficacy and/or safety are always the main interest. The drug effect can be measured as a categorical variable or as a continuous variable. To estimate the treatment difference on a continuous variable, the ANOVA or ANCOVA model is a popular method, and in SAS, PROC GLM or PROC MIXED is very often used for this purpose. When the data have no multicollinearity issues, there is more than one way to calculate a treatment difference and its confidence interval. One way is to use the LSMEANS statement and ODS to generate the output table LSMeanDiffCL. Another way is to use the ESTIMATE statement.

Here is an example in which the ODS OUTPUT statement with LSMeanDiffCL and the LSMEANS statement are used together:

    ods output LSMeanDiffCL=lsmdifci;
    proc glm data=lipids outstat=glmout;
        class trt site region;
        model ldl = trt site region / e1 ss1 ss3;
        lsmeans trt / pdiff cl;
    run;
    quit;

The output is here:

    Obs    Effect    Dependent    i    j    LowerCL    Difference    UpperCL
     1     trt       ldl

Some statisticians and programmers use the ESTIMATE statement to calculate a confidence interval for a treatment difference.

    title 'GLM MODEL FOR TABLE CREATION';
    ods output Estimates=estimate;
    proc glm data=lipids outstat=glmout;
        class trt site region;
        model ldl = trt site region / e1 ss1 ss3;
        estimate 'diff2_1' trt -1 1;
    run;
    quit;

    /* Error degrees of freedom and root mean square error from the OUTSTAT= data set */
    data rmse;
        set glmout (where=(_source_ eq 'ERROR') keep=_source_ DF SS);
        t = tinv(0.975, DF);
        rmse = sqrt(ss/df);
        byid = 1;
        call symput ('rmse', trim(left(put(rmse, 8.2))));
    run;

    data estimate;
        set estimate;
        byid = 1;
    run;

    /* 95% limits; the signs are flipped so that the difference reads trt 1 minus trt 2 */
    data part2;
        merge estimate rmse;
        by byid;
        uci = -(estimate - tinv(.975,df)*stderr);
        lci = -(estimate + tinv(.975,df)*stderr);
    run;

This approach generates exactly the same confidence interval as the first one, as long as the LS-means are estimable.

    Obs    Dependent    Parameter    Estimate    uci    lci
     1     ldl          diff2_1

Pitfall

Each LS-mean is computed as L * β, where L is the coefficient matrix associated with the least-squares mean and β is the estimate of the fixed-effects parameter vector. For the LSMEANS statement, L is tested for estimability, and if this test fails, PROC GLM displays "Non-est" for the LS-means entries. This can happen when the chosen independent variables are highly correlated, in which case the model should be corrected. In the example here, the least-squares means are non-estimable, and hence using the ODS table LSMeanDiffCL together with LSMEANS to retrieve the confidence interval generates the following warning message:

    WARNING: Output 'LSMeanDIFFCL' was not created

However, if the second approach (the ESTIMATE statement) is used here, no warning message shows up, even though the output listing contains the following:

    The GLM Procedure
    Least Squares Means

    trt    ldl LSMEAN
    1      Non-est
    2      Non-est

    trt    ldl LSMEAN    95% Confidence Limits

Unless the SAS output is checked, this approach still generates a confidence interval, but that interval is not valid.

    Obs    Dependent    Parameter    Estimate    uci    lci
     1     ldl          diff2_1

In a mass production environment, some statisticians and programmers may only check whether the log contains "WARNING" or "ERROR" messages. In this scenario, an inappropriate model such as the one in the example might go unnoticed.

Suggested Solution

People should not rely only on checking whether there are "WARNING" or "ERROR" messages in the log. They should read the output listings to see whether the model is appropriate for the data. Using the ODS table LSMeanDiffCL together with LSMEANS to retrieve the confidence interval is recommended, because it exposes non-estimable least-squares means through a warning. To be safe, programmers should check not only the log file but also the output listing file for any "Non-est" message.

CONCLUSION

When generating large numbers of analysis reports, sometimes hundreds or thousands, the common practice is to save the output to PDF, RTF, or Word documents. Before the tables are delivered to the reviewer, several quality checks should be done. First, the front-line programmers and statisticians should review the SAS logs and SAS output listings. Second, the manager should review the tables. On a tight timeline, however, programmers and statisticians may end up checking only for warning or error messages in the log file, and a manager who reviews the results might look only at the numbers in the final tables. In this kind of quality checking chain, checking the SAS output is often skipped, yet that missing step may actually be the most essential part.

The first scenario illustrates how an expected warning message can be avoided. The second and third show how a model can be used incorrectly even though there is no warning or error message in the log. The last scenario demonstrates that, although there is more than one way in SAS to produce a result, the methods are not equivalent in practice: the LSMEANS approach detects the data problem and alerts the user that there is an issue with the model, while the ESTIMATE approach does not.

REFERENCES

Wuwei Wayne Feng and Dong Ding. SAS Application in 2 * 2 Crossover Clinical Trial. In Proceedings of the Pharmaceutical SAS Users Group Conference (PharmaSUG 2004).

John Troxell. Bulletproofing and Knowledge Encapsulation in Statistical Macros. In Proceedings of the Pharmaceutical SAS Users Group Conference (PharmaSUG 2002).

ACKNOWLEDGMENTS

The authors would like to thank John Troxell and Beilei Xu of Merck Research Laboratories for their advice on this paper and presentation.

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the authors at:

    Huei-Ling Chen
    Merck & Co., Inc.
    126 Lincoln Avenue
    P.O. Box 2000
    Rahway, NJ

    Aiming Yang
    Merck & Co., Inc.
    126 Lincoln Avenue
    P.O. Box 2000
    Rahway, NJ

TRADEMARK

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.
