Assessing superiority/futility in a clinical trial: from multiplicity to simplicity with SAS


PharmaSUG2010 Paper SP10

Assessing superiority/futility in a clinical trial: from multiplicity to simplicity with SAS

Phil d'Almada, Duke Clinical Research Institute (DCRI), Durham, NC
Laura Aberle, Duke Clinical Research Institute (DCRI), Durham, NC

Abstract

SAS macroprocessing was implemented to deal with a simple question that proved to be far more extensive when pursued to completeness: given the sample size, what proportion of successful events would need to be observed to consider a study drug superior, or the study futile, at some given significance level? Using the procedure that was developed, the consultant was able to provide the client not just with a variety of scenarios by which to formulate clinical trial judgements, but with an exhaustive and complete review based upon the initial request. With this procedure, furthermore, the client was able to gain a sense of the enormity of a proper assessment of the request without the use of SAS, and within a reasonable amount of time. In addition, upon increasing one single parameter in the client request, the exponentially escalating scale of affordable information seemed to provide the client with a sense of satisfaction about the particular pursuit, since no further requests were submitted.

Introduction

During the initial stages of a clinical trial at the DCRI, the trial Sponsor became concerned with assessing the superiority/futility of the study drug under various scenarios. A hypothetical number of subjects per treatment arm was proposed, and a number of proportions were to be tested, with the attained significance levels to be inspected. Initially, the approach was manual: a sampling of proportions based on the given sample size was tested and submitted to the Sponsor. Since the effort was not exhaustive, and could not be within a reasonable amount of time, the Sponsor was challenged to be satisfied with the parsimonious nature of the initial response.
This was evidenced by further inquiry from the Sponsor to assess even more scenarios, and initial efforts were made by the DCRI to programmatically test a broader range of proportions. In concert with the escalating requests, the programming became more challenging, since the need was to address these requests within a favourable response time, and in a manner that would return a complete analysis, showing statistical test results, based on the initial request from the Sponsor. The purpose of this paper is to demonstrate the power of SAS macroprocessing to provide such a response, and indeed an answer to the question, "Is this drug still worth the time and resources to test?"

Method

The simple case

The initial scenario requested by the Sponsor was to assess a sample size of 30 subjects in each treatment group: one group being administered the study drug and the other, a placebo. Consequently, the initial programming approach was to draw some realistic proportions of successes under each treatment arm and test for association. Thus, data sets were generated that simulated the number (wt) of remissions (p = 1) under the presence of placebo (t = 0) and under the administration of study drug (t = 1). One SAS data step scenario would be represented as:

data test1;
  p = 0; t = 0; wt = 25; output;
  p = 0; t = 1; wt = 10; output;
  p = 1; t = 0; wt = 5;  output;
  p = 1; t = 1; wt = 20; output;
run;

The SAS code for the subsequent test of association would look like the following:

proc freq data=test1;
  weight wt;
  table p * t / chisq;
run;

The CHISQ option in the TABLE statement requests the chi-square test of association. A series of SAS data steps would then be constructed to simulate various proportions, followed by tests of association in corresponding SAS procedure steps, until some informative test outcome(s) was/were achieved. Clearly, this could be a labour-intensive task, since there were, at minimum, 30 x 30 possible combinations of weights, that is, 900 2 x 2 tables and 900 tests of association to be simulated in 900 data steps and 900 procedure steps. Thus, given that the maximum number of subjects per treatment arm was fixed, the most logical and least labour-intensive tool to accomplish all objectives was SAS macroprocessing.

Code upgrade for labour downgrade, stage 1: data generation

First, the correct maximum number of simulations was verified as 31 x 31, to accommodate the scenarios of no remissions under either treatment arm and total remissions under at least one treatment arm; thus, there would actually be 961 simulations. Further, the simulated weights could be conceived as a 31 x 31 grand matrix composed of elements that are each 2 x 2 matrices, where any such 2 x 2 component matrix is indexed by the row-column coordinates (i, j) of the grand matrix, with i = 1, ..., 31 and j = 1, ..., 31. Next, an algorithm was determined to simulate the series of changes in the four cell counts, or weights, beginning with the scenario of no remissions in both treatment arms, ending with the scenario of total remissions in both treatment arms, and including the two scenarios of total remissions in one treatment arm and none in the other. The following code represents the modified data step: a macro program using two macro variables, row and col, that define the row-column position of any 2 x 2 component matrix in the 31 x 31 grand matrix.
Any weight (wt) is, therefore, necessarily a function of the position coordinates of the corresponding 2 x 2 component matrix.

%do row = 1 %to 31;
  %do col = 1 %to 31;

    data ds&row._&col;
      p = 0; t = 0; wt = &row - 1; output;
      p = 0; t = 1; wt = 30 - (&row - 1); output;
      p = 1; t = 0; wt = &col - 1; output;
      p = 1; t = 1; wt = 30 - (&col - 1); output;
    run;

To identify each simulated data set corresponding to a 2 x 2 component matrix, the macro variables row and col that drive the cell weight computations were used in the data set nomenclature illustrated above. The corresponding procedure steps for the tests of association needed to be modified only in the data set name:

proc freq data=ds&row._&col;
  weight wt;
  table p * t / chisq;
run;

Clearly, 961 statistical tests present a formidable task for the review of p-values. However, with the ability now in place to programmatically simulate all possible scenarios based upon 30 subjects in each treatment arm, the ability to assess the outcomes of the statistical tests had to keep pace; otherwise, the labour saved in constructing outcome scenarios would be lost to the labour of reviewing results. Thus, a suitable means of presenting the simulated 2 x 2 tables with their corresponding tests would complete the pursuit of decreasing the labour effort. Furthermore, given the completeness of the simulation, the expected sizes of some cells amongst the 961 sets of 2 x 2 tables would very likely be less than 5. In those cases Fisher's exact test would be the preferred test, and those cases would need to be identified. These features needed to be added and, if possible, automated.

Code upgrade for labour downgrade, stage 2: data presentation

To capture the results of Fisher's exact test, the Output Delivery System (ODS) was implemented. Two output data sets were generated using two ODS output options, chisq and fishersexact, in each implementation of the FREQ procedure step: 961 sets of two output data sets.
Since macroprocessing was already in place, each of these 961 pairs of output data sets was correspondingly indexed as follows:

ods output chisq = CS&row._&col;
ods output fishersexact = fx&row._&col;
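The statistics captured through ODS can be spot-checked outside SAS. The following stdlib-only Python sketch (the function names are ours, not part of the paper's SAS program) recomputes the Pearson chi-square statistic and the three Fisher exact p-values for the grand-matrix cell 1:2, whose simulated weights are 0, 30, 1, 29:

```python
from math import comb

def chisq_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # (observed, row total, column total) for each of the four cells
    obs = [(a, a + b, a + c), (b, a + b, b + d),
           (c, c + d, a + c), (d, c + d, b + d)]
    return sum((o - r * k / n) ** 2 / (r * k / n) for o, r, k in obs)

def fisher_2x2(a, b, c, d):
    """Left-, right-, and two-tailed Fisher exact p-values for [[a, b], [c, d]]."""
    r1, c1, n = a + b, a + c, a + b + c + d
    denom = comb(n, c1)
    # Hypergeometric probability of each table with the same margins,
    # indexed by the value k of the (0,0) cell.
    lo, hi = max(0, c1 - (n - r1)), min(r1, c1)
    probs = {k: comb(r1, k) * comb(n - r1, c1 - k) / denom
             for k in range(lo, hi + 1)}
    left = sum(p for k, p in probs.items() if k <= a)
    right = sum(p for k, p in probs.items() if k >= a)
    # Two-tailed: sum of all table probabilities no larger than the observed one
    two = sum(p for p in probs.values() if p <= probs[a] * (1 + 1e-7))
    return left, right, two
```

For that table the sketch returns a chi-square statistic of 1.0169 and Fisher p-values of 0.5000 (left), 1.0000 (right), and 1.0000 (two-tailed), matching the row for cell 1:2 in Appendix A.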

Further, the FREQ procedure was modified in the TABLE statement with the OUT= option, to generate a procedure output data set containing one record per 2 x 2 cell, and with the OUTEXPECT option, to include the expected cell counts in that output data set, as follows:

proc freq data=ds&row._&col;
  weight wt;
  table p * t / chisq out=sds&row._&col outexpect;
run;

Each procedure output data set containing the cell counts and the expected values was manipulated by the addition of two index variables. One of these, clt5, was used to count the number of expected values less than 5; such expected values appear as values of the default variable EXPECTED. The other index variable, cnzc, was used to count the number of nonzero cells. When, in any one output data set, there were counted one expected value less than 5 and three nonzero cells, or two expected values less than 5 and four nonzero cells, then a macro variable, FEndx, was defined to index the use of Fisher's exact test. Thus:

if not lastrec then do;
  call symput('FEndx',0);
end;
if lastrec then do;
  if clt5 = 1 & cnzc = 3 | clt5 = 2 & cnzc = 4 then do;
    call symput('FEndx',1);
  end;
end;

It should be mentioned that empty cells from the FREQ procedure default to missing in the procedure output data set, so that counting nonmissing values in the procedure output data set is equivalent to counting nonzero cells. There were, therefore, 961 evaluations for expected values and nonzero cells in 961 data sets. The data set variable FE1, representing the index to Fisher's exact test, was derived from the assigned macro variable FEndx described in the preceding paragraph. Thus:

%if &FEndx = 1 %then FE1 = "1";;

Next, the final presentation of results needed to be simplified, and to include the (i, j) index that indicates when Fisher's exact test is to be implemented.
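The counting logic above can be mirrored outside SAS. This Python sketch (a hypothetical helper, not from the paper's program) drops zero cells, as they are absent from the PROC FREQ OUT= data set, counts expected values below 5 among the remaining records, and applies the two trigger conditions:

```python
def fisher_indicated(a, b, c, d):
    """Mimic the paper's trigger for Fisher's exact test on the 2x2 table
    [[a, b], [c, d]]: among the nonzero cells (zero cells do not appear in
    the PROC FREQ OUT= data set), count expected values below 5 (clt5) and
    records (cnzc), then test (clt5=1 and cnzc=3) or (clt5=2 and cnzc=4)."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    cells = [(a, 0, 0), (b, 0, 1), (c, 1, 0), (d, 1, 1)]
    # one (observed, expected) pair per nonzero cell
    nonzero = [(o, rows[i] * cols[j] / n) for o, i, j in cells if o > 0]
    cnzc = len(nonzero)
    clt5 = sum(1 for _, e in nonzero if e < 5)
    return (clt5 == 1 and cnzc == 3) or (clt5 == 2 and cnzc == 4)
```

For the cell-1:2 table (weights 0, 30, 1, 29) the trigger fires, matching FE1 = 1 in Appendix A; for a well-filled table such as 25, 10, 5, 20, where all expected counts exceed 5, it does not.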
The (i, j)-th modified procedure output data set, the (i, j)-th ODS output data set defined by the ODS output option chisq, and the (i, j)-th ODS output data set defined by the ODS output option fishersexact were each complemented with a merge variable defined as the (i, j) cell position of the corresponding 2 x 2 component matrix in the 31 x 31 grand matrix. Such a merge variable, cell, could be defined as

length cell $ 5;
cell = "&row.:&col";

Thus, for each simulated 2 x 2 data set, the two corresponding ODS output data sets were modified and merged with the corresponding procedure output data set to yield a data set of one record. The two data sets derived from the two extreme scenarios of no remissions and total remissions were excluded, leaving 959 pairs of data set modifications and 959 merges of data sets, resulting in 959 one-record data sets. The cells of the 31 x 31 grand matrix representing the two excluded data sets were (i, j) = (1, 1) and (i, j) = (31, 31). Therefore, the processing for these 959 sets of modifications and merges was controlled by the following SAS code:

%if (&row = 1 and &col > 1) or
    (&row > 1 and &row < 31) or
    (&row = 31 and &col < 31) %then %do;

Finally, the 959 single-record data sets were concatenated, together with two dummy records representing the grand matrix positions (1, 1) and (31, 31). It should be noted that original table cells containing a count of 0 in the procedure output data set default to missing in this final concatenated data set; these were corrected to 0 by direct coding. The final step was to identify whether superiority or futility had been reached in any of the 959 scenarios. The significance level had been decided upon in advance as part of the study design and was used to compare the attained significance levels in two-tailed tests, to detect nondirectional superiority or futility. A simple code construction using an IF-THEN clause was implemented to examine all chi-square p-values, prob, including, where appropriate, the p-values from Fisher's exact test, tailt. Thus, encoding superiority and futility in the variable rule, with futility as rule = "F" and superiority as rule = "S", the following code is sufficient:

if fe1 = "1" then do;
  if tailt > 0.9548 then rule = "F";
  else if . < tailt < 0.0004 then rule = "S";
end;
else if fe1 = " " then do;
  if prob > 0.9548 then rule = "F";
  else if . < prob < 0.0004 then rule = "S";
end;

The complex case

Now that 959 2 x 2 tables, chi-square tests, and Fisher's exact tests could be represented in 959 rows of one SAS data set, indexed as to the use of Fisher's exact test, the Sponsor considered and submitted another simple request: a change in the sample size per treatment arm. Increasing the treatment-arm sample size to 60, the grand matrix becomes 61 x 61, with 3,721 possible rows in a final data set; a sample size of 90 per treatment arm would produce a final data set of 8,281 rows. Anticipating a series of such changes, and considering the logistics of implementing them in the program, the superiority of SAS macroprocessing was again brought to bear on the perceived futility of defining the ultimate answer.
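The decision rule above can be restated compactly outside SAS. In this sketch, prob, tailt, and fe1 follow the paper's variable names, and 0.9548 and 0.0004 are the paper's pre-specified cutoffs; the function itself is ours:

```python
def rule(prob, tailt, fe1):
    """Apply the two-sided decision rule: use the Fisher two-tailed p-value
    (tailt) when fe1 flags Fisher's exact test, otherwise the chi-square
    p-value (prob). Returns "F" (futility), "S" (superiority), or " "."""
    p = tailt if fe1 == "1" else prob
    if p is None:
        return " "            # missing p-value: no decision
    if p > 0.9548:
        return "F"            # futility detected
    if p < 0.0004:
        return "S"            # superiority detected
    return " "                # neither detected
```

Against the appendix rows: cell 1:2 (fe1 = "1", tailt = 1.0) yields "F"; cell 1:12 (chi-square only, prob = 0.0002) yields "S"; cell 2:5 (fe1 = "1", tailt = 0.3533) yields neither.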
The data generation code was modified with the macro variable n representing the sample size per treatment arm:

%do row = 1 %to %eval(&n+1);
  %do col = 1 %to %eval(&n+1);

    data ds&row._&col;
      p = 0; t = 0; wt = &row - 1; output;
      p = 0; t = 1; wt = &n - (&row - 1); output;
      p = 1; t = 0; wt = &col - 1; output;
      p = 1; t = 1; wt = &n - (&col - 1); output;
    run;

The code for processing all data sets excluding (i, j) = (1, 1) and (i, j) = (n+1, n+1) was also modified with the macro variable n, as follows:

%if (&row = 1 and &col > 1) or
    (&row > 1 and &row < %eval(&n+1)) or
    (&row = %eval(&n+1) and &col < %eval(&n+1)) %then %do;

Results

The following is a sample of nine consecutive rows of SAS output printed from the final data set of results generated from the simulation of 30 subjects per treatment arm. The column headings are the final SAS variable names, where cell identifies the position in the grand matrix to which the particular row of results applies. In particular, the rows in the sample output represent the last three columns of row one of the grand matrix, and the first six columns of row two. The variables count0 and count1 represent the number of remissions under each treatment arm. The chi-square statistic and associated degrees of freedom are provided in CS_stat and CS_df, with the attained p-value in CS_pval. The calculated index to Fisher's exact test is given by FE1 = 1, which appears in six rows, along with the three p-values associated with this test: FE_Lpval, FE_Rpval, and FE_Tpval for the one-tailed tests to the left and to the right and the two-tailed test, respectively. These nine records, or observations, illustrate, by the variable rule, three scenarios where superiority was detected, three scenarios where futility was detected, and three scenarios where neither superiority nor futility was detected.
29 1:29 30 2 1 52.5000 <.0001 0.0000 1.0000 0.0000 S
30 1:30 30 1 1 56.1290 <.0001 0.0000 1.0000 0.0000 S
31 1:31 30 0 1 60.0000 <.0001 0.0000 1.0000 0.0000 S

32 2:1 29 30 1 1 1.0169 0.3132 1.0000 0.5000 1.0000 F
33 2:2 29 29 1 1 0.0000 1.0000 0.7542 0.7542 1.0000 F
34 2:3 29 28 1 1 0.3509 0.5536 0.5000 0.8814 1.0000 F
35 2:4 29 27 1 1 1.0714 0.3006 0.3060 0.9438 0.6120
36 2:5 29 26 1 1 1.9636 0.1611 0.1766 0.9739 0.3533
37 2:6 29 25 1 1 2.9630 0.0852 0.0973 0.9881 0.1945

Thus, by selecting a significance level, one can identify the outcomes for which superiority or futility is conclusive. With a significance level decided upon in advance and then programmatically applied to the final data set, a decision rule was assessed to indicate the scenarios where superiority was detected, or the scenarios where attained futility in the trial was detected. The decision rule takes into account the expected cell sizes and, therefore, the implementation of either the chi-square test or Fisher's exact test. By the method illustrated herein, the user can now easily inspect the outcome of all 961 scenarios, or restrict inspection simply to those scenarios where the decision rule indicates superiority or futility.

Conclusion

When faced with the challenge, during a clinical trial, of simulating a number of potential outcomes, performing the appropriate statistical tests, and determining a course of action based on those tests, the scale of the task for judicious reasoning can become unnaturally and disconcertingly stretched. Simplifying such an extravagant production process, to provide the decision maker with a reasonable summary of the critical points, could be crucial to the clinical trial, or could provide a deterrent to futile pursuits. The superiority of SAS macroprocessing was brought into focus in this exercise, in the face of the futility of continual investigation. What appeared at first to be a simple task from the position of the author of the request escalated conceivably beyond reason, and evidently to the deterrence of further posing of regretful requests.
The DCRI developed a software program using the SAS System, in particular SAS macroprocessing, to address not only this current situation but any future query that any investigator might have. This could be done with a highly favourable response time that returns not just an answer to the specific request but an exhaustive and complete analysis based on the initial request. Furthermore, the complexity of multiplicity in the problem was reduced to simplicity by SAS macroprocessing; yet the reader should note that the samples of SAS code herein are not collectively complete but are representative of the key steps in the process. Appendix A, at the end of this paper, contains the first 110 records and the final 81 records from the same SAS data set of results.

Acknowledgements

The authors are appreciative of the DCRI and this forum of SAS users for the opportunity to promote this knowledge as far as it would contribute, ultimately, toward the improvement of patient care.

Contact Information

Phil d'Almada
DCRI, Duke Medical Center
300 W. Morgan St., Suite 800
Durham, NC 27701
Office phone: 919-668-8013
Facsimile: 919-668-7049
E-mail: phil.dalmada@duke.edu

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.

Appendix A

Partial test results from the simulation of 30 subjects per treatment arm.

Futility/superiority simulation
Chi-square and Fisher's Exact tests, 961 scenarios.

1 1:1 30 30 . . . . . .
2 1:2 30 29 1 1 1.0169 0.3132 0.5000 1.0000 1.0000 F
3 1:3 30 28 1 1 2.0690 0.1503 0.2458 1.0000 0.4915
4 1:4 30 27 1 1 3.1579 0.0756 0.1186 1.0000 0.2373
5 1:5 30 26 1 1 4.2857 0.0384 0.0562 1.0000 0.1124
6 1:6 30 25 1 1 5.4545 0.0195 0.0261 1.0000 0.0522
7 1:7 30 24 1 1 6.6667 0.0098 0.0119 1.0000 0.0237
8 1:8 30 23 1 1 7.9245 0.0049 0.0053 1.0000 0.0105
9 1:9 30 22 1 1 9.2308 0.0024 0.0023 1.0000 0.0046
10 1:10 30 21 1 1 10.5882 0.0011 0.0010 1.0000 0.0019
11 1:11 30 20 1 12.0000 0.0005 0.0004 1.0000 0.0008
12 1:12 30 19 1 13.4694 0.0002 0.0002 1.0000 0.0003 S
13 1:13 30 18 1 15.0000 0.0001 0.0001 1.0000 0.0001 S
14 1:14 30 17 1 16.5957 <.0001 0.0000 1.0000 0.0000 S
15 1:15 30 16 1 18.2609 <.0001 0.0000 1.0000 0.0000 S
16 1:16 30 15 1 20.0000 <.0001 0.0000 1.0000 0.0000 S
17 1:17 30 14 1 21.8182 <.0001 0.0000 1.0000 0.0000 S
18 1:18 30 13 1 23.7209 <.0001 0.0000 1.0000 0.0000 S
19 1:19 30 12 1 25.7143 <.0001 0.0000 1.0000 0.0000 S
20 1:20 30 11 1 27.8049 <.0001 0.0000 1.0000 0.0000 S
21 1:21 30 10 1 30.0000 <.0001 0.0000 1.0000 0.0000 S
22 1:22 30 9 1 32.3077 <.0001 0.0000 1.0000 0.0000 S
23 1:23 30 8 1 34.7368 <.0001 0.0000 1.0000 0.0000 S
24 1:24 30 7 1 37.2973 <.0001 0.0000 1.0000 0.0000 S
25 1:25 30 6 1 40.0000 <.0001 0.0000 1.0000 0.0000 S
26 1:26 30 5 1 42.8571 <.0001 0.0000 1.0000 0.0000 S
27 1:27 30 4 1 45.8824 <.0001 0.0000 1.0000 0.0000 S
28 1:28 30 3 1 49.0909 <.0001 0.0000 1.0000 0.0000 S
29 1:29 30 2 1 52.5000 <.0001 0.0000 1.0000 0.0000 S
30 1:30 30 1 1 56.1290 <.0001 0.0000 1.0000 0.0000 S
31 1:31 30 0 1 60.0000 <.0001 0.0000 1.0000 0.0000 S
32 2:1 29 30 1 1 1.0169 0.3132 1.0000 0.5000 1.0000 F
33 2:2 29 29 1 1 0.0000 1.0000 0.7542 0.7542 1.0000 F
34 2:3 29 28 1 1 0.3509 0.5536 0.5000 0.8814 1.0000 F
35 2:4 29 27 1 1 1.0714 0.3006 0.3060 0.9438 0.6120
36 2:5 29 26 1 1 1.9636 0.1611 0.1766 0.9739 0.3533
37 2:6 29 25 1 1 2.9630 0.0852 0.0973 0.9881 0.1945
38 2:7 29 24 1 1 4.0431 0.0444 0.0514 0.9947 0.1028
39 2:8 29 23 1 1 5.1923 0.0227 0.0262 0.9977 0.0523
40 2:9 29 22 1 1 6.4052 0.0114 0.0128 0.9990 0.0257
41 2:10 29 21 1 7.6800 0.0056 0.0061 0.9996 0.0122
42 2:11 29 20 1 9.0167 0.0027 0.0028 0.9998 0.0056
43 2:12 29 19 1 10.4167 0.0012 0.0012 0.9999 0.0025
44 2:13 29 18 1 11.8822 0.0006 0.0005 1.0000 0.0011
45 2:14 29 17 1 13.4161 0.0002 0.0002 1.0000 0.0004 S
46 2:15 29 16 1 15.0222 0.0001 0.0001 1.0000 0.0002 S
47 2:16 29 15 1 16.7045 <.0001 0.0000 1.0000 0.0001 S
48 2:17 29 14 1 18.4679 <.0001 0.0000 1.0000 0.0000 S
49 2:18 29 13 1 20.3175 <.0001 0.0000 1.0000 0.0000 S
50 2:19 29 12 1 22.2593 <.0001 0.0000 1.0000 0.0000 S
51 2:20 29 11 1 24.3000 <.0001 0.0000 1.0000 0.0000 S
52 2:21 29 10 1 26.4469 <.0001 0.0000 1.0000 0.0000 S
53 2:22 29 9 1 28.7081 <.0001 0.0000 1.0000 0.0000 S
54 2:23 29 8 1 31.0928 <.0001 0.0000 1.0000 0.0000 S
55 2:24 29 7 1 33.6111 <.0001 0.0000 1.0000 0.0000 S

56 2:25 29 6 1 36.2743 <.0001 0.0000 1.0000 0.0000 S
57 2:26 29 5 1 39.0950 <.0001 0.0000 1.0000 0.0000 S
58 2:27 29 4 1 42.0875 <.0001 0.0000 1.0000 0.0000 S
59 2:28 29 3 1 45.2679 <.0001 0.0000 1.0000 0.0000 S
60 2:29 29 2 1 48.6541 <.0001 0.0000 1.0000 0.0000 S
61 2:30 29 1 1 52.2667 <.0001 0.0000 1.0000 0.0000 S
62 2:31 29 0 1 56.1290 <.0001 0.0000 1.0000 0.0000 S
63 3:1 28 30 1 1 2.0690 0.1503 1.0000 0.2458 0.4915
64 3:2 28 29 1 1 0.3509 0.5536 0.8814 0.5000 1.0000 F
65 3:3 28 28 1 1 0.0000 1.0000 0.6940 0.6940 1.0000 F
66 3:4 28 27 1 1 0.2182 0.6404 0.5000 0.8234 1.0000 F
67 3:5 28 26 1 1 0.7407 0.3894 0.3354 0.9027 0.6707
68 3:6 28 25 1 1 1.4555 0.2276 0.2119 0.9486 0.4238
69 3:7 28 24 1 1 2.3077 0.1287 0.1271 0.9738 0.2542
70 3:8 28 23 1 1 3.2680 0.0706 0.0727 0.9872 0.1455
71 3:9 28 22 1 4.3200 0.0377 0.0399 0.9939 0.0797
72 3:10 28 21 1 5.4545 0.0195 0.0210 0.9972 0.0419
73 3:11 28 20 1 6.6667 0.0098 0.0106 0.9988 0.0211
74 3:12 28 19 1 7.9542 0.0048 0.0051 0.9995 0.0102
75 3:13 28 18 1 9.3168 0.0023 0.0024 0.9998 0.0048
76 3:14 28 17 1 10.7556 0.0010 0.0011 0.9999 0.0021
77 3:15 28 16 1 12.2727 0.0005 0.0005 1.0000 0.0009
78 3:16 28 15 1 13.8714 0.0002 0.0002 1.0000 0.0004 S
79 3:17 28 14 1 15.5556 <.0001 0.0001 1.0000 0.0001 S
80 3:18 28 13 1 17.3299 <.0001 0.0000 1.0000 0.0001 S
81 3:19 28 12 1 19.2000 <.0001 0.0000 1.0000 0.0000 S
82 3:20 28 11 1 21.1722 <.0001 0.0000 1.0000 0.0000 S
83 3:21 28 10 1 23.2536 <.0001 0.0000 1.0000 0.0000 S
84 3:22 28 9 1 25.4524 <.0001 0.0000 1.0000 0.0000 S
85 3:23 28 8 1 27.7778 <.0001 0.0000 1.0000 0.0000 S
86 3:24 28 7 1 30.2400 <.0001 0.0000 1.0000 0.0000 S
87 3:25 28 6 1 32.8507 <.0001 0.0000 1.0000 0.0000 S
88 3:26 28 5 1 35.6229 <.0001 0.0000 1.0000 0.0000 S
89 3:27 28 4 1 38.5714 <.0001 0.0000 1.0000 0.0000 S
90 3:28 28 3 1 41.7130 <.0001 0.0000 1.0000 0.0000 S
91 3:29 28 2 1 45.0667 <.0001 0.0000 1.0000 0.0000 S
92 3:30 28 1 1 48.6541 <.0001 0.0000 1.0000 0.0000 S
93 3:31 28 0 1 52.5000 <.0001 0.0000 1.0000 0.0000 S
94 4:1 27 30 1 1 3.1579 0.0756 1.0000 0.1186 0.2373
95 4:2 27 29 1 1 1.0714 0.3006 0.9438 0.3060 0.6120
96 4:3 27 28 1 1 0.2182 0.6404 0.8234 0.5000 1.0000 F
97 4:4 27 27 1 1 0.0000 1.0000 0.6646 0.6646 1.0000 F
98 4:5 27 26 1 1 0.1617 0.6876 0.5000 0.7881 1.0000 F
99 4:6 27 25 1 1 0.5769 0.4475 0.3532 0.8729 0.7065
100 4:7 27 24 1 1 1.1765 0.2781 0.2358 0.9273 0.4716
101 4:8 27 23 1 1.9200 0.1659 0.1495 0.9601 0.2990
102 4:9 27 22 1 2.7829 0.0953 0.0903 0.9790 0.1806
103 4:10 27 21 1 3.7500 0.0528 0.0521 0.9894 0.1042
104 4:11 27 20 1 4.8118 0.0283 0.0287 0.9949 0.0575
105 4:12 27 19 1 5.9627 0.0146 0.0152 0.9976 0.0303
106 4:13 27 18 1 7.2000 0.0073 0.0077 0.9989 0.0153
107 4:14 27 17 1 8.5227 0.0035 0.0037 0.9995 0.0074
108 4:15 27 16 1 9.9316 0.0016 0.0017 0.9998 0.0034
109 4:16 27 15 1 11.4286 0.0007 0.0008 0.9999 0.0015
110 4:17 27 14 1 13.0167 0.0003 0.0003 1.0000 0.0006 S

Futility/superiority simulation -- Chi-square and Fisher's Exact tests, 961 scenarios (output page 21; observations 111-880 not shown in this excerpt).

Obs  Scen   rA  rB  DF  W    ChiSq  Pr>Chi    Left   Right  2-Tail  Dec
881  29:13   2  18   1  .  19.2000  <.0001  1.0000  0.0000  0.0000   S
882  29:14   2  17   1  .  17.3299  <.0001  1.0000  0.0000  0.0001   S
883  29:15   2  16   1  .  15.5556  <.0001  1.0000  0.0001  0.0001   S
884  29:16   2  15   1  .  13.8714  0.0002  1.0000  0.0002  0.0004   S
885  29:17   2  14   1  .  12.2727  0.0005  1.0000  0.0005  0.0009
886  29:18   2  13   1  .  10.7556  0.0010  0.9999  0.0011  0.0021
887  29:19   2  12   1  .   9.3168  0.0023  0.9998  0.0024  0.0048
888  29:20   2  11   1  .   7.9542  0.0048  0.9995  0.0051  0.0102
889  29:21   2  10   1  .   6.6667  0.0098  0.9988  0.0106  0.0211
890  29:22   2   9   1  .   5.4545  0.0195  0.9972  0.0210  0.0419
891  29:23   2   8   1  .   4.3200  0.0377  0.9939  0.0399  0.0797
892  29:24   2   7   1  1   3.2680  0.0706  0.9872  0.0727  0.1455
893  29:25   2   6   1  1   2.3077  0.1287  0.9738  0.1271  0.2542
894  29:26   2   5   1  1   1.4555  0.2276  0.9486  0.2119  0.4238
895  29:27   2   4   1  1   0.7407  0.3894  0.9027  0.3354  0.6707
896  29:28   2   3   1  1   0.2182  0.6404  0.8234  0.5000  1.0000   F
897  29:29   2   2   1  1   0.0000  1.0000  0.6940  0.6940  1.0000   F
898  29:30   2   1   1  1   0.3509  0.5536  0.5000  0.8814  1.0000   F
899  29:31   2   0   1  1   2.0690  0.1503  0.2458  1.0000  0.4915
900  30:1    1  30   1  .  56.1290  <.0001  1.0000  0.0000  0.0000   S
901  30:2    1  29   1  .  52.2667  <.0001  1.0000  0.0000  0.0000   S
902  30:3    1  28   1  .  48.6541  <.0001  1.0000  0.0000  0.0000   S
903  30:4    1  27   1  .  45.2679  <.0001  1.0000  0.0000  0.0000   S
904  30:5    1  26   1  .  42.0875  <.0001  1.0000  0.0000  0.0000   S
905  30:6    1  25   1  .  39.0950  <.0001  1.0000  0.0000  0.0000   S
906  30:7    1  24   1  .  36.2743  <.0001  1.0000  0.0000  0.0000   S
907  30:8    1  23   1  .  33.6111  <.0001  1.0000  0.0000  0.0000   S
908  30:9    1  22   1  .  31.0928  <.0001  1.0000  0.0000  0.0000   S
909  30:10   1  21   1  .  28.7081  <.0001  1.0000  0.0000  0.0000   S
910  30:11   1  20   1  .  26.4469  <.0001  1.0000  0.0000  0.0000   S
911  30:12   1  19   1  .  24.3000  <.0001  1.0000  0.0000  0.0000   S
912  30:13   1  18   1  .  22.2593  <.0001  1.0000  0.0000  0.0000   S
913  30:14   1  17   1  .  20.3175  <.0001  1.0000  0.0000  0.0000   S
914  30:15   1  16   1  .  18.4679  <.0001  1.0000  0.0000  0.0000   S
915  30:16   1  15   1  .  16.7045  <.0001  1.0000  0.0000  0.0001   S
916  30:17   1  14   1  .  15.0222  0.0001  1.0000  0.0001  0.0002   S
917  30:18   1  13   1  .  13.4161  0.0002  1.0000  0.0002  0.0004   S
918  30:19   1  12   1  .  11.8822  0.0006  1.0000  0.0005  0.0011
919  30:20   1  11   1  .  10.4167  0.0012  0.9999  0.0012  0.0025
920  30:21   1  10   1  .   9.0167  0.0027  0.9998  0.0028  0.0056
921  30:22   1   9   1  .   7.6800  0.0056  0.9996  0.0061  0.0122
922  30:23   1   8   1  1   6.4052  0.0114  0.9990  0.0128  0.0257
923  30:24   1   7   1  1   5.1923  0.0227  0.9977  0.0262  0.0523
924  30:25   1   6   1  1   4.0431  0.0444  0.9947  0.0514  0.1028
925  30:26   1   5   1  1   2.9630  0.0852  0.9881  0.0973  0.1945
926  30:27   1   4   1  1   1.9636  0.1611  0.9739  0.1766  0.3533
927  30:28   1   3   1  1   1.0714  0.3006  0.9438  0.3060  0.6120
928  30:29   1   2   1  1   0.3509  0.5536  0.8814  0.5000  1.0000   F
929  30:30   1   1   1  1   0.0000  1.0000  0.7542  0.7542  1.0000   F
930  30:31   1   0   1  1   1.0169  0.3132  0.5000  1.0000  1.0000   F
931  31:1    0  30   1  .  60.0000  <.0001  1.0000  0.0000  0.0000   S
932  31:2    0  29   1  .  56.1290  <.0001  1.0000  0.0000  0.0000   S
933  31:3    0  28   1  .  52.5000  <.0001  1.0000  0.0000  0.0000   S
934  31:4    0  27   1  .  49.0909  <.0001  1.0000  0.0000  0.0000   S
935  31:5    0  26   1  .  45.8824  <.0001  1.0000  0.0000  0.0000   S

Futility/superiority simulation -- Chi-square and Fisher's Exact tests, 961 scenarios (output page 22).

Obs  Scen   rA  rB  DF  W    ChiSq  Pr>Chi    Left   Right  2-Tail  Dec
936  31:6    0  25   1  .  42.8571  <.0001  1.0000  0.0000  0.0000   S
937  31:7    0  24   1  .  40.0000  <.0001  1.0000  0.0000  0.0000   S
938  31:8    0  23   1  .  37.2973  <.0001  1.0000  0.0000  0.0000   S
939  31:9    0  22   1  .  34.7368  <.0001  1.0000  0.0000  0.0000   S
940  31:10   0  21   1  .  32.3077  <.0001  1.0000  0.0000  0.0000   S
941  31:11   0  20   1  .  30.0000  <.0001  1.0000  0.0000  0.0000   S
942  31:12   0  19   1  .  27.8049  <.0001  1.0000  0.0000  0.0000   S
943  31:13   0  18   1  .  25.7143  <.0001  1.0000  0.0000  0.0000   S
944  31:14   0  17   1  .  23.7209  <.0001  1.0000  0.0000  0.0000   S
945  31:15   0  16   1  .  21.8182  <.0001  1.0000  0.0000  0.0000   S
946  31:16   0  15   1  .  20.0000  <.0001  1.0000  0.0000  0.0000   S
947  31:17   0  14   1  .  18.2609  <.0001  1.0000  0.0000  0.0000   S
948  31:18   0  13   1  .  16.5957  <.0001  1.0000  0.0000  0.0000   S
949  31:19   0  12   1  .  15.0000  0.0001  1.0000  0.0001  0.0001   S
950  31:20   0  11   1  .  13.4694  0.0002  1.0000  0.0002  0.0003   S
951  31:21   0  10   1  .  12.0000  0.0005  1.0000  0.0004  0.0008
952  31:22   0   9   1  1  10.5882  0.0011  1.0000  0.0010  0.0019
953  31:23   0   8   1  1   9.2308  0.0024  1.0000  0.0023  0.0046
954  31:24   0   7   1  1   7.9245  0.0049  1.0000  0.0053  0.0105
955  31:25   0   6   1  1   6.6667  0.0098  1.0000  0.0119  0.0237
956  31:26   0   5   1  1   5.4545  0.0195  1.0000  0.0261  0.0522
957  31:27   0   4   1  1   4.2857  0.0384  1.0000  0.0562  0.1124
958  31:28   0   3   1  1   3.1579  0.0756  1.0000  0.1186  0.2373
959  31:29   0   2   1  1   2.0690  0.1503  1.0000  0.2458  0.4915
960  31:30   0   1   1  1   1.0169  0.3132  1.0000  0.5000  1.0000   F
961  31:31   0   0 ......
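Any row of the listing can be reproduced outside SAS. The sketch below (not from the paper) uses Python's scipy, assuming what the rows themselves suggest: the listing's chi-square is the uncorrected Pearson statistic on the 2x2 responders/non-responders table, scenario i:j corresponds to 31-i and 31-j responders out of 30 per arm, and the Fisher columns are the exact one- and two-sided probabilities.

```python
# Hypothetical cross-check of the appendix listing (scipy, not the paper's SAS code).
from scipy.stats import chi2_contingency, fisher_exact

def scenario_stats(r_a, r_b, n=30):
    """Uncorrected Pearson chi-square and two-sided Fisher's exact p-value
    for r_a/n vs r_b/n responders (n = 30 per arm, as in the listing)."""
    table = [[r_a, n - r_a], [r_b, n - r_b]]
    chi2, p_chi, _, _ = chi2_contingency(table, correction=False)
    _, p_fisher = fisher_exact(table, alternative="two-sided")
    return round(chi2, 4), p_chi, p_fisher

# Obs 56 (scenario 2:25, 29 vs 6 responders): chi-square 36.2743, p < .0001
print(scenario_stats(29, 6))
# Obs 65 (scenario 3:3, 28 vs 28 responders): chi-square 0.0000, two-sided Fisher 1.0000
print(scenario_stats(28, 28))
```

Matching these values against observations 56 and 65 above confirms the listing's chi-square is computed without a continuity correction; PROC FREQ's CHISQ and FISHER options would report the same figures.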