Unit 1: The One-Factor ANOVA as a Generalization of the Two-Sample t Test


Minitab Notes for STAT 6305: Analysis of Variance Models
Department of Statistics and Biostatistics, CSU East Bay

1.1. Data and Worksheet Preparation

Consider two randomly chosen samples of bottles of a particular drug. Bottles in Group 1 are chosen from current production; those in Group 2 have been stored under regulated conditions for one year. There are 10 bottles in each group. The potency of each bottle is assayed and recorded. The issue is whether the potency of the population of year-old bottles is the same as that of the population of bottles currently being made. The potency data are:

Group 1: 10.2, 10.5, 10.3, 10.8, 9.8, 10.6, 10.7, 10.2, 10.0, 10.6
Group 2: 9.8, 9.6, 10.1, 10.2, 10.1, 9.7, 9.5, 9.6, 9.8, 9.9

These data are from Table 6.1 (page 294) of Ott and Longnecker, An Introduction to Statistical Methods and Data Analysis, 6th ed., Duxbury.

One way to put these data into a Minitab worksheet is to copy and paste from this unit. Be sure Minitab commands are "enabled" before you start. The goal is to make the Session window look like the listing that follows the bulleted instructions.

• "Enable commands" in the Minitab Session window using the EDITOR menu. (First, activate the Session window by clicking anywhere within it; you cannot modify the Session window when a Worksheet is active. Second, be sure to use the EDITOR menu, not EDIT.)
• Type the first two lines of the listing (the ones with the name and set commands). The DATA> prompt should appear automatically at the beginning of the third line.
• In the third line, do the following instead of typing the data: in your browser, highlight the data for Group 1 and copy these 10 observations using CTRL-C. In the Minitab Session window, make sure the cursor follows the DATA> prompt and paste the data with CTRL-V. Then press ENTER. (It's OK if the spacing is a little different from what you see in the listing, but make sure that you captured all 10 observations.)
• Similarly, copy and paste the data for Group 2 into the fourth line.
• Finally, type end on the fifth line to signal that data entry for c1 is complete.

MTB > name c1 'Potency'
MTB > set c1
DATA> 10.2, 10.5, 10.3, 10.8, 9.8, 10.6, 10.7, 10.2, 10.0, 10.6
DATA> 9.8, 9.6, 10.1, 10.2, 10.1, 9.7, 9.5, 9.6, 9.8, 9.9
DATA> end

Now display the data in c1 using either the menu path or the command:

DATA > Display Data
MTB > print c1

This produces a (horizontal) printout of the 20 observations in c1. Also look in the worksheet to see the data there.

Next we need a column of "subscripts" in c2 to show which observations come from which group. Name c2 'Group' either with a command or by typing the name directly into the worksheet. Then enter the subscripts using either the menus or the set command.

Type Group atop column 2 in the Worksheet
CALC > Patterned Data, Simple, values from 1 to 2, each individual value repeated 10 times

MTB > name c2 'Group'
MTB > set c2
DATA> (1:2)10
DATA> end

This way of organizing data, with all observations in a single column and groups designated in a separate column of subscripts, is called "stacked" format in Minitab (and in some other software).

For such a small dataset you could just type the 20 'Potency' determinations and the 20 'Group' numbers directly into the worksheet. However, when working from documents in DOC, PDF, or HTML format, you may find it convenient to learn (i) to copy and paste data into a worksheet and (ii) to use the "patterned data" features of the set command. The current, relatively simple data are a good place to start learning both.

Once you have entered the data into a worksheet, you should always proofread your work before continuing. You can do this either by printing the data to the Session window (using the print command) or by looking directly at the Worksheet. Proofreading should become an automatic part of your data entry. Beyond the first few units these notes will not always remind you to proofread, but do so anyway.

Problems

Here is an alternate way to prepare the worksheet. Follow through the steps, copying and pasting data where appropriate. What menu choices would produce the same results? [Look at the DATA menu.] Explain what each command does. Compare c13 and c14 with c1 and c2.

MTB > name c11 'Fresh' c12 'Stored'
MTB > set c11
DATA> 10.2, 10.5, 10.3, 10.8, 9.8, 10.6, 10.7, 10.2, 10.0, 10.6
DATA> end
MTB > set c12
DATA> 9.8, 9.6, 10.1, 10.2, 10.1, 9.7, 9.5, 9.6, 9.8, 9.9
DATA> end
MTB > stack c11 c12 c13;
SUBC> subs c14.

In working the previous problem, you put the data for each group into a separate column (c11 and c12). Data in separate columns are said to be in "unstacked" format. Look at the DATA menu and figure out how the stacked data in c1 can be put into unstacked format using the subscripts in c2. (Use the column names c21 'New' and c22 'Old' for this.) What command/subcommand combination could you use to unstack the data without the help of the menus? (Minitab is a command-based package. The menus are sometimes a convenient way to generate the commands, which then appear in the Session window when the command language is enabled.)
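For readers who also work in R, the same stacked and unstacked layouts can be sketched with base R functions. This is only an illustration under stated assumptions: the names `stacked` and `unstacked` and the labels Fresh and Stored are choices made here, not part of the Minitab worksheet.

```r
# Unstacked format: one vector per group (data from Section 1.1)
fresh  = c(10.2, 10.5, 10.3, 10.8, 9.8, 10.6, 10.7, 10.2, 10.0, 10.6)
stored = c(9.8, 9.6, 10.1, 10.2, 10.1, 9.7, 9.5, 9.6, 9.8, 9.9)

# Stack: one column of observations plus one column of group labels.
# stack() calls its two columns "values" and "ind"; rename them here.
stacked = stack(list(Fresh = fresh, Stored = stored))
names(stacked) = c("Potency", "Group")
str(stacked)

# Unstack: split the observation column by the subscript column
unstacked = split(stacked$Potency, stacked$Group)
unstacked$Fresh
unstacked$Stored
```

The stacked data frame plays the role of columns c1 and c2, and the list returned by split() plays the role of the two unstacked columns.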

1.2. Descriptive Methods

Whenever possible, data analysis should begin with descriptive methods, both numerical and graphical. Here, it seems clear from the dotplots that the potency of stored bottles tends to be less than the potency of the fresh ones: Group 1 (mean above 10.25) has generally higher values than Group 2 (mean below 10.00).

GRAPH > Dotplot > With Groups (Makes a 'professional' graphic display, different from the character display these commands produce.)

MTB > gstd (Puts Minitab into 'standard' graphics mode; use gpro to return to 'professional' graphics.)
MTB > dotp c1;
SUBC> by c2.

[Character dotplots of Potency for Group 1 and Group 2]

Now we compute numerical descriptive statistics, broken out by the subscript variable in c2 into two groups.

STAT > Basic > Descriptive statistics, 'by variable' option

MTB > describe c1;
SUBC> by c2.

Variable  Group  N  N*  Mean  SE Mean  StDev  Minimum  Q1  Median
Potency
Variable  Group  Q3  Maximum
Potency

Problems

Minitab makes graphical displays in one of two formats:

• Standard (or Character) graphics. These are composed of text symbols and appear in the Session window. They have relatively low resolution, but they are easy to paste into reports using a word processor. They also help to keep file sizes small. (Be sure to use a monospace font such as Courier, and proofread to make sure the graph looks the same after pasting as it did before copying from Minitab.) We often show standard graphics in these notes. To activate standard graphics, use the command gstd and then issue the command for the kind of graph desired. Standard graphics are not available from menus.
• Professional (or Pixel) graphics. These are true graphic images using Windows technology. They appear in separate boxes on your screen, not in the Session window. These images can be saved in a variety of graphics formats. They can be included as graphic images on the web and can be imported into word processing and desktop publishing documents. They greatly increase the file size of documents that incorporate them. Minitab starts in professional graphics mode. To re-activate professional graphics after using character graphics, use the command gpro.

Illustrate both types of graphics by making boxplots as follows:

MTB > gstd
MTB > boxp c1;
SUBC> by c2.
MTB > gpro
MTB > boxp c1 * c2
MTB > dotp c1 * c2 (Also accessible via menus.)

Comment on the results as follows:
(a) Do the boxplots show the differences between the two groups as clearly as do the dotplots? More clearly? Defend your answer.
(b) Look at one of the dotplots. Can you see exactly how many data points are represented? Now look at one of the boxplots. Can you see how many data points are represented?
(c) Minitab's boxplots sometimes indicate the presence of outliers. Are outliers indicated for either of our groups?
(d) What descriptive statistics are used in making boxplots?
(e) Describe the differences between standard-graphics and professional-graphics boxplots. [The two styles of boxplots use slightly different rules for computing quartiles. Particularly with small sample sizes, these differences may be noticeable.]
(f) We have given several commands above. What menu choices can be used to produce professional-style boxplots?

In R one prepares vectors for the potencies of the Fresh and Stored samples, finds descriptive statistics for each group, combines the data into a single vector of potencies with a corresponding categorical vector of sample types, and makes stripcharts and boxplots of the data using the code that follows. Execute the code, show the results, and compare them with the corresponding results obtained in Minitab. (Note: The function as.factor designates typ as a categorical rather than a numerical variable. In this unit, the distinction in variable types is not always important because typ takes only two values. For some procedures, this distinction becomes crucial if an intended categorical variable takes more than two values.)
fresh = c(10.2, 10.5, 10.3, 10.8, 9.8, 10.6, 10.7, 10.2, 10.0, 10.6)
stored = c(9.8, 9.6, 10.1, 10.2, 10.1, 9.7, 9.5, 9.6, 9.8, 9.9)
potency = c(fresh, stored); n1 = length(fresh); n2 = length(stored)
typ = as.factor(c(rep(1, times=n1), rep(2, times=n2)))
summary(fresh); sd(fresh)
summary(stored); sd(stored)
par(mfrow=c(1,2)) # puts two graphs on one page
stripchart(potency ~ typ, method="stack", vertical=TRUE)
boxplot(potency ~ typ)
par(mfrow=c(1,1)) # return to default one graph per page

1.3. Comparing a t Test with a One-Factor ANOVA

The descriptive methods in Section 1.2 strongly suggest that fresh samples of the drug tend to be more potent than stored ones. Now we look at several different ways to confirm this impression with formal statistical tests. That is, we test $H_0$: the 2 groups have equal potency against $H_a$: the 2 groups have different potencies.

The first of these is the two-tailed, pooled two-sample t test. The command for a two-sample t test on stacked data is twot.

Minitab defaults for two-sample t tests:
• The two-tailed (or two-sided) alternative is the default; one-sided alternatives require the subcommand alternative followed by either 1 (right-sided alternative) or -1 (left-sided).
• The separate variances ("t-prime") test is the default. Pooling requires the subcommand pool. Computer simulation results have established that the separate variances test is often preferable for two-sample tests. Here we use the pooled test because it generalizes more readily to the ANOVA methods of these notes.

Note on stacked vs. unstacked data: The command twosample would be used if the potency measurements for the two groups had been entered into two separate columns, one for Fresh and one for Stored. Such "unstacked" data are seldom used for computer analysis outside of elementary statistics classes. Minitab is one of the few serious computer packages that make direct use of unstacked data and, even then, only for a few elementary procedures.

STAT > Basic > 2-sample t, one column, assume equal variances

MTB > twot c1 c2;
SUBC> pool.

Two-sample T for Potency
Group  N  Mean  StDev  SE Mean

Difference = mu (1) - mu (2)
Estimate for difference:
95% CI for difference: ( , )
T-Test of difference = 0 (vs not =): T-Value = 4.24  P-Value =   DF = 18
Both use Pooled StDev =

We see (from the very small P-value) that the difference between the two groups is very highly significant. This is what we guessed would be the case from looking at the dotplots above. Either the Fresh samples were originally manufactured to have a higher potency, or the potency of the Stored samples deteriorated with a year of storage. (Or perhaps a combination of these two mechanisms is at work.)

The one-factor or one-way ANOVA design (also sometimes called the "completely randomized design") is a generalization of the two-sided, pooled two-sample t test that can handle more than two groups. Thus, when it is applied to only two groups, its result should agree with that of the t test.
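The pooled t statistic reported above (T-Value = 4.24 with DF = 18) can be reproduced from its definition. The following is a minimal R sketch, not Minitab's internal code; the variable names sp2, se, t.stat, and p.val are chosen here for illustration.

```r
# Data from Section 1.1
fresh  = c(10.2, 10.5, 10.3, 10.8, 9.8, 10.6, 10.7, 10.2, 10.0, 10.6)
stored = c(9.8, 9.6, 10.1, 10.2, 10.1, 9.7, 9.5, 9.6, 9.8, 9.9)
n1 = length(fresh); n2 = length(stored)

# Pooled variance: weighted average of the two sample variances
sp2 = ((n1 - 1)*var(fresh) + (n2 - 1)*var(stored)) / (n1 + n2 - 2)

# Standard error of the difference in sample means
se = sqrt(sp2 * (1/n1 + 1/n2))

# Pooled t statistic, its degrees of freedom, and the two-sided P-value
t.stat = (mean(fresh) - mean(stored)) / se
df     = n1 + n2 - 2
p.val  = 2 * pt(-abs(t.stat), df)
round(c(t = t.stat, df = df, p = p.val), 4)

# Cross-check against R's built-in pooled test
t.test(fresh, stored, var.equal = TRUE)$statistic

# The square of t is the F statistic of the one-way ANOVA on two groups
t.stat^2
```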

STAT > ANOVA > Oneway

MTB > oneway c1 c2 (Alternatively: MTB > onew 'Potency' 'Group')

One-way ANOVA: Potency versus Group

Source  DF  SS  MS  F  P
Group
Error
Total

S =    R-Sq = 49.93%   R-Sq(adj) = 47.15%

[Character plot of individual 95% CIs for the two group means, based on Pooled StDev, with columns Level, N, Mean, StDev]

Pooled StDev =

The P-value is the same for the t test and the one-way ANOVA. Depending on the release of Minitab, it may be printed to three decimal places (where a printed value of zero means only that the P-value is below the rounding threshold) or rounded to four places.

The square of a t-distributed random variable with 18 df is an F-distributed random variable with 1 df in the numerator and 18 df in the denominator. In fact, the squares of the .025 values for t(ν) are the .05 values for F(1, ν), as you can verify by looking at tables. [Upon squaring, the negative (left) and positive (right) tails of t both go into the right tail of F: .025 + .025 = .05.] Also, the square of the t-statistic obtained in our t test above is the F-statistic in our ANOVA: $F = t^2$, with t = 4.24 here.

Note: In Minitab, the oneway procedure is the simplest of several ways to perform a one-way ANOVA on stacked data. This command requires column identifiers such as c1 and c2, or 'Potency' and 'Group' (column names inside single quotes). It does only one-way ANOVAs, and provides separate confidence intervals for each level (Fresh or Stored) of the single factor (Group).

Problems:

For a two-sample design with n = 10 observations in each group and a fixed significance level α = .05, find the critical values for the two-sided pooled t test and the F test discussed above. Use Minitab's invcdf command:

MTB > invcdf 0.975;
SUBC> t 18.
MTB > invcdf 0.95;
SUBC> f 1 18.

Compare your results with tables in your text. Verify that the square of the critical value for t is the critical value for F. In this problem, why do you need to use 0.975 for the t distribution and 0.95 for the F distribution? (For each distribution, draw a sketch and shade in the area corresponding to probability 0.05.)
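In R, the quantile functions qt and qf play the role of Minitab's invcdf for the t and F distributions. The sketch below assumes the same setting as the problem (18 df for t, and 1 and 18 df for F); the names t.crit and f.crit are illustrative.

```r
# Upper critical value for a two-sided t test at level .05 with 18 df:
# probability .025 is left in each tail, so use the 0.975 quantile
t.crit = qt(0.975, df = 18)

# Critical value for the F test at level .05 with 1 and 18 df:
# the whole .05 sits in the right tail, so use the 0.95 quantile
f.crit = qf(0.95, df1 = 1, df2 = 18)

# The squared t critical value should match the F critical value
c(t.crit = t.crit, t.crit.squared = t.crit^2, f.crit = f.crit)
```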

[Recall that the cumulative distribution function (cdf) F(x) of a random variable X is $P(X \le x)$. Thus, the inverse cdf function for a particular value y gives the value c such that $P(X \le c) = y$. The inverse cdf function is sometimes called the quantile function.]

Consider a balanced two-sample design in which each group has n observations. Let the group totals be $T_1$ and $T_2$, and denote the grand total of all observations as $T_1 + T_2 = G$. Express the formulas for both the pooled t-statistic and the F-statistic in terms of this notation. Then use simple algebra to verify that the F-statistic is the square of the t-statistic.

Starting with the same four lines of R code used in Section 1.2, one can perform the pooled two-sample t test and the one-way ANOVA as follows. Show the results and compare them with the corresponding Minitab results. (The first four lines need not be repeated in a continuous R session.)

fresh = c(10.2, 10.5, 10.3, 10.8, 9.8, 10.6, 10.7, 10.2, 10.0, 10.6)
stored = c(9.8, 9.6, 10.1, 10.2, 10.1, 9.7, 9.5, 9.6, 9.8, 9.9)
potency = c(fresh, stored); n1 = length(fresh); n2 = length(stored)
typ = as.factor(c(rep(1, times=n1), rep(2, times=n2)))
t.test(potency ~ typ, var.equal=TRUE)
anova(lm(potency ~ typ))

1.4. More-General Procedures

Minitab's general anova procedure will handle a great variety of ANOVA models, many of which we shall study in these notes.

• With commands: designate the response variable (Potency here), followed by an equal sign, followed by the design or independent variables containing subscripts (here only one, 'Group'). Use of single quotes (apostrophes) around variable names is optional (unless the first character of the name is a number or a symbol).
• With Windows menus: you must select the response variable in one dialog box and the subscript variables that specify the model in another. (For now, ignore the box for "random" factors.)

For designs more complicated than the completely randomized design, ANOVA will handle only balanced situations, i.e., only designs where each treatment (or treatment combination) has the same number of replications. Because it is programmed to handle such a wide variety of ANOVA designs, the general ANOVA procedure does not provide confidence intervals.

STAT > ANOVA > Balanced, select 'Potency' as Response, 'Group' as Model

MTB > anova Potency = Group

Factor  Type   Levels  Values
Group   fixed  2       1, 2

Analysis of Variance for Potency

Source  DF  SS  MS  F  P
Group
Error
Total

S =    R-Sq = 49.93%   R-Sq(adj) = 47.15%

Finally, the GLM procedure (GLM stands for "general linear model") has the same syntax as ANOVA. It requires more intensive computation and more computer memory (perhaps noticeable with large datasets and complex designs), can handle unbalanced cases, uses a regression approach, and automatically warns us about "unusual" observations. For more complex designs the two procedures have somewhat different options and capabilities.

STAT > ANOVA > General linear model

MTB > glm Potency = Group

Factor  Levels  Values
Group

Analysis of Variance for Potency

Source  DF  Seq SS  Adj SS  Adj MS  F  P
Group
Error
Total

S =    R-Sq = 49.93%   R-Sq(adj) = 47.15%

Unusual Observations for Potency

Obs.  Potency  Fit  Stdev.Fit  Residual  St.Resid
                                         R

R denotes an obs. with a large st. resid.

Technical note: Because Group and Error correspond to orthogonal subspaces of the 20-dimensional vector space of observations, the Sequential and Adjusted Sums of Squares are identical for our data.

Problems:

The GLM procedure indicates that observation #5 is unusual. Minitab's criterion for calling an observation unusual is a Studentized residual of absolute value greater than 2, so this observation, with its value of 2.11, is borderline. (We will not go into the computations involved in finding Studentized residuals. Very roughly, the idea is that this observation is relatively far from the mean of the rest of the observations in its group.)

In this ANOVA, the (ordinary) residual of an observation is its difference from its group mean. Using menus, in the one-way ANOVA procedure select the option to store residuals. Verify the values of the residuals for observations #1, #5, and #11 of the stacked data by hand. Make a boxplot of the residuals. Does it indicate any outliers?
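As a cross-check outside Minitab, ordinary and standardized residuals can be obtained in R from a fitted linear model. This is a sketch under an assumption: it uses R's rstandard (internally Studentized residuals), which may or may not be exactly the quantity Minitab labels St.Resid, although for these data it flags the same observation.

```r
# Stacked data from Section 1.1
fresh  = c(10.2, 10.5, 10.3, 10.8, 9.8, 10.6, 10.7, 10.2, 10.0, 10.6)
stored = c(9.8, 9.6, 10.1, 10.2, 10.1, 9.7, 9.5, 9.6, 9.8, 9.9)
potency = c(fresh, stored)
typ = as.factor(c(rep(1, times = 10), rep(2, times = 10)))

fit = lm(potency ~ typ)      # one-way ANOVA as a linear model

# Ordinary residuals: observation minus its group mean
res = resid(fit)
round(res[c(1, 5, 11)], 2)

# Internally Studentized residuals (assumed analog of St.Resid)
st.res = rstandard(fit)
round(st.res[5], 2)

# Observations exceeding the "greater than 2 in absolute value" criterion
which(abs(st.res) > 2)
```

A boxplot of the ordinary residuals is then available with boxplot(res).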

Use the menu path STAT > Basic statistics > Normality test to test the null hypothesis that the residuals fit a normal distribution (against the alternative that they are not normal). In the resulting normal probability plot, normal residuals should nearly fit a straight line. Do ours? What is the P-value of the Anderson-Darling test of normality?

Test the hypothesis that the two groups come from populations with equal variances against the two-sided alternative. Use the cdf command to find the P-value of this test. Alternatively, look at the menu path STAT > Basic statistics > 2 variances for this test. (This test is known to have poor power; that is, it often fails to reject the null hypothesis even when population variances differ.)

1.5. Traditional Nonparametric Alternatives

Here we mention several nonparametric tests. You should read the descriptions of them in your text. In Windows, all menu paths for Minitab's implementations of these tests begin with STAT > Nonparametric.

The nonparametric alternative to the two-sample t test is the Mann-Whitney-Wilcoxon test (command mann). It works only for unstacked data.

Both of the nonparametric alternatives to the general one-way ANOVA are programmed to be used with stacked data: the Mood test (Minitab command mood) and the Kruskal-Wallis test (Minitab command kruskal). The Kruskal-Wallis test is a generalization of the Mann-Whitney-Wilcoxon test in the same sense that the one-way ANOVA is a generalization of a pooled two-sample t test.

Unlike the t test and ANOVA, none of these nonparametric tests assumes normal data. They all test null hypotheses about equal population medians (rather than means). Like their normal-theory counterparts, these nonparametric tests assume that:

• The data are random samples from their respective populations.
• The data for different levels (e.g., Fresh and Stored groups) are independent of one another.
• The population dispersions are equal. For the normal tests, the specific form of the "equal dispersion" assumption is that variances are equal. For the nonparametric tests, it is that all population distributions are of the same shape, differing (if at all) only by a translation that shifts the entire distribution along with the value of the median.
• The populations are continuous to the extent necessary to avoid "ties" (repeated values). Normal theory tests usually work quite well unless rounding (or some other process) has produced severe granularity (many clumps of repeated values). Nonparametric tests require approximate "correction" procedures to adjust for any ties that may be present due to rounding.

There is no evidence that our present data are other than normally distributed. For example, the dotplots and boxplots show no marked skewness or probable outliers. Even so, you should experiment with the nonparametric procedures kruskal and mood to see how they work. Here, they yield the same conclusion as the normal theory tests: the potency of the stored bottles is less than that of the fresh ones.
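For comparison with Minitab's kruskal command, the Kruskal-Wallis test is available in base R as kruskal.test. The sketch below applies it to the stacked potency data; the object name kw is illustrative. One caution: R's mood.test is a two-sample test of scale, so it should not be taken as a counterpart of Minitab's mood (median) command.

```r
# Stacked data from Section 1.1
fresh  = c(10.2, 10.5, 10.3, 10.8, 9.8, 10.6, 10.7, 10.2, 10.0, 10.6)
stored = c(9.8, 9.6, 10.1, 10.2, 10.1, 9.7, 9.5, 9.6, 9.8, 9.9)
potency = c(fresh, stored)
typ = as.factor(c(rep(1, times = 10), rep(2, times = 10)))

# Kruskal-Wallis rank sum test (ties are handled by a correction factor)
kw = kruskal.test(potency ~ typ)
kw

kw$statistic   # chi-squared approximation to the test statistic
kw$parameter   # degrees of freedom = number of groups - 1
kw$p.value
```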

Problems:

Theoretically, for continuous data, there should be no ties at all. In reality, we are always dealing with rounded data, so ties may be present. (For example, two truly distinct values that both round to 10.2 would be recorded here as "tied" at 10.2; if they are close enough together, they would remain tied even with two-decimal accuracy.) Looking at the 20 observations in our dataset, do you find any ties? If so, how many observations are involved in ties?

The W-statistic reported in the output of mann, Minitab's implementation of the Mann-Whitney-Wilcoxon test, is computed as follows: consider all of the data in both groups as a whole, find the ranks of these observations, and find the sum of the ranks of the observations in Group 1. A small value of W indicates that Group 1 comes from a population with a smaller median than Group 2; a large value indicates that the population median for Group 1 may be larger.

(a) Under the null hypothesis that the two populations are the same, the expected value of W can be shown to be $\mu_W = n_1(n_1 + n_2 + 1)/2$. What is this value for our data?
(b) Assume that c5, c6, and c7 are empty columns, that the stacked data are in c1, and that the subscripts are in c2. Then the following Minitab commands can be used to illustrate how W is computed:

MTB > rank c1 c5
MTB > unstack c5 c6 c7;
SUBC> subs c2.
MTB > sum c6

Go through these steps carefully, looking at the worksheet after each step and making sure you understand what each step does. Then unstack the data and use the mann command to perform the Mann-Whitney-Wilcoxon test. Compare the value of W with your computations above. Carefully compare the interpretation of this nonparametric test with the interpretation of the t test and the ANOVA above. Justify your answer.

In the Stored group change the observed potency 9.5 to 2.0. (Maybe a stored sample gets damp and loses nearly all its potency. As a result, the group means become more different than for the real data.) What change does this make in the results of the pooled 2-sample t test? What change does this make in the results of the Wilcoxon test?

In R the Mann-Whitney-Wilcoxon test is performed on the original data as follows. (Notice the two variations, one with what Minitab would call stacked data and one with unstacked data.) Compare the results with Minitab output.

fresh = c(10.2, 10.5, 10.3, 10.8, 9.8, 10.6, 10.7, 10.2, 10.0, 10.6)
stored = c(9.8, 9.6, 10.1, 10.2, 10.1, 9.7, 9.5, 9.6, 9.8, 9.9)
potency = c(fresh, stored); n1 = length(fresh); n2 = length(stored)
typ = as.factor(c(rep(1, times=n1), rep(2, times=n2)))
wilcox.test(potency ~ typ)
wilcox.test(fresh, stored)

1.6. Additional Nonparametric Procedures

One kind of nonparametric test is done by using a rank transform. Each observation in c1 (Potency) of the worksheet is replaced, in c5, by its rank (RankPote). Then a standard t test or ANOVA is done on the ranked data.
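The rank transform just described can also be sketched in R. This is an illustrative analog of ranking c1 into c5 and running a pooled t test on the ranks; it assumes that R's rank, which assigns average ranks to tied values by default, treats ties the same way as Minitab's rank command. The names rank.pot and rt are chosen here.

```r
# Stacked data from Section 1.1
fresh  = c(10.2, 10.5, 10.3, 10.8, 9.8, 10.6, 10.7, 10.2, 10.0, 10.6)
stored = c(9.8, 9.6, 10.1, 10.2, 10.1, 9.7, 9.5, 9.6, 9.8, 9.9)
potency = c(fresh, stored)
typ = as.factor(c(rep(1, times = 10), rep(2, times = 10)))

# Replace each observation by its rank among all 20 (average ranks for ties)
rank.pot = rank(potency)

# Pooled two-sample t test on the ranks instead of the original values
rt = t.test(rank.pot ~ typ, var.equal = TRUE)
rt$statistic   # t statistic on the rank scale
rt$parameter   # degrees of freedom
rt$p.value
```

The confidence interval printed by this test is on the rank scale, not the potency scale, so it is usually not worth interpreting.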

MTB > rank c1 c5
MTB > name c5 'RankPote'
MTB > twot c5 c2;
SUBC> pool.

Two-Sample T-Test and CI: RankPote, Group

Two-sample T for RankPote
Group  N  Mean  StDev  SE Mean

Difference = mu (1) - mu (2)
Estimate for difference:
95% CI for difference: ( , )
T-Test of difference = 0 (vs not =): T-Value = 4.33  P-Value =   DF = 18
Both use Pooled StDev =

There is no reason to believe the residuals from rank-transformed data are normal. But if the original data are far from normal (say there are a couple of far outliers among the residuals, one low and one high), then the transformed data may be more nearly normal than the original data. At best, the t statistic is only roughly normal, so the P-value is only approximate. But in this case, the P-value is not distinguishable from the P-value of the t test on the original data. (The CI from this procedure is difficult to interpret and is best ignored.)

With a rank transformation, as with any transformation of the data, care has to be taken in interpreting estimates and confidence intervals. These are on the rank scale, not on the original potency scale. So rank-transformed data are more convenient for doing a test than for doing estimation.

Yet another nonparametric procedure is the permutation test. Under the null hypothesis that the two groups are the same, any permutation of the Potency data in c1 is as likely as any other. If we could compute the test statistic corresponding to each of the 20! permutations of the data, computing as if the first 10 were Fresh and the last 10 were Stored, then we would get an empirical distribution of that statistic. In practice, there are too many possible permutations to carry out this computational task, but we can get a pretty good idea of this empirical distribution by looking at a large number of randomly chosen permutations. That is what the R program that follows does, using the difference in group means as the statistic.

Again here, the conclusion is that the P-value (the area outside the vertical blue lines in the histogram the program draws) is very small, about P = 0.0008 in the run shown with the program. Also, the empirical permutation distribution generated has a shape pretty close to that of a Student's t distribution with df = 18.

fresh = c(10.2, 10.5, 10.3, 10.8, 9.8, 10.6, 10.7, 10.2, 10.0, 10.6)
stored = c(9.8, 9.6, 10.1, 10.2, 10.1, 9.7, 9.5, 9.6, 9.8, 9.9)
potency = c(fresh, stored); n1 = length(fresh); n2 = length(stored)
typ = as.factor(c(rep(1, times=n1), rep(2, times=n2)))
m = 10000; d = numeric(m)
for (i in 1:m) {
  xp = sample(potency, n1 + n2) # permutation of n1 + n2 potency values
  d[i] = mean(xp[typ==1]) - mean(xp[typ==2])
}

d.data = mean(fresh) - mean(stored); d.data
mean(abs(d) >= abs(d.data))
hist(d, col="lightgrey")
abline(v = d.data, col="blue", lwd=2)
abline(v = -d.data, col="blue", lwd=2, lty="dashed")

> d.data = mean(fresh) - mean(stored); d.data
[1] 0.54
> mean(abs(d) >= abs(d.data))
[1] 8e-04

In order to get a confidence interval without making distributional assumptions, one can use a bootstrap procedure. It is based on the idea that all the information we have about the populations corresponding to the Fresh and Stored samples is contained in the samples themselves. By repeatedly sampling (with replacement) from the samples, viewing them as pseudo-populations, one can get a good idea of the variability of the difference in sample means.

The resulting 95% nonparametric bootstrap CI is (0.31, 0.78), which is a little shorter than the CI (0.27, 0.81) from the t procedure, and noticeably shorter than the CI from the rank-based Mann-Whitney-Wilcoxon procedure. Because of the relatively small sample sizes of the groups, the accuracy of the bootstrap CI is in some doubt. (Often bootstrap CIs based on small samples are too short.) This example is shown just to introduce the idea of the bootstrap. Because there is no evidence of nonnormality, the t procedure is preferred.

fresh = c(10.2, 10.5, 10.3, 10.8, 9.8, 10.6, 10.7, 10.2, 10.0, 10.6)
stored = c(9.8, 9.6, 10.1, 10.2, 10.1, 9.7, 9.5, 9.6, 9.8, 9.9)
potency = c(fresh, stored); n1 = length(fresh); n2 = length(stored)
typ = c(rep(1, times=n1), rep(2, times=n2))
m = 10000; d = numeric(m)
for (i in 1:m) {
  b.fresh = sample(fresh, n1, replace=TRUE)   # resample Fresh with replacement
  b.stored = sample(stored, n2, replace=TRUE) # resample Stored with replacement
  d[i] = mean(b.fresh) - mean(b.stored)
}

qnt = quantile(d, c(.975,.025))
boot.ci = 2*(mean(fresh)-mean(stored)) - qnt
boot.ci
hist(d, col="lightgrey")
abline(v = boot.ci, col="blue", lwd=2)

> boot.ci
97.5%  2.5%
 0.31  0.78

Notice that the traditional nonparametric procedures and the test on rank-transformed data lose information by considering ranks, whereas the permutation test uses precisely the observed data.

Problems:

Use the altered data of the problem in Section 1.5 (with the Stored potency 9.5 changed to 2.0).
(a) What change does altering the data make in the results of the pooled 2-sample t test of the rank-transformed data?
(b) What change does this make in the results of the permutation test?
(c) What change does it make in the bootstrap CI? (Hint: in making the histogram for the bootstrap distribution, use the parameter breaks=100 to force a more detailed look at the results.)
(d) How do you account for the unusual appearance of the permutation and bootstrap distributions?

Minitab Notes for Statistics 6305: ANOVA Models by Bruce E. Trumbo, Department of Statistics, CSU East Bay, East Bay CA. Copyright 1991, 2011 by Bruce E. Trumbo. All rights reserved. Partial support for the 1991 version from NSF grant USE. The current version with Minitab professional graphics and examples using R is a draft. For comments, errata, selected answers, related materials, and permission to use beyond CSU East Bay, please contact bruce.trumbo@csueastbay.edu or eric.suess@csueastbay.edu.
