UPPSALA UNIVERSITY
Department of Mathematics
Måns Thulin, thulin@math.uu.se
Analysis of regression and variance, Fall 2011

COMPUTER EXERCISE 2: One-way ANOVA

In this computer exercise we will work with the analysis of variance in R. We'll take a look at the following topics:

- Inference about variances
- One-way ANOVA
- Diagnostics
- Pairwise comparisons
- Power of tests and the choice of sample size
- Comparison of tests for equal variances

1 Inference about variances

Assume that we have two normally distributed samples and that we want to test whether or not the variances of the two samples are equal. This is done with an F-test (pages 52-54 in the textbook). In R this is done using the var.test function; we give an example with simulated data.

x <- rnorm(50, mean = 0, sd = 2)
y <- rnorm(30, mean = 1, sd = 1)
var.test(x, y)

Try changing the means and variances of the two samples! What happens?

Inference about variances in the more general case, where we have more than two samples, is done using Bartlett's test or Levene's test, which we will study later in this exercise.

2 One-way ANOVA

We will study the etch rate data example from the book (described on pages 61-62); make sure that you understand what the different quantities in the experiment are. The data is given in the file etchrun.dat (which you find on the course page on the student portal), and to begin with we examine the data by graphical methods:

etch <- read.table("etchrun.dat", col.names = c("rate","power"))
plot(rate ~ power, etch, lwd=4, col="blue"); grid()
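To see what var.test actually computes, note that the F statistic is simply the ratio of the two sample variances, and the two-sided p-value comes from the F distribution with the corresponding degrees of freedom. A minimal by-hand sketch (the seed and variable names are our own choices):

```r
# By-hand version of the F-test for equal variances
set.seed(1)
x <- rnorm(50, mean = 0, sd = 2)
y <- rnorm(30, mean = 1, sd = 1)

F0 <- var(x) / var(y)            # test statistic, df = 49 and 29
out <- var.test(x, y)
all.equal(unname(out$statistic), F0)        # expect TRUE

# Two-sided p-value from the F(49, 29) distribution
p <- 2 * min(pf(F0, 49, 29), 1 - pf(F0, 49, 29))
all.equal(out$p.value, p)                   # expect TRUE
```

This makes explicit that rejecting H0: sigma_x^2 = sigma_y^2 happens when the variance ratio is far from 1 in either direction.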
In order to do the analysis, we must tell R that power contains factor levels, as this is not obvious from the data file. To do this, we use the command factor as follows:

etch$power <- factor(etch$power, labels=c("160","180","200","220"))

Together with the previous plot, a boxplot can be used to get some initial ideas about the data and some indication of whether the assumption of equal variances is correct or not.

boxplot(rate ~ power, etch)

2.1 Estimation

To perform the one-way ANOVA we can use the commands lm and anova. As before, lm fits a linear model; anova gives a summary table of the result of the ANOVA.

m1 <- lm(rate ~ power, etch)
anova(m1)

Alternatively, we can use aov instead of lm:

m2 <- aov(rate ~ power, etch)
anova(m2)

(The result is the same, but the data in m1 and m2 are stored in somewhat different ways.) What are the null and alternative hypotheses in ANOVA? What conclusions regarding the hypotheses can be drawn from the summary tables?

2.2 Diagnostics

Behind the ANOVA lies the assumption that the errors are normal with homogeneous variance. As in the regression case, the normality assumption can be checked using a QQ-plot (qqnorm) or the Shapiro-Wilk test (shapiro.test). We can also plot the residuals in some form, for instance against the fitted values:

plot(m1$fit, m1$res); grid()

What does the figure tell us about constant variance?

To further investigate the equality of variances we can use Bartlett's test, which is found on page 79 of the textbook. The test statistic χ²₀ presented there uses logarithms with base 10, resulting in the constant 2.3026 being part of the statistic. A somewhat easier way of writing the statistic is to use natural logarithms:

    χ²₀ = q/c,   q = (N − a) ln(MS_E) − Σ_{i=1}^{a} (n_i − 1) ln(S_i²)

where c and

    S_i² = (1/(n_i − 1)) Σ_{j=1}^{n_i} (y_ij − ȳ_i.)²

are defined as in the textbook. In R, we perform Bartlett's test as follows:

bartlett.test(rate ~ power, etch)
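The natural-logarithm form of the statistic can be checked against bartlett.test directly. The sketch below uses simulated groups (the means, spread and group sizes are our own choices, since etchrun.dat is not reproduced in this handout); the correction constant c is the one defined in the textbook:

```r
# Bartlett's statistic with natural logs, computed by hand
set.seed(2)
y <- c(rnorm(5, 550, 20), rnorm(5, 590, 20),
       rnorm(5, 625, 20), rnorm(5, 700, 20))
g <- factor(rep(c("160","180","200","220"), each = 5))

a   <- nlevels(g); N <- length(y)
ni  <- tapply(y, g, length)
Si2 <- tapply(y, g, var)                      # sample variances S_i^2
MSE <- sum((ni - 1) * Si2) / (N - a)          # pooled variance MS_E

q  <- (N - a) * log(MSE) - sum((ni - 1) * log(Si2))
cc <- 1 + (sum(1/(ni - 1)) - 1/(N - a)) / (3 * (a - 1))
chi0 <- q / cc

all.equal(unname(bartlett.test(y, g)$statistic), chi0)   # expect TRUE
```

Under H0, χ²₀ is approximately chi-squared distributed with a − 1 degrees of freedom, which is where the p-value reported by bartlett.test comes from.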
What conclusions can be drawn from the given p-value?

As an alternative, we can use Levene's test. The idea behind the test is to compute d_ij = |y_ij − ỹ_i.|, where ỹ_i. is the median of the i:th population, and to use these as the response variable in a new one-way ANOVA. The use of the median instead of the mean makes the test more robust, i.e. less sensitive to outliers and departures from normality (we investigate this further at the end of this computer exercise). An R function for Levene's test is found in the car library, which is loaded by writing library(car). The function is called levene.test; see the help file for instructions on how to use it.

2.3 A note on data import

The data in etchrun.dat was stored in a way that was convenient for us to import into R: all we needed to do after importing it was to mark one of the variables as a factor. At times the data can be stored in ways that make it a little harder to import. In the file etching.dat the data for each level of the factor is stored in separate columns (compare etchrun.dat and etching.dat to see the difference). Using the stack command we can store the data in a more suitable way in R:

etch2 <- read.table("etching.dat", col.names = c("160","180","200","220"))
etch3 <- stack(etch2)
names(etch3) <- c("rate","power")
anova(lm(rate ~ power, etch3))

You should get the same answer as before.

2.4 Pairwise comparisons

Pairwise comparisons of means can be done using Tukey's HSD (page 92). The R function for this requires an object created with aov; in our case the model called m2:

cm2 <- TukeyHSD(m2, "power"); cm2
plot(cm2)

How should the output from the function be interpreted? What conclusions can be drawn?

EXERCISE. In an experiment, the effect of shelf height on the sales of a particular dog food (Arf Dog Food) was studied. During a period of 8 days, the daily sales (in hundreds of dollars) for three shelf heights (knee, waist and eye height) were registered.
The shelf height was changed randomly three times daily. The data is stored in the file shelf.dat (column 1: knee height, column 2: waist height, column 3: eye height). Assume a one-way ANOVA model with one factor at three levels (the different heights). Analyse the effect of shelf height on sales using the methods we've used above.

3 Power of tests and sample size

Next we will consider the problem of determining a suitable number of replicates; see Section 3.7 (page 101) in the textbook for a brief introduction to the topic. The book
uses the quantity Φ², which is used to read so-called OC-curves. With R this can be done directly, without having to use the figures in the appendix of the book. To do this, we must calculate a quantity corresponding to the Φ² used in the book:

    f² = (1/σ²) Σ_{i=1}^{k} p_i (μ_i − μ̄)²

where k is the number of levels of the factor, p_i = 1/k for balanced experiments, and a suitable value for σ is chosen. Then f² = Φ²/n. f is sometimes called the effect size.

Let us take a closer look at Example 3.10 (page 102) using R. The significance level is α = 0.01, the required power is 1 − β = 0.90 and the standard deviation is assumed to be σ = 25 Å/min. We could easily sum the four terms, but we will use a more general approach and show how to use the command apply (which in our example below operates on the matrix elm):

library(pwr)
k0 <- 4
levs <- c(575,600,650,675)
elm <- matrix( (levs-625)^2/25^2 )
f0 <- sqrt( 1/k0*apply(elm,2,sum) )
pwr.anova.test(k=k0, f=f0, sig.level=0.01, power=0.9)

How many replicates are needed to get the required test power?

An alternative approach for selecting a sample size is described on pages 102-103. It involves the greatest difference D between the means of the different levels. We get f² = D²/(2kσ²) and can call pwr.anova.test as before.

EXERCISE. Go through the procedure on pages 102-103 using R, i.e. choose D = 75 and so on. How many replicates are required?

4 Comparison of Bartlett's test and Levene's test

Earlier in the computer exercise we claimed that Levene's test is more robust than Bartlett's test. This is not shown in the textbook, but we can investigate it using computer simulations. If you find this interesting, take a look at the R script bart.r on the student portal. Go through the code presented there, and then do the exercises below.

EXERCISE. Change the variances so that at least one sample has a variance that differs from that of the others.
Does the number of tests that result in rejection of the null hypothesis increase, as expected? Is one of the tests better in the sense that it has a higher proportion of rejections (i.e. higher power)?

EXERCISE. Use runif to simulate samples from the uniform distribution instead, first with equal variances and then with unequal variances. What happens?

EXERCISE. In the same manner, use rexp to simulate samples from the exponential distribution. What happens?
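A small simulation in the spirit of bart.r may help before you modify the script (the sample sizes, number of repetitions and distribution below are our own choices, not those of bart.r). Levene's test is implemented by hand as an ANOVA on d_ij = |y_ij − ỹ_i.|, so no extra package is needed, and both tests are run on exponential data with equal variances:

```r
# Rejection rates of Bartlett's and Levene's tests under non-normal data
set.seed(3)

# By-hand Levene's test: one-way ANOVA on absolute deviations from group medians
levene.p <- function(y, g) {
  d <- abs(y - ave(y, g, FUN = median))   # d_ij = |y_ij - median_i|
  anova(lm(d ~ g))[1, "Pr(>F)"]
}

n <- c(20, 20, 20); B <- 1000
rej <- matrix(0, B, 2, dimnames = list(NULL, c("bartlett", "levene")))
for (b in 1:B) {
  y <- rexp(sum(n))                       # equal variances, but skewed data
  g <- factor(rep(1:3, times = n))
  rej[b, ] <- c(bartlett.test(y, g)$p.value, levene.p(y, g)) < 0.05
}
colMeans(rej)
```

If both tests held their nominal level, each would reject in about 5% of the runs; the much higher rate for Bartlett's test on exponential data illustrates its sensitivity to departures from normality.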
EXERCISE. Try inserting an outlier into one of the samples and see what happens. Example:

x <- c(rnorm(n[1]-1,0,1), rnorm(1,10,1), rnorm(n[2],1,1), rnorm(n[3],1,1))

EXERCISE. Change the sample sizes so that they differ more. What happens?
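Finally, returning to the sample-size question from Section 3: the D-based formula f² = D²/(2kσ²) can be plugged straight into pwr.anova.test. A sketch with the values from pages 102-103 (D = 75, k = 4, σ = 25); treat the output as a starting point for checking your own answer to that exercise:

```r
# Effect size from the greatest difference D between level means
library(pwr)

k0 <- 4; D <- 75; sigma <- 25
f0 <- sqrt(D^2 / (2 * k0 * sigma^2))     # f^2 = D^2 / (2 k sigma^2)
pwr.anova.test(k = k0, f = f0, sig.level = 0.01, power = 0.9)
```

The reported n is the number of replicates per level; round it up to the next integer.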