General Factorial Models

Gerald Bradley
5 years ago

1 In Chapter 8 in Oehlert STAT:5201 Week 9 - Lecture 1 1 / 31

2 It is possible to have many factors in a factorial experiment. We saw some three-way factorials earlier in the DDD book (HW 1 with 3 factors: ball size, height, surface). The general set-up can be extended to many factors, but higher-order interactions can be bothersome to deal with. Recall that a 3-factor interaction, like $(\alpha\beta\gamma)_{ijk}$, describes how a 2-factor interaction changes depending on the level of the third factor. 2 / 31

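3 Three-Factor Factorial Effects Models

Full model (includes interaction):

$$Y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} + \epsilon_{ijkl}$$

with $\epsilon_{ijkl} \overset{iid}{\sim} N(0, \sigma^2)$ for $i = 1,\dots,a$, $j = 1,\dots,b$, $k = 1,\dots,c$, and $l = 1,\dots,n$ for a balanced design.

Restriction for estimation of parameters (sum-to-zero constraints):

$$0 = \sum_{i=1}^{a} \alpha_i = \sum_{j=1}^{b} \beta_j = \sum_{k=1}^{c} \gamma_k = \sum_{i=1}^{a} (\alpha\beta)_{ij} = \sum_{j=1}^{b} (\alpha\beta)_{ij} = \dots = \sum_{k=1}^{c} (\alpha\beta\gamma)_{ijk}$$

3 / 31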

4 Three-Factor Factorial Effects Models

The degrees of freedom work as before:

Source         df
A              a-1
B              b-1
C              c-1
AB             (a-1)(b-1)
AC             (a-1)(c-1)
BC             (b-1)(c-1)
ABC            (a-1)(b-1)(c-1)
error [1]      abc(n-1)
c. total [1]   abcn-1

[1] For the total N = abcn and the error df, this is true if balanced with n observations in each cell. 4 / 31
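The degrees-of-freedom bookkeeping above can be checked with a few lines of Python. This is only a sketch for a balanced design; the function name `three_factor_df` is made up for illustration, and the example call uses a 2 x 2 x 2 layout with n = 3 replicates per cell.

```python
def three_factor_df(a, b, c, n):
    """Degrees of freedom for a balanced a x b x c factorial with n replicates per cell."""
    df = {
        "A": a - 1,
        "B": b - 1,
        "C": c - 1,
        "AB": (a - 1) * (b - 1),
        "AC": (a - 1) * (c - 1),
        "BC": (b - 1) * (c - 1),
        "ABC": (a - 1) * (b - 1) * (c - 1),
        "error": a * b * c * (n - 1),
    }
    df["total"] = a * b * c * n - 1  # corrected total, N - 1
    return df


df = three_factor_df(2, 2, 2, 3)
# All model terms plus error should add up to the corrected total.
model_and_error = sum(v for k, v in df.items() if k != "total")
print(df)
print(model_and_error == df["total"])
```

For the 2 x 2 x 2 case every model term has 1 df, the error has 16, and the corrected total has 23.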

5 Three-Factor Factorial Effects Models If there is 3-way interaction, then the 2-way interaction at a given level of the 3rd factor differs from the 2-way interaction at a different level of the 3rd factor. 5 / 31

6 As we've mentioned before, you should check (i.e. test) for higher-order interactions first, before considering lower-order interactions or main effects. If the higher-order interaction is significant, then you shouldn't look at the tests for lower-order effects because these tests won't necessarily be meaningful. One way to think about it: if there is interaction, then by doing a main effects test you are essentially pooling things that should not be pooled (they are not similar), and you can get a false impression of what's going on with the effects. 6 / 31

7 If the higher-order interaction is significant, one option is to fit the full model, then perform a kind of slice analysis (a special contrast), which performs a separate hypothesis test at each level of a factor (see graphic below). We saw this in an earlier 2-way interaction example. Alternatively, you could physically partition the data into parts and do separate analyses, but some power is lost because you'll have fewer df for error in each separate analysis than when the data are all together. 7 / 31

8 The simplest scenario is when no higher-order interactions are present, and we can just consider a main effects model. In that case, we can fit a model where the interaction terms are removed from the model and are placed in the error term. 8 / 31
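As a worked illustration of what "placed in the error term" means in a balanced design: the sums of squares and degrees of freedom of the dropped interactions are added to those of the full-model error. The notation $SS_E$, $MS_E^{\text{reduced}}$, and $df_E^{\text{reduced}}$ is assumed here for the sketch and is not taken from the slides.

```latex
\[
MS_E^{\text{reduced}}
  = \frac{SS_E + SS_{AB} + SS_{AC} + SS_{BC} + SS_{ABC}}
         {df_E^{\text{reduced}}},
\qquad
df_E^{\text{reduced}}
  = abc(n-1) + (a-1)(b-1) + (a-1)(c-1) + (b-1)(c-1) + (a-1)(b-1)(c-1)
  = abcn - a - b - c + 2 .
\]
```

For example, with $a = b = c = 2$ and $n = 3$, the error df grows from 16 in the full model to $16 + 4 = 20$ in the main effects model.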

9 Example (SAS: 2^3 design) An engineer is interested in the effects of the following factors on the life (in hours) of a machine tool: cutting angle (0=low, 1=high), tool geometry (two shapes: 0=shape 1, 1=shape 2), and cutting speed (0=low, 1=high). Three runs are done for each combination of factor levels, and all runs are done in random order. This is a completely randomized design (CRD). There are eight treatment groups with n = 3, so N = 24. {D.C. Montgomery (2005). Design and Analysis of Experiments. John Wiley & Sons: USA.} 9 / 31

10 Example (SAS: 2^3 design) 10 / 31

11 Example (SAS: 2^3 design) 11 / 31

12 Example (SAS: 2^3 design) Diagnostic plots for constant variance and normality. The diagnostic plots look OK, and the 3-way interaction was not significant here (previous slide), so that term could be removed from the model (which places it in the error term). Or we can leave the 3-way interaction term in the model and look at the tests for the 2-way interactions in the ANOVA table. 12 / 31

13 Example (SAS: 2^3 design) According to the ANOVA table, the only significant 2-way interaction is between angle and speed (angle*speed). We will look at the marginal 2-way interaction plot (averaged across the 3rd factor) for each pair of factors: angle*speed, angle*geometry, and geometry*speed. These plots average over the replicates in a cell and over the levels of the unplotted factor. 13 / 31
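The averaging behind a marginal 2-way interaction plot can be sketched in plain Python. The observations below are hypothetical (they are not the lecture's data), and the helper name `marginal_two_way_means` is invented for this illustration. Pooling all observations in a group equals averaging the cell means only because the layout is balanced.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical observations: (angle, geometry, speed, life). Not the lecture's data.
obs = [
    (0, 0, 0, 20.0), (0, 0, 1, 30.0),
    (0, 1, 0, 30.0), (0, 1, 1, 40.0),
    (1, 0, 0, 35.0), (1, 0, 1, 25.0),
    (1, 1, 0, 45.0), (1, 1, 1, 35.0),
]


def marginal_two_way_means(rows, i, j):
    """Mean response for each level pair of the factors in columns i and j,
    averaging over replicates and over the remaining factor."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row[i], row[j])].append(row[-1])
    return {key: mean(vals) for key, vals in sorted(groups.items())}


# Points that would be drawn on the angle*speed interaction plot.
angle_speed = marginal_two_way_means(obs, 0, 2)

# Simple effect of speed (high minus low) at each angle level.
speed_effect = {a: angle_speed[(a, 1)] - angle_speed[(a, 0)] for a in (0, 1)}

print(angle_speed)
print(speed_effect)
```

In this made-up data the speed effect changes sign between the two angle levels, which is the crossing pattern an interaction plot would show as non-parallel lines.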

14 Example (SAS: 2^3 design) Marginal 2-way interaction plot for angle*geometry (not significant). This was not a significant interaction in the model. 14 / 31

15 Example (SAS: 2^3 design) Marginal 2-way interaction plot for geometry*speed (not significant). This was not a significant interaction in the model. 15 / 31

16 Example (SAS: 2^3 design) Marginal 2-way interaction plot for angle*speed (significant). This WAS a significant interaction in the model. 16 / 31

17 Example (SAS: 2^3 design) The type of interaction in the angle*speed plot causes concern for making global statements about the main effects for angle and speed. When angle is low (far left side), speed has a positive effect on life, and when angle is high (far right side), speed has a negative effect on life. The minimal model should include: geometry, angle, speed, angle*speed (following the hierarchy principle). 17 / 31

18 Example (SAS: 2^3 design) Most parsimonious model following the hierarchy principle. We will consider the main effect for geometry and the angle*speed interaction effect with a slice option. 18 / 31

19 Example (SAS: 2^3 design) The geometry factor has a simple main effect. Averaged over all angles and all speeds, the average lifetime for a tool of shape=0 is 35.2 hours, while the average lifetime of a tool of shape=1 is 46.5 hours. Holding speed and angle constant, changing from shape=0 to shape=1 is associated with an increase in lifetime of about 11.3 hours (46.5 − 35.2). 19 / 31

20 Example (SAS: 2^3 design) For each angle level (low and high), speed makes a significant difference in tool lifetime (both slices are significant). When angle is set to low, a high speed gives the longer lifetime (by 9.2 hours). When angle is set to high, a low speed gives the longer lifetime (by 8.5 hours). 20 / 31

21 What if the 3-way interaction had been significant? How should we proceed? We'll consider two options:
1. Subset the data and do separate analyses.
2. Fit the full model to the complete data set and perform a slice analysis. 21 / 31

22 Option 1: Partition data Let's partition the data into two parts by the angle factor (low, high), and analyze the factors geometry and speed within each part. Example (SAS: subset to angle=0) 22 / 31

23 Option 1: Partition data Example (SAS: subset to angle=0) There is no significant interaction, only main effects. 23 / 31

24 Option 1: Partition data Example (SAS: subset to angle=0) When angle is set to the low level (angle=0), there is no significant interaction between geometry and speed. There is a significant positive speed effect and a significant positive geometry effect (both main effects). 24 / 31

25 Option 1: Partition data Example (SAS: subset to angle=1) 25 / 31

26 Option 1: Partition data Example (SAS: subset to angle=1) There is no significant interaction, only main effects. 26 / 31

27 Option 1: Partition data Example (SAS: subset to angle=1) When angle is set to the high level (angle=1), there is no significant interaction between geometry and speed. There is a significant negative speed effect and a significant positive geometry effect (both main effects). 27 / 31

28 Option 2: Use slice option One could get a very similar analysis (with more degrees of freedom for error) by fitting the full model and then slicing by angle. We will approach it that way here. Example (SAS: full model, slice by angle) 28 / 31

29 Option 2: Use slice option Example (SAS: full model, slice by angle) If you compare the Mean Squares in the above slice output, they match the Mean Squares for the two models we fit in the two subsetted analyses (with 4 separate means), but the F-statistics are different. Why? 29 / 31

30 Option 2: Use slice option Example (SAS: full model, slice by angle) The full model (using all the data and all possible terms) provides $\hat\sigma^2 = 30.17$ with 16 d.f. for the error (output below). When we subsetted the data to Angle low, we found $\hat\sigma^2 \approx 45$ with 8 d.f. for the error. When we subsetted the data to Angle high, we found a second estimate of $\sigma^2$, also with 8 d.f. for the error. Because we have assumed that $\sigma^2$ is the same across all cells, the full model estimate of $\sigma^2$ is a pooled estimate of the estimates from the two subsetted data sets. They are all estimating the same constant variance $\sigma^2$, but we gain d.f. for the error when we use the pooled estimate. 30 / 31
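The pooling described above is a degrees-of-freedom-weighted average of the separate variance estimates. A minimal sketch follows; the function name `pooled_variance` and the input values are illustrative assumptions, not the lecture's SAS output.

```python
def pooled_variance(estimates, dfs):
    """Degrees-of-freedom-weighted average of independent variance estimates.

    Returns the pooled estimate and its error degrees of freedom.
    """
    if len(estimates) != len(dfs):
        raise ValueError("need one df value per variance estimate")
    total_df = sum(dfs)
    pooled = sum(s2 * df for s2, df in zip(estimates, dfs)) / total_df
    return pooled, total_df


# Illustrative values only: two subset estimates, each with 8 error d.f.
pooled, df_error = pooled_variance([40.0, 20.0], [8, 8])
print(pooled, df_error)
```

With equal error d.f. in the two subsets, the pooled value is simply the midpoint of the two estimates, and the error d.f. add (8 + 8 = 16).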

31 Option 2: Use slice option Example (SAS: full model, slice by angle) Test for a difference in the four means where Angle is held constant at level $a$ (either low or high), with $\alpha = 0.05$:

$$H_0: \mu_{a11} = \mu_{a12} = \mu_{a21} = \mu_{a22} \quad \text{vs.} \quad H_1: \text{not } H_0$$

Using the slice option (i.e. using all the data), the threshold for significance is $F_{0.05,3,16} = 3.24$. Using the subsetted data, the threshold for significance is $F_{0.05,3,8} = 4.07$. The threshold for significance is lower when we have more degrees of freedom for error. 31 / 31
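The two versions of the test can be put side by side. This is a sketch in assumed notation: $MS_{\text{slice}}$ stands for the mean square among the four cell means at the fixed angle level, and $\hat\sigma^2_{\text{full}}$ and $\hat\sigma^2_{\text{subset}}$ for the error mean squares of the full-data and subsetted fits.

```latex
\[
F_{\text{slice}} = \frac{MS_{\text{slice}}}{\hat\sigma^2_{\text{full}}} \sim F_{3,\,16}
\qquad\text{and}\qquad
F_{\text{subset}} = \frac{MS_{\text{slice}}}{\hat\sigma^2_{\text{subset}}} \sim F_{3,\,8}
\qquad \text{under } H_0 .
\]
```

The numerators are identical, so the two F-statistics differ only through the denominator (the error variance estimate) and the reference distribution (the error degrees of freedom).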
