Random coefficients models



Contents

9 Random coefficients models
    9.1 Introduction
    9.2 Example: Constructed data
        9.2.1 Simple regression analysis
        9.2.2 Fixed effects analysis
        9.2.3 Two-step analysis
        9.2.4 Random coefficient analysis
    9.3 Example: Consumer preference mapping of carrots
    9.4 Random coefficient models in perspective
    9.5 R-TUTORIAL: Constructed data
    9.6 R-TUTORIAL: Consumer preference mapping of carrots
    9.7 Exercises

9.1 Introduction

Random coefficient models emerge as natural mixed model extensions of simple linear regression models in a hierarchical (nested) data setup. In the standard situation we are interested in the relationship between x and y.

Assume we have observations $(x_1, y_1), \ldots, (x_n, y_n)$ for a subject. Then we would fit the linear regression model given by

$$y_j = \alpha + \beta x_j + \epsilon_j$$

Assume next that such regression data are available on a number of subjects. Then a model that expresses a different regression line for each subject is:

$$y_{ij} = \alpha_i + \beta_i x_{ij} + \epsilon_{ij}$$

or, using the more general notation:

$$y_i = \alpha(\mathrm{subject}_i) + \beta(\mathrm{subject}_i)\, x_i + \epsilon_i \qquad (9-1)$$

This model has the same structure as the different slopes ANCOVA model of the previous module, only now the regression relationships are in focus.

Assume finally that the interest lies in the average relationship across subjects. A commonly used ad hoc approach is to employ a two-step procedure:

1. Carry out a regression analysis for each subject.

2. Do subsequent calculations on the parameter estimates from these regression analyses to obtain the average slope (and intercept) and their standard errors.

Since the latter treats the subjects as a random sample, it would be natural to incorporate this in the model by assuming the subject effects (intercepts and slopes) to be random:

$$y_i = a(\mathrm{subject}_i) + b(\mathrm{subject}_i)\, x_i + \epsilon_i$$

where

$$a(k) \sim N(\alpha, \sigma_a^2), \quad b(k) \sim N(\beta, \sigma_b^2), \quad \epsilon_i \sim N(0, \sigma^2)$$

and where $k = 1, \ldots, K$, with $K$ being the number of subjects. The parameters $\alpha$ and $\beta$ are the unknown population values for the intercept and slope. This is a mixed model, although a few additional considerations are required to identify the typical mixed model expression. The expected value is

$$E\, y_i = \alpha + \beta x_i$$

and the variance is

$$\mathrm{Var}\, y_i = \sigma_a^2 + \sigma_b^2 x_i^2 + \sigma^2$$

So an equivalent way of writing the model is the following, where the fixed and the random parts are split:

$$y_i = \alpha + \beta x_i + a(\mathrm{subject}_i) + b(\mathrm{subject}_i)\, x_i + \epsilon_i \qquad (9-2)$$

where

$$a(k) \sim N(0, \sigma_a^2), \quad b(k) \sim N(0, \sigma_b^2), \quad \epsilon_i \sim N(0, \sigma^2) \qquad (9-3)$$

Now the linear mixed model structure is apparent. Although we do not always explicitly state this, there is the additional assumption that the random effects $a(k)$, $b(k)$ and $\epsilon_i$ are mutually independent. For randomly varying lines $(a(k), b(k))$ in the same x-domain this may be an unreasonable assumption, since the slope and intercept values may very well be related to each other. It is possible to extend the model to allow for such a correlation/covariance between the intercept and slope by assuming a bivariate normal distribution for each set of line parameters:

$$(a(k), b(k)) \sim N\left(0, \begin{pmatrix} \sigma_a^2 & \sigma_{ab} \\ \sigma_{ab} & \sigma_b^2 \end{pmatrix}\right), \quad \epsilon_i \sim N(0, \sigma^2) \qquad (9-4)$$

The model given by (9-2) and (9-4) is the standard random coefficient mixed model.

9.2 Example: Constructed data

To illustrate the basic principles we start with two constructed data sets of 100 observations of y for 10 different x-values, see figure 9.1. They show that a raw scatter plot can hide quite different structures if the data are in fact hierarchical (repeated observations on each individual rather than exactly one observation per individual).

9.2.1 Simple regression analysis

Had the data NOT been hierarchical, but instead single observations on 100 different subjects, a simple regression analysis corresponding to the model

$$y_i = \alpha + \beta x_i + \epsilon_i \qquad (9-5)$$

where $\epsilon_i \sim N(0, \sigma^2)$, $i = 1, \ldots, 100$, would be a reasonable approach. For comparison we state the results of such an analysis for the two data sets. The parameter estimates are:

            Data 1                        Data 2
Parameter   Estimate   SE   P-value      Estimate   SE   P-value
σ
α
β                           <0.0001                      <0.0001

See figure 9.1 (left) for the estimated lines.

9.2.2 Fixed effects analysis

If we had special interest in these 10 subjects, a fixed effects analysis corresponding to model (9-1) could be carried out. The F-tests and P-values from the Type I (successive) ANOVA tables become:

                 Data set 1           Data set 2
Source      DF   F       P-value      F       P-value
x                        <.0001               <.0001
subject                  <.0001
x*subject                <.0001

[Figure 9.1: Constructed data. Top: data set 1; bottom: data set 2. Left: raw scatter plot with simple regression line; middle: individual patterns; right: individual lines.]

For data set 1 the slopes are clearly different, whereas for data set 2 the slopes can be assumed equal but the intercepts (subjects) are different. Although it is usually recommended to rerun the analysis without an insignificant interaction effect, the Type I table shows that the result of this will clearly be that the subject (intercept) effect is significant for data set 2, cf. the discussion of Type I/Type III tables in Module 3.

So for data set 1, the (fixed effect) story is told by providing the 10 intercept and slope estimates, possibly summarized as described for the different slopes ANCOVA model in the previous module. For data set 2, an equal slopes ANCOVA model can be used to summarize the results, with common slope estimate $\hat\beta$, standard error $SE_{\hat\beta}$ and error variance estimate $\hat\sigma^2$. The confidence band for the common slope, using the 89 error degrees of freedom, becomes

$$\hat\beta \pm t_{0.975}(89)\, SE_{\hat\beta}$$

which, since $t_{0.975}(89) = 1.987$, gives $[0.9279, \;\; ]$. The subjects could be described and compared as for the common slopes ANCOVA model of the previous module.

9.2.3 Two-step analysis

If the interest is NOT in the individual subjects but rather in the average line, a natural ad hoc approach is simply to start by calculating the individual intercepts and slopes, and then subsequently treat those as simple random samples: calculate the average, the variance and the standard error to obtain confidence limits for the population average values. So, for example, for the slopes we have $\hat\beta_1, \ldots, \hat\beta_{10}$ and calculate the average

$$\hat\beta = \frac{1}{10} \sum_{i=1}^{10} \hat\beta_i,$$

the variance

$$s^2_{\hat\beta} = \frac{1}{9} \sum_{i=1}^{10} (\hat\beta_i - \hat\beta)^2$$

and the standard error

$$SE_{\hat\beta} = \frac{s_{\hat\beta}}{\sqrt{10}}.$$

These give the 95% confidence interval (using that $t_{0.975}(9) = 2.26$):

$$\hat\beta \pm 2.26\, SE_{\hat\beta}$$

The results for the intercepts and slopes for the two data sets, with the variances $s^2_{\hat\alpha}$ and $s^2_{\hat\beta}$ computed for each data set, are given in the following table:

          Data set 1        Data set 2
          α       β         α       β
Average
SE
Lower
Upper

Note that for data set 2, the standard error for the slope is almost identical to the standard error from the fixed effects equal slopes model above. However, due to the smaller number of degrees of freedom, 9 instead of 89, the confidence band is somewhat wider here. This reflects the difference in interpretation: in the fixed effects analysis, $\hat\beta$ estimates the common slope for these specific 10 subjects; here it estimates the population average slope (for the population from which the 10 subjects were sampled). This distinction does not alter the estimate itself, but it does change the statistical inference that is made.

Note, by the way, that for estimating the individual lines it makes no difference whether an overall different slopes model is used or 10 individual ("small") regression models are fitted separately.

Although not used further, the observed correlation between the intercepts and the slopes in each case can be found:

$$\mathrm{corr}_1 = 0.382, \quad \mathrm{corr}_2 = 0.655$$
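As a minimal R sketch of this two-step computation for the slopes (assuming the data frame randcoef from the tutorial section below, with columns y1, x and a factor subject):

# Two-step sketch: one regression per subject, then a summary of the 10 slopes
slopes <- sapply(levels(randcoef$subject), function(k)
  coef(lm(y1 ~ x, data = randcoef, subset = subject == k))["x"])
est <- mean(slopes)                       # average slope
se  <- sd(slopes) / sqrt(length(slopes))  # standard error of the average
est + c(-1, 1) * qt(0.975, df = length(slopes) - 1) * se  # 95% confidence interval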

9.2.4 Random coefficient analysis

The results of fitting the random coefficient model given by (9-2) and (9-4) to each data set are given in the following table:

          Data set 1        Data set 2
          α       β         α       β
Estimate
SE
Lower
Upper

Note that this table is an exact copy of the result table for the two-step analysis above! The parameters of the variance part of the mixed model for data set 1 are estimated at (read off from the R output):

$$\hat\sigma_a = 4.031, \quad \hat\sigma_b = 0.496, \quad \hat\rho_{ab} = 0.38, \quad \hat\sigma = 0.271$$

which corresponds to the following variances:

$$\hat\sigma_a^2 = 16.25, \quad \hat\sigma_b^2 = 0.246, \quad \hat\sigma^2 = 0.0732$$

and for data set 2:

$$\hat\sigma_a = 1.086, \quad \hat\sigma_b = 0.145, \quad \hat\rho_{ab} = 1.00$$

which corresponds to the following variances:

$$\hat\sigma_a^2 = 1.18, \quad \hat\sigma_b^2 = 0.021$$

Compare with the variances calculated in the two-step procedure: for data set 1 the random coefficient model estimates are slightly smaller, whereas for data set 2 they are considerably smaller. This makes good sense, as the variances in the two-step procedure will also include some additional variation due to the residual error variance (just like the mean squares in a standard hierarchical model). For data set 1 this residual error variance is estimated at a very small value (0.0732), whereas for data set 2 it is much larger. This illustrates how the random coefficient model provides the proper story about what is going on, and directly distinguishes between the two quite different situations exemplified here.

Note also that for data set 1, the correlation estimate $\hat\rho_{ab} = 0.38$ is close to the observed correlation calculated in the two-step procedure.

However, for data set 2 the estimated correlation becomes $\hat\rho_{ab} = 1$! This obviously makes no sense. We encounter a situation similar to the negative variance problem discussed previously: the correlation may become meaningless when some of the variances are estimated to be very small, which is the case for the slope variance here. To put it differently, for data set 2 the model we have specified includes components (in the variance) that are not actually present in the data. We already knew this, since the equal slopes model was a reasonable description of these data. In the random coefficient framework the equal slopes model is expressed by

$$y_i = \alpha + \beta x_i + a(\mathrm{subject}_i) + \epsilon_i \qquad (9-6)$$

where

$$a(k) \sim N(0, \sigma_a^2), \quad \epsilon_i \sim N(0, \sigma^2) \qquad (9-7)$$

The adequacy of this model can be tested by a residual (REML) likelihood ratio test, cf. Module 5. For data set 2 we obtain

$$G = -2\ell_{REML,1} - (-2\ell_{REML,2}) = 0.65$$

which is non-significant using a $\chi^2$ distribution with 2 degrees of freedom. For data set 1 the similar test statistic becomes much larger and is extremely significant.

For data set 2 the conclusions should therefore be based on the equal slopes model given by (9-6) and (9-7), and we obtain the following:

          Data set 2
          α       β
Estimate
SE
Lower
Upper

We see a minor change in the confidence bands: believing in equal slopes increases the (estimated) precision (a narrower confidence interval) for the slope, whereas the precision of the average intercept decreases.
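In R, this REML likelihood ratio test can be sketched directly from the two fitted models (model5y2, the different slopes model, and model6y2, the equal slopes model, are fitted in the tutorial section below):

# G = difference in -2 log REML likelihoods, compared to a chi-square with 2 df
G <- as.numeric(-2 * logLik(model6y2, REML = TRUE) -
                (-2 * logLik(model5y2, REML = TRUE)))
pchisq(G, df = 2, lower.tail = FALSE)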

9.3 Example: Consumer preference mapping of carrots

In a consumer study, 103 consumers scored their preference for 12 Danish carrot types on a scale from 1 to 7. The carrots were harvested in autumn 1996 and tested in March 1997. A number of background variables were recorded for each consumer; see the data description in Module 13 for details. The data file can be downloaded as carrots.txt and is also described in enote13.

The aim of a so-called external preference mapping is to find the sensory drivers of the consumers' preference behaviour and to investigate whether these differ between segments of the population. To do this, in addition to the consumer survey, the carrot products are evaluated by a trained panel of tasters, the sensory panel, with respect to a number of sensory (taste, odour and texture) properties. Since usually a high number of (correlated) properties (variables) are used, in this case 14, it is common procedure to use a few, often 2, combined variables that contain as much of the information in the sensory variables as possible. This is achieved by extracting the first two principal components in a principal components analysis (PCA) on the product-by-property panel average data matrix. PCA is a commonly used multivariate technique to explore and/or decompose high dimensional data. We call these two variables sens1 and sens2, and they are given by

$$\mathrm{sens1}_i = \sum_{j=1}^{14} a_j v_j^i \quad \text{and} \quad \mathrm{sens2}_i = \sum_{j=1}^{14} b_j v_j^i$$

where $v_1^i, \ldots, v_{14}^i$ are the 14 average sensory scores for carrot product $i$, and the coefficients $a_j$ and $b_j$ defining the two combined sensory variables are as depicted in figure 9.2. So sens1 is a variable that (primarily) measures bitterness versus nutty taste, whereas sens2 measures sweetness (and related properties). A sketch of this extraction in R is given below, after figure 9.3.

The actual preference mapping is carried out by first fitting regression models for the preference as a function of the sensory variables for each individual consumer, using the 12 observations across the carrot products. Next, the individual regression coefficients are investigated, often in an explorative manner where a scatter plot is used to look for a possible segmentation of the consumers based on these regression coefficients. Instead of looking for segmentation ("cluster analysis"), we investigate whether we see any differences with respect to the background variables in the data, e.g. the gender or homesize (number of persons in the household).

Let $y_i$ be the $i$th preference score. The natural model for this is a model that expresses randomly varying individual relations to the sensory variables, but with average (expected) values that may depend on the homesize. Let us consider the factor structure of the setting.

[Figure 9.2: Loadings plot for the PCA of the sensory variables: scatter plot of the coefficients b_j versus a_j. Attributes shown: sweet_ta, fruit_ta, nut_ta, carrot_af, juicy, colour, bitter_ta, car_od, bitter_af, earthy_od, crisp, earthy_ta, transp, hard.]

The basic setting is a randomized block experiment with 12 treatments (carrot products), the factor prod, and 103 blocks (consumers), the factor cons. Homesize (size) is a factor that partitions the consumers into two groups: those with a homesize of 1 or 2, and those with a larger homesize. So the factor cons is nested within size, or equivalently, size is coarser than cons. This basic structure is depicted in figure 9.3. (Note that the corresponding diagram in the video/audio based presentation of this module has a couple of errors compared to the correct one given here.)
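As an aside, the PCA construction of sens1 and sens2 described above could be sketched in R as follows; here sensory_means is a hypothetical 12 x 14 product-by-attribute matrix of panel averages, not an object defined in this note:

# PCA sketch: scores of the first two components play the role of sens1 and sens2
pca  <- prcomp(sensory_means, center = TRUE, scale. = FALSE)
sens <- pca$x[, 1:2]       # one (sens1, sens2) pair per carrot product
plot(pca$rotation[, 1:2])  # loadings plot corresponding to figure 9.2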

[Figure 9.3: The factor structure diagram for the carrots data, relating [prod], [I], [cons] and size.]

The linear effect of the sensory variables is a part of the prod effect, since these covariates are on product level. So they are both coarser than the product effect. The sensory variables in the model will therefore explain some of the product differences. Including prod in the model as well will enable us to test whether the sensory variables can explain all the product differences. As we do not expect this to be the case, we adopt the point of view that the 12 carrot products are a random sample from the population of carrot products in Denmark; that is, the product effect is considered a random effect. In other words, we consider the deviations of the product variation from what can be explained by the regression on the sensory variables as random variation. Finally, the interactions between homesize and the sensory variables should enter the model as fixed effects, allowing for different average slopes for the two homesizes, leading to the model given by

$$y_i = \alpha(\mathrm{size}_i) + \beta_1(\mathrm{size}_i)\,\mathrm{sens1}_i + \beta_2(\mathrm{size}_i)\,\mathrm{sens2}_i + a(\mathrm{cons}_i) + b_1(\mathrm{cons}_i)\,\mathrm{sens1}_i + b_2(\mathrm{cons}_i)\,\mathrm{sens2}_i + d(\mathrm{prod}_i) + \epsilon_i \qquad (9-8)$$

where

$$a(k) \sim N(0, \sigma_a^2), \quad b_1(k) \sim N(0, \sigma_{b_1}^2), \quad b_2(k) \sim N(0, \sigma_{b_2}^2), \quad k = 1, \ldots, 103 \qquad (9-9)$$

and

$$d(\mathrm{prod}_i) \sim N(0, \sigma_P^2), \quad \epsilon_i \sim N(0, \sigma^2) \qquad (9-10)$$
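As a sketch, the model (9-8)-(9-10) with independent random coefficients could be specified in lmer as follows (variable names as in the tutorial section below; the version with correlated coefficients is completed in the next paragraph):

# Sketch of (9-8)-(9-10): independent random intercept and slopes per consumer
library(lmerTest)
m_indep <- lmer(Preference ~ Homesize * (sens1 + sens2) +
                (1 | Consumer) + (0 + sens1 | Consumer) +
                (0 + sens2 | Consumer) + (1 | product), data = carrots)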

To finish the specification of a general random coefficient model, we need the assumption of the possibility of correlations between the random coefficients:

$$(a(k), b_1(k), b_2(k)) \sim N\left(0, \begin{pmatrix} \sigma_a^2 & \sigma_{ab_1} & \sigma_{ab_2} \\ \sigma_{ab_1} & \sigma_{b_1}^2 & \sigma_{b_1 b_2} \\ \sigma_{ab_2} & \sigma_{b_1 b_2} & \sigma_{b_2}^2 \end{pmatrix}\right) \qquad (9-11)$$

Before studying the fixed effects, the variance part of the model is investigated further. We give details in the R-tutorial section on how we end up simplifying this 8-parameter variance model down to the following 4-parameter variance model: first the $\sigma_{b_1}^2$ parameter and the two related correlations can be tested non-significant, and after that the correlation between the $b_2$ effect and the intercept (which can make sense here, as the sens2 values are mean centred; see the discussion in the tutorial section):

$$y_i = \alpha(\mathrm{size}_i) + \beta_1(\mathrm{size}_i)\,\mathrm{sens1}_i + \beta_2(\mathrm{size}_i)\,\mathrm{sens2}_i + a(\mathrm{cons}_i) + b_2(\mathrm{cons}_i)\,\mathrm{sens2}_i + d(\mathrm{prod}_i) + \epsilon_i \qquad (9-12)$$

where

$$a(k) \sim N(0, \sigma_a^2), \quad b_2(k) \sim N(0, \sigma_{b_2}^2), \quad k = 1, \ldots, 103 \qquad (9-13)$$

and

$$d(\mathrm{prod}_i) \sim N(0, \sigma_P^2), \quad \epsilon_i \sim N(0, \sigma^2) \qquad (9-14)$$

and where there are no more correlations in the model. The three remaining variance parameters (not counting the residual variance) are now all significant. With this variance structure, we investigate the fixed effects, here showing the results of the automated step function of lmerTest:

                Sum Sq Mean Sq NumDF DenDF F.value elim.num Pr(>F)
Homesize:sens1                                     1
sens1                                              2
Homesize:sens2                                     3
Homesize                                           kept     0.02
sens2                                              kept     0.00

The final model for these data is therefore given by:

$$y_i = \alpha(\mathrm{size}_i) + \beta_2\,\mathrm{sens2}_i + a(\mathrm{cons}_i) + b_2(\mathrm{cons}_i)\,\mathrm{sens2}_i + d(\mathrm{prod}_i) + \epsilon_i \qquad (9-15)$$

where

$$a(k) \sim N(0, \sigma_a^2), \quad b_2(k) \sim N(0, \sigma_{b_2}^2), \quad k = 1, \ldots, 103 \qquad (9-16)$$

and

$$d(\mathrm{prod}_i) \sim N(0, \sigma_P^2), \quad \epsilon_i \sim N(0, \sigma^2) \qquad (9-17)$$

The estimates of the variance parameters and the fixed effects are:

$$\hat\sigma_{b_2}, \quad \hat\sigma_a, \quad \hat\sigma_P, \quad \hat\sigma$$

$$\hat\alpha(\mathrm{Homesize1}) = 4.91, \quad \hat\alpha(\mathrm{Homesize3}) = 4.67, \quad \hat\beta_2 = 0.071$$

With confidence intervals as they come from the confint function:

                    2.5 % 97.5 %
.sig01
.sig02
.sig03
.sigma
Homesize1
Homesize3-Homesize1
sens2

The conclusions regarding the relation between the preference and the sensory variables are that no significant relation was found to sens1, but indeed so for sens2. The relation does not depend on the homesize and is estimated at (with 95% confidence interval):

$$\hat\beta_2 = 0.071, \quad [0.04, 0.10]$$

So two products with a difference of 10 in the second sensory dimension (this is the span in the data set) are expected to differ in average preference by between 0.4 and 1.0. Sweet products are preferred to non-sweet products, cf. figure 9.2 above. The expected values for the two homesizes (for an average product) and their difference are estimated at:

$$\hat\alpha(1) + \hat\beta_2\,\overline{\mathrm{sens2}} = 4.91, \quad [4.73, 5.09]$$
$$\hat\alpha(3) + \hat\beta_2\,\overline{\mathrm{sens2}} = 4.67, \quad [4.47, 4.85]$$
$$\hat\alpha(1) - \hat\alpha(3) = 0.25, \quad [0.04, 0.46]$$

So homes with more persons tend to have a slightly lower preference in general for such carrot products.
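These homesize means and their difference can be extracted directly from the fitted model; a sketch using ls_means and difflsmeans as implemented in recent versions of lmerTest (finalmodel is defined in the tutorial section below):

# Sketch: estimated population means per homesize and their difference
library(lmerTest)
ls_means(finalmodel, which = "Homesize")     # means for Homesize 1 and 3
difflsmeans(finalmodel, which = "Homesize")  # their difference with 95% CI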

9.4 Random coefficient models in perspective

Although the factor structure diagrams, with all the features of finding expected mean squares and degrees of freedom, are only strictly valid for balanced designs and models with no quantitative covariates, they may still be useful as a more informal structure visualization tool for these non-standard situations.

The setting with hierarchical regression data is really an example of what could also be characterized as repeated measures data. A common situation is that repeated measurements on a subject (animal, plant, sample) are taken over time; this is also known as longitudinal data. So, apart from appearing as natural extensions of fixed regression models, random coefficient models are one option for analyzing repeated measures data. The simple models can be extended to polynomial models to cope with non-linear structures in the data. Additional residual correlation structures can also be incorporated. In Modules 11 and 12 a thorough treatment of repeated measures data is given, with a number of different methods, simple as well as more complex.

9.5 R-TUTORIAL: Constructed data

The data file can be downloaded as randcoef.txt and is also described in enote13. The simple linear regression analyses of the two responses y1 and y2 in the data set randcoef are obtained using lm:

randcoef <- read.table("randcoef.txt", sep = ",", header = TRUE)
randcoef$subject <- factor(randcoef$subject)
model1y1 <- lm(y1 ~ x, data = randcoef)
model1y2 <- lm(y2 ~ x, data = randcoef)

The parameter estimates with corresponding standard errors in the two models are:

summary(model1y1)

Call:
lm(formula = y1 ~ x, data = randcoef)

Residuals:
    Min      1Q  Median      3Q     Max

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)                              < 2e-16 ***
x                                       3.41e-09 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4 on 98 degrees of freedom
Multiple R-squared: 0.301,	Adjusted R-squared:
F-statistic: 42.2 on 1 and 98 DF,  p-value: 3.41e-09

summary(model1y2)

Call:
lm(formula = y2 ~ x, data = randcoef)

Residuals:
    Min      1Q  Median      3Q     Max

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)                                 e-12 ***
x                                       1.08e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.53 on 98 degrees of freedom
Multiple R-squared: 0.377,	Adjusted R-squared:
F-statistic: 59.4 on 1 and 98 DF,  p-value: 1.08e-11

The raw scatter plots for the data with superimposed regression lines are obtained using the plot and abline functions:

par(mfrow = c(1, 2))
with(randcoef, {
  plot(x, y1)
  abline(model1y1)
  plot(x, y2)
  abline(model1y2)
})

[Figure: raw scatter plots of y1 and y2 versus x with the fitted regression lines.]

par(mfrow = c(1, 1))

The individual patterns in the data can be seen from the next plot:

par(mfrow = c(1, 2))
with(randcoef, {
  plot(x, y1)
  for (i in 1:10) {lines(x[subject == i], y1[subject == i], lty = i)}
  plot(x, y2)
  for (i in 1:10) {lines(x[subject == i], y2[subject == i], lty = i)}
})

[Figure: individual patterns in data sets 1 and 2.]

par(mfrow = c(1, 1))

The function lines connects points with line segments. Notice how the repetitive plotting is handled with a for loop: for each i between 1 and 10, the relevant subset of the data is plotted with a line type that changes as the subject changes. Alternatively, we could have used 10 separate lines calls for each response.

The fixed effects analysis with the two resulting ANOVA tables is:

model2y1 <- lm(y1 ~ x + subject + x * subject, data = randcoef)
model2y2 <- lm(y2 ~ x + subject + x * subject, data = randcoef)
library(xtable)
print(xtable(anova(model2y1)))
print(xtable(anova(model2y2)))

             Df Sum Sq Mean Sq F value Pr(>F)
x
subject
x:subject
Residuals

             Df Sum Sq Mean Sq F value Pr(>F)
x
subject
x:subject
Residuals

A plot of the data with individual regression lines based on model2y1 and model2y2 is again produced using a for loop. First we fit the two models in a different parameterisation (to obtain the estimates in a convenient form of one intercept and one slope per subject):

model3y1 <- lm(y1 ~ subject + x * subject - x - 1, data = randcoef)
model3y2 <- lm(y2 ~ subject + x * subject - x - 1, data = randcoef)

The plots are produced using:

par(mfrow = c(1, 2))
with(randcoef, {
  plot(x, y1)
  for (i in 1:10) {abline(coef(model3y1)[c(i, i + 10)], lty = i)}
  plot(x, y2)
  for (i in 1:10) {abline(coef(model3y2)[c(i, i + 10)], lty = i)}
})

[Figure: data with individual regression lines for data sets 1 and 2.]

par(mfrow = c(1, 1))

Explanation: Remember that coef extracts the parameter estimates. Here the first 10 estimates will be the intercept estimates and the next 10 will be the slope estimates. Thus the component pairs (1, 11), (2, 12), ..., (10, 20) belong to subjects 1, 2, ..., 10, respectively. This is exploited in the for loop in the part [c(i, i+10)], which produces these pairs as i runs from 1 to 10.

The equal slopes model for the second data set with parameter estimates is:

model4y2 <- lm(y2 ~ subject + x, data = randcoef)
summary(model4y2)

Call:
lm(formula = y2 ~ subject + x, data = randcoef)

Residuals:
    Min      1Q  Median      3Q     Max

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)                                      ***
subject2                                         *
subject3
subject4
subject5
subject6
subject7
subject8
subject9                                         **
subject10
x                                           e-13 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.15 on 89 degrees of freedom
Multiple R-squared: 0.524,	Adjusted R-squared:
F-statistic: 9.81 on 10 and 89 DF,  p-value: 7.23e-11

The summary of the two-step analysis can be obtained by applying the functions mean and sd (computing the empirical mean and standard deviation of a vector, respectively) to the vector of intercept estimates and to the vector of slope estimates (from the different slopes models), performing the computations described earlier in this module. Here it is shown for data set 1; data set 2 is handled similarly:

ainty1  <- mean(coef(model3y1)[1:10])
sdinty1 <- sd(coef(model3y1)[1:10]) / sqrt(10)
uinty1  <- ainty1 + 2.26 * sdinty1
linty1  <- ainty1 - 2.26 * sdinty1
asloy1  <- mean(coef(model3y1)[11:20])
sdsloy1 <- sd(coef(model3y1)[11:20]) / sqrt(10)
usloy1  <- asloy1 + 2.26 * sdsloy1
lsloy1  <- asloy1 - 2.26 * sdsloy1
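The same two-step summary can be sketched more compactly with lmList from lme4 (not used elsewhere in this note):

# Sketch: per-subject fits in one call; columns of coef(fits) are intercept, slope
library(lme4)
fits <- lmList(y1 ~ x | subject, data = randcoef)
colMeans(coef(fits))                 # average intercept and slope
apply(coef(fits), 2, sd) / sqrt(10)  # their standard errors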

The correlation between the intercepts and the slopes in each data set is computed using cor:

cor(coef(model3y1)[1:10], coef(model3y1)[11:20])

[1] 0.382

cor(coef(model3y2)[1:10], coef(model3y2)[11:20])

[1] 0.655

The random coefficients analysis is done with lmer. The different slopes random coefficient model is:

library(lmerTest)
model5y1 <- lmer(y1 ~ x + (1 + x | subject), data = randcoef)
model5y2 <- lmer(y2 ~ x + (1 + x | subject), data = randcoef)

In the random part (1 + x | subject), the terms before the | are assigned random effects for each level of the factor after the |. One way to think about this is that 1 is multiplied by subject and x is multiplied by subject, yielding the terms 1*subject + x*subject, which corresponds to the random part in formula (9-2). The (fixed effects) parameter estimates and their standard errors are obtained from the model summary:

summodel5y1 <- summary(model5y1)
summodel5y2 <- summary(model5y2)
print(xtable(summodel5y1$coefficients))

            Estimate Std. Error df t value Pr(>|t|)
(Intercept)
x

print(xtable(summodel5y2$coefficients))

            Estimate Std. Error df t value Pr(>|t|)
(Intercept)
x

The variance parameter estimates, including the correlation between intercept and slope, are obtained using:

summodel5y1$varcor

 Groups   Name        Std.Dev. Corr
 subject  (Intercept)
          x
 Residual

summodel5y2$varcor

 Groups   Name        Std.Dev. Corr
 subject  (Intercept)
          x
 Residual

The equal slopes models within the random coefficient framework are specified as:

model6y1 <- lmer(y1 ~ x + (1 | subject), data = randcoef)
model6y2 <- lmer(y2 ~ x + (1 | subject), data = randcoef)

Likelihood ratio tests for the reduction from different slopes to equal slopes can be obtained using anova with two lmer result objects as arguments (the first model is less general than the second):

print(xtable(anova(model6y1, model5y1, refit = FALSE)))

       Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
object
..1

print(xtable(anova(model6y2, model5y2, refit = FALSE)))

       Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
object
..1

Confidence intervals for the relevant final models may be obtained by:

print(xtable(confint(model6y2)))

            2.5 % 97.5 %
.sig01
.sigma
(Intercept)
x

print(xtable(confint(model5y1)))

            2.5 % 97.5 %
.sig01
.sig02
.sig03            Inf
.sigma       0.00 Inf
(Intercept)
x

In the latter case, three of the variance parameters cannot be profiled (only for the subject main effect variance component is a finite confidence interval found). This is not necessarily a problem, as the likelihood, and the CIs and tests for the fixed effects, still make good sense. The (fixed effects) parameter estimates for the final model for data set 2 are:

print(xtable(summary(model6y2)$coefficients))

            Estimate Std. Error df t value Pr(>|t|)
(Intercept)
x

9.6 R-TUTORIAL: Consumer preference mapping of carrots

The data file can be downloaded as carrots.txt and is also described in enote13.

Recall that the most general model ((9-8) to (9-11) above) states that, for each level of Consumer, the random intercept and the random slopes of sens1 and sens2 are correlated in an arbitrary way (the specification in (9-11)). It can be specified as follows:

carrots <- read.table("carrots.txt", header = TRUE, sep = ",")
carrots$Homesize <- factor(carrots$Homesize)
carrots$Consumer <- factor(carrots$Consumer)
carrots$product <- factor(carrots$product)
model1 <- lmer(Preference ~ Homesize + sens1 + sens2 + Homesize * sens1 +
               Homesize * sens2 + (1 | product) + (1 + sens1 + sens2 | Consumer),
               data = carrots)
summary(model1)

Linear mixed model fit by REML [merModLmerTest]
Formula: Preference ~ Homesize + sens1 + sens2 + Homesize * sens1 + Homesize * sens2 +
    (1 | product) + (1 + sens1 + sens2 | Consumer)
   Data: carrots

REML criterion at convergence: 3748

Scaled residuals:
   Min     1Q Median     3Q    Max

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Consumer (Intercept)
          sens1
          sens2
 product  (Intercept)
 Residual
Number of obs: 1233, groups: Consumer, 103; product, 12

Fixed effects:
                Estimate Std. Error df t value Pr(>|t|)
(Intercept)                                      <2e-16 ***
Homesize3                                               *
sens1
sens2                                                   **
Homesize3:sens1
Homesize3:sens2
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Homsz3 sens1  sens2  Hms3:1
Homesize3
sens1
sens2
Homsz3:sns1
Homsz3:sns2

The random part deserves some explanation. The structure (9-11) amounts to the term (1 + sens1 + sens2 | Consumer): for each level of Consumer we have 3 random effects, one intercept and two slopes, and they are arbitrarily correlated. In addition there is the random effect of product.

Let us check what the step function of lmerTest can tell us about the random effects of this model:

mystep <- step(model1)
print(xtable(mystep$rand.table))

               Chi.sq Chi.DF elim.num p.value
sens1:Consumer                1
product                       kept     0.00
sens2:Consumer                kept     0.02

We note that the sens1:Consumer effect is tested with 3 degrees of freedom (and is non-significant, hence eliminated).

This is because eliminating this term from the model means that the variance AND the correlations between this coefficient and the sens2 coefficients and the intercepts are all assumed to be zero. This is the elimination principle implemented here.

Remark 9.1 Random coefficient correlations

Generally it is recommended to include these correlations in the models (this is also what R does for us when the model is specified as shown above). The reason is that correlations between the x's will induce such correlations between the coefficients by construction, and hence it would be wrong not to allow for them in the model. The basic example is a non-centred x in a regression, which will lead to a relation between the slope and the intercept. However, IF the x is centred (and hence the x has correlation zero with the "constant"), this relation disappears. And generally, if the x's are independent (orthogonal), then models with independent coefficients could make sense and could be a reasonable approach to stabilize the random effect part of the model. In this case sens1 and sens2 are in fact both mean centred and independent by construction (scores from a principal component analysis), but let us check:

mean(carrots$sens1)

[1] 6.667e-11

mean(carrots$sens2)

[1] -7.5e-11

cor(carrots$sens1, carrots$sens2)

[1] -1.93e-11
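The effect described in the remark can be illustrated with a small simulation (a sketch with made-up data, not part of the examples in this note):

# Simulation sketch: per-subject OLS (intercept, slope) estimates are typically
# clearly correlated when x is not centred, and close to uncorrelated when it is
set.seed(1)
x <- rep(1:10, times = 10)
subj <- rep(1:10, each = 10)
y <- 2 + 0.5 * x + rep(rnorm(10), each = 10) + rnorm(100)
co  <- t(sapply(1:10, function(k) coef(lm(y ~ x, subset = subj == k))))
coc <- t(sapply(1:10, function(k) coef(lm(y ~ I(x - mean(x)), subset = subj == k))))
cor(co[, 1], co[, 2])    # non-centred x: markedly negative
cor(coc[, 1], coc[, 2])  # centred x: close to zero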

The models with (A) and without (B) correlation between the intercepts and the sens2 slopes (the variance structure of model (9-12) above) are specified as follows (note the difference in R syntax for the random effects):

model2a <- lmer(Preference ~ Homesize + sens1 + sens2 + Homesize * sens1 +
                Homesize * sens2 + (1 | product) + (1 + sens2 | Consumer),
                data = carrots)
model2b <- lmer(Preference ~ Homesize + sens1 + sens2 + Homesize * sens1 +
                Homesize * sens2 + (1 | product) + (1 | Consumer) +
                (0 + sens2 | Consumer), data = carrots)
print(xtable(anova(model2a, model2b, refit = FALSE)))

       Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
object
..1

So we do not need the correlation in the model. We could without any problems use the results of the step call above, or we could redo the elimination by applying the step function to the model2b fit:

mystep2 <- step(model2b)

Warning: Model failed to converge with max|grad| =  (tol = 0.002)
Warning: Model failed to converge with max|grad| =  (tol = 0.002)

print(xtable(mystep2$rand.table))

               Chi.sq Chi.DF elim.num p.value
product                       kept     0.00
Consumer                      kept     0.00
sens2:Consumer                kept     0.01

Now we also see a test for the random main (intercept) effect of Consumer, which was not part of the above.

The warnings do not worry us too much here: in one or two of the models the convergence just barely failed on one of the convergence criteria, and clearly it was pretty close. There are ways to work with various optimizer options, including extending the number of iterations etc., but we will not pursue these here. Instead we check that the final model converges:

finalmodel <- lmer(Preference ~ Homesize + sens2 + (1 | product) +
                   (1 | Consumer) + (0 + sens2 | Consumer), data = carrots)

Having reduced the covariance structure of the model, we turn attention to the mean structure, i.e. the fixed effects:

print(xtable(mystep2$anova.table))

                Sum Sq Mean Sq NumDF DenDF F.value elim.num Pr(>F)
Homesize:sens1                                     1
sens1                                              2
Homesize:sens2                                     3
Homesize                                           kept     0.02
sens2                                              kept     0.00

And various model parameter summaries and post hoc comparisons:

VarCorr(mystep2$model)

 Groups     Name        Std.Dev.
 Consumer   sens2
 Consumer.1 (Intercept)
 product    (Intercept)
 Residual

print(xtable(confint(mystep2$model)))

                    2.5 % 97.5 %
.sig01
.sig02
.sig03
.sigma
Homesize1
Homesize3-Homesize1
sens2

print(xtable(mystep$lsmeans))

           Estimate Standard Error DF t-value Lower CI Upper CI p-value
Homesize 1
Homesize 3

print(xtable(mystep$diffs.lsmeans))

             Estimate Standard Error DF t-value Lower CI Upper CI p-value
Homesize 1-3

9.7 Exercises

Exercise 1: Carrots data

Consider the carrots data of this module. The data file can be downloaded as carrots.txt and is also described in enote13. Carry out a similar analysis using (at least) one of the other three response variables (Sweetness, Bitter or Crisp) instead of the preference. Try to include (at least) one other background variable than the homesize, e.g. gender.
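A possible starting point (a sketch only; the column names Sweetness and Gender are assumed from the data description in enote13):

# Sketch for Exercise 1: another response and background variable, same structure
library(lmerTest)
ex1 <- lmer(Sweetness ~ Gender * (sens1 + sens2) + (1 | product) +
            (1 + sens1 + sens2 | Consumer), data = carrots)
step(ex1)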


1. Estimation equations for strip transect sampling, using notation consistent with that used to Web-based Supplementary Materials for Line Transect Methods for Plant Surveys by S.T. Buckland, D.L. Borchers, A. Johnston, P.A. Henrys and T.A. Marques Web Appendix A. Introduction In this on-line appendix,

More information

Data Mining: Concepts and Techniques. (3 rd ed.) Chapter 3. Chapter 3: Data Preprocessing. Major Tasks in Data Preprocessing

Data Mining: Concepts and Techniques. (3 rd ed.) Chapter 3. Chapter 3: Data Preprocessing. Major Tasks in Data Preprocessing Data Mining: Concepts and Techniques (3 rd ed.) Chapter 3 1 Chapter 3: Data Preprocessing Data Preprocessing: An Overview Data Quality Major Tasks in Data Preprocessing Data Cleaning Data Integration Data

More information

The NESTED Procedure (Chapter)

The NESTED Procedure (Chapter) SAS/STAT 9.3 User s Guide The NESTED Procedure (Chapter) SAS Documentation This document is an individual chapter from SAS/STAT 9.3 User s Guide. The correct bibliographic citation for the complete manual

More information

Additional Issues: Random effects diagnostics, multiple comparisons

Additional Issues: Random effects diagnostics, multiple comparisons : Random diagnostics, multiple Austin F. Frank, T. Florian April 30, 2009 The dative dataset Original analysis in Bresnan et al (2007) Data obtained from languager (Baayen 2008) Data describing the realization

More information

Quantitative - One Population

Quantitative - One Population Quantitative - One Population The Quantitative One Population VISA procedures allow the user to perform descriptive and inferential procedures for problems involving one population with quantitative (interval)

More information

Poisson Regression and Model Checking

Poisson Regression and Model Checking Poisson Regression and Model Checking Readings GH Chapter 6-8 September 27, 2017 HIV & Risk Behaviour Study The variables couples and women_alone code the intervention: control - no counselling (both 0)

More information

CSSS 510: Lab 2. Introduction to Maximum Likelihood Estimation

CSSS 510: Lab 2. Introduction to Maximum Likelihood Estimation CSSS 510: Lab 2 Introduction to Maximum Likelihood Estimation 2018-10-12 0. Agenda 1. Housekeeping: simcf, tile 2. Questions about Homework 1 or lecture 3. Simulating heteroskedastic normal data 4. Fitting

More information

STENO Introductory R-Workshop: Loading a Data Set Tommi Suvitaival, Steno Diabetes Center June 11, 2015

STENO Introductory R-Workshop: Loading a Data Set Tommi Suvitaival, Steno Diabetes Center June 11, 2015 STENO Introductory R-Workshop: Loading a Data Set Tommi Suvitaival, tsvv@steno.dk, Steno Diabetes Center June 11, 2015 Contents 1 Introduction 1 2 Recap: Variables 2 3 Data Containers 2 3.1 Vectors................................................

More information

Workshop 8: Model selection

Workshop 8: Model selection Workshop 8: Model selection Selecting among candidate models requires a criterion for evaluating and comparing models, and a strategy for searching the possibilities. In this workshop we will explore some

More information

Generalized Additive Models

Generalized Additive Models :p Texts in Statistical Science Generalized Additive Models An Introduction with R Simon N. Wood Contents Preface XV 1 Linear Models 1 1.1 A simple linear model 2 Simple least squares estimation 3 1.1.1

More information

An introduction to SPSS

An introduction to SPSS An introduction to SPSS To open the SPSS software using U of Iowa Virtual Desktop... Go to https://virtualdesktop.uiowa.edu and choose SPSS 24. Contents NOTE: Save data files in a drive that is accessible

More information

Stat 500 lab notes c Philip M. Dixon, Week 10: Autocorrelated errors

Stat 500 lab notes c Philip M. Dixon, Week 10: Autocorrelated errors Week 10: Autocorrelated errors This week, I have done one possible analysis and provided lots of output for you to consider. Case study: predicting body fat Body fat is an important health measure, but

More information

Recent advances in Metamodel of Optimal Prognosis. Lectures. Thomas Most & Johannes Will

Recent advances in Metamodel of Optimal Prognosis. Lectures. Thomas Most & Johannes Will Lectures Recent advances in Metamodel of Optimal Prognosis Thomas Most & Johannes Will presented at the Weimar Optimization and Stochastic Days 2010 Source: www.dynardo.de/en/library Recent advances in

More information

Stat 4510/7510 Homework 4

Stat 4510/7510 Homework 4 Stat 45/75 1/7. Stat 45/75 Homework 4 Instructions: Please list your name and student number clearly. In order to receive credit for a problem, your solution must show sufficient details so that the grader

More information

An Experiment in Visual Clustering Using Star Glyph Displays

An Experiment in Visual Clustering Using Star Glyph Displays An Experiment in Visual Clustering Using Star Glyph Displays by Hanna Kazhamiaka A Research Paper presented to the University of Waterloo in partial fulfillment of the requirements for the degree of Master

More information

The Truth behind PGA Tour Player Scores

The Truth behind PGA Tour Player Scores The Truth behind PGA Tour Player Scores Sukhyun Sean Park, Dong Kyun Kim, Ilsung Lee May 7, 2016 Abstract The main aim of this project is to analyze the variation in a dataset that is obtained from the

More information

mcssubset: Efficient Computation of Best Subset Linear Regressions in R

mcssubset: Efficient Computation of Best Subset Linear Regressions in R mcssubset: Efficient Computation of Best Subset Linear Regressions in R Marc Hofmann Université de Neuchâtel Cristian Gatu Université de Neuchâtel Erricos J. Kontoghiorghes Birbeck College Achim Zeileis

More information

This is called a linear basis expansion, and h m is the mth basis function For example if X is one-dimensional: f (X) = β 0 + β 1 X + β 2 X 2, or

This is called a linear basis expansion, and h m is the mth basis function For example if X is one-dimensional: f (X) = β 0 + β 1 X + β 2 X 2, or STA 450/4000 S: February 2 2005 Flexible modelling using basis expansions (Chapter 5) Linear regression: y = Xβ + ɛ, ɛ (0, σ 2 ) Smooth regression: y = f (X) + ɛ: f (X) = E(Y X) to be specified Flexible

More information

Linear Modeling with Bayesian Statistics

Linear Modeling with Bayesian Statistics Linear Modeling with Bayesian Statistics Bayesian Approach I I I I I Estimate probability of a parameter State degree of believe in specific parameter values Evaluate probability of hypothesis given the

More information

Bivariate Linear Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017

Bivariate Linear Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017 Bivariate Linear Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 4, 217 PDF file location: http://www.murraylax.org/rtutorials/regression_intro.pdf HTML file location:

More information

range: [1,20] units: 1 unique values: 20 missing.: 0/20 percentiles: 10% 25% 50% 75% 90%

range: [1,20] units: 1 unique values: 20 missing.: 0/20 percentiles: 10% 25% 50% 75% 90% ------------------ log: \Term 2\Lecture_2s\regression1a.log log type: text opened on: 22 Feb 2008, 03:29:09. cmdlog using " \Term 2\Lecture_2s\regression1a.do" (cmdlog \Term 2\Lecture_2s\regression1a.do

More information

A Course in Machine Learning

A Course in Machine Learning A Course in Machine Learning Hal Daumé III 13 UNSUPERVISED LEARNING If you have access to labeled training data, you know what to do. This is the supervised setting, in which you have a teacher telling

More information

Unit 5 Logistic Regression Practice Problems

Unit 5 Logistic Regression Practice Problems Unit 5 Logistic Regression Practice Problems SOLUTIONS R Users Source: Afifi A., Clark VA and May S. Computer Aided Multivariate Analysis, Fourth Edition. Boca Raton: Chapman and Hall, 2004. Exercises

More information

The lmekin function. Terry Therneau Mayo Clinic. May 11, 2018

The lmekin function. Terry Therneau Mayo Clinic. May 11, 2018 The lmekin function Terry Therneau Mayo Clinic May 11, 2018 1 Background The original kinship library had an implementation of linear mixed effects models using the matrix code found in coxme. Since the

More information

5.5 Regression Estimation

5.5 Regression Estimation 5.5 Regression Estimation Assume a SRS of n pairs (x, y ),..., (x n, y n ) is selected from a population of N pairs of (x, y) data. The goal of regression estimation is to take advantage of a linear relationship

More information