enote 9 Random coefficients models

Contents

9 Random coefficients models
  9.1 Introduction
  9.2 Example: Constructed data
    9.2.1 Simple regression analysis
    9.2.2 Fixed effects analysis
    9.2.3 Two step analysis
    9.2.4 Random coefficient analysis
  9.3 Example: Consumer preference mapping of carrots
  9.4 Random coefficient models in perspective
  9.5 R-TUTORIAL: Constructed data
  9.6 R-TUTORIAL: Consumer preference mapping of carrots
  9.7 Exercises

9.1 Introduction

Random coefficient models emerge as natural mixed model extensions of simple linear regression models in a hierarchical (nested) data setup. In the standard situation, we are interested in the relationship between x and y.

Assume we have observations (x_1, y_1), ..., (x_n, y_n) for a single subject. Then we would fit the linear regression model

y_j = α + β x_j + ε_j

Assume next that such regression data are available on a number of subjects. A model that expresses a different regression line for each subject is then

y_{ij} = α_i + β_i x_{ij} + ε_{ij}

or, using the more general notation,

y_i = α(subject_i) + β(subject_i) x_i + ε_i    (9-1)

This model has the same structure as the different slopes ANCOVA model of the previous enote, only now the regression relationships are in focus.

Assume finally that the interest lies in the average relationship across subjects. A commonly used ad hoc approach is to employ a two-step procedure:

1. Carry out a regression analysis for each subject.
2. Do subsequent calculations on the parameter estimates from these regression analyses to obtain the average slope (and intercept) and their standard errors.

Since the latter treats the subjects as a random sample, it would be natural to incorporate this in the model by assuming the subject effects (intercepts and slopes) to be random:

y_i = a(subject_i) + b(subject_i) x_i + ε_i

where

a(k) ~ N(α, σ_a²),  b(k) ~ N(β, σ_b²),  ε_i ~ N(0, σ²)

and where k = 1, ..., K with K being the number of subjects. The parameters α and β are the unknown population values for the intercept and slope. This is a mixed model, although a few additional considerations are required to identify the typical mixed model expression. The expected value is

E(y_i) = α + β x_i

and the variance is

Var(y_i) = σ_a² + σ_b² x_i² + σ²

So an equivalent way of writing the model, in which the fixed and the random parts are split, is:

y_i = α + β x_i + a(subject_i) + b(subject_i) x_i + ε_i    (9-2)

where

a(k) ~ N(0, σ_a²),  b(k) ~ N(0, σ_b²),  ε_i ~ N(0, σ²)    (9-3)

Now the linear mixed model structure is apparent. Although we do not always state this explicitly, there is the additional assumption that the random effects a(k), b(k) and ε_i are mutually independent. For randomly varying lines (a(k), b(k)) in the same x-domain this may be an unreasonable assumption, since the slope and intercept values may very well be related to each other. It is possible to extend the model to allow for such a correlation/covariance between the intercept and slope by assuming a bivariate normal distribution for each set of line parameters:

(a(k), b(k)) ~ N(0, [ σ_a²  σ_ab ; σ_ab  σ_b² ]),  ε_i ~ N(0, σ²)    (9-4)

The model given by (9-2) and (9-4) is the standard random coefficient mixed model.
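For readers who want a preview of the computations, the following is a minimal sketch of how this standard random coefficient model could be specified with lmer (the R-TUTORIAL sections below go through the actual analyses in detail); the data frame dat and its columns y, x and subject are placeholders:

library(lmerTest)  # loads lme4 and adds tests for the fixed effects
# correlated random intercept and slope per subject, as in (9-2) and (9-4)
rc_fit <- lmer(y ~ x + (1 + x | subject), data = dat)
summary(rc_fit)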

9.2 Example: Constructed data

To illustrate the basic principles we start with two constructed data sets of 100 observations of y for 10 different x-values, see Figure 9.1. They illustrate that a raw scatter plot of a data set can hide quite different structures if the data are in fact hierarchical (repeated observations on each individual rather than exactly one observation per individual).

9.2.1 Simple regression analysis

Had the data NOT been hierarchical, but instead observations on 100 different subjects, a simple regression analysis corresponding to the model

y_i = α + β x_i + ε_i    (9-5)

where ε_i ~ N(0, σ²), i = 1, ..., 100, would be a reasonable approach. For comparison we state the results of such an analysis for the two data sets. The parameter estimates are:

            Data 1                           Data 2
Parameter   Estimate   SE       P-value      Estimate   SE       P-value
σ²          15.9899                          20.5229
α           10.7280    0.8638                7.8356     0.9786
β           0.90461    0.1392   <0.0001      1.21519    0.1577   <0.0001

Figure 9.1: Constructed data. Top: data set 1, bottom: data set 2. Left: raw scatter plot with simple regression line, middle: individual patterns, right: individual lines.

See Figure 9.1 (left) for the estimated lines.

9.2.2 Fixed effects analysis

If we had special interest in the 10 subjects, a fixed effects analysis corresponding to model (9-1) could be carried out. The F-tests and P-values from the Type I (successive) ANOVA tables become:

                 Data set 1               Data set 2
Source     DF    F          P-value       F        P-value
x           1    9220.98    <.0001        70.74    <.0001
subject     9    2091.49    <.0001        3.07     0.0033
x*subject   9    277.71     <.0001        1.02     0.4311

For data set 1 the slopes are clearly different, whereas for data set 2 the slopes can be assumed equal but the intercepts (subjects) are different. Although it is usually recommended to rerun the analysis without an insignificant interaction effect, the Type I table shows that the result of this will clearly be that the subject (intercept) effect is significant for data set 2, cf. the discussion of Type I/Type III tables in enote 3.

So for data set 1, the (fixed effect) story is told by providing the 10 intercept and slope estimates, and/or possibly as described for the different slopes ANCOVA model in the previous enote. For data set 2, an equal slopes ANCOVA model can be used to summarize the results. The common slope and error variance estimates are:

ˆβ = 1.2152,  SE(ˆβ) = 0.1446,  ˆσ² = 17.2582

The confidence interval for the common slope, using the 89 error degrees of freedom, becomes 1.2152 ± t_{0.975}(89) · 0.1446, which, since t_{0.975}(89) = 1.987, gives [0.9279, 1.5025]. The subjects could be described and compared as for the common slopes ANCOVA model of the previous enote.

9.2.3 Two step analysis

If the interest is NOT in the individual subjects but rather in the average line, then a natural ad hoc approach is simply to start by calculating the individual intercepts and slopes and then subsequently treat those as simple random samples: calculate the average, variance and standard error to obtain confidence limits for the population average values. So e.g. for the slopes we have ˆβ_1, ..., ˆβ_10 and calculate the average

ˆβ = (1/10) Σ_{i=1}^{10} ˆβ_i

the variance

s²_ˆβ = (1/9) Σ_{i=1}^{10} (ˆβ_i − ˆβ)²

and the standard error

SE_ˆβ = s_ˆβ / √10.

From these we obtain the 95% confidence interval (using that t_{0.975}(9) = 2.26):

ˆβ ± 2.26 · SE_ˆβ

The variances for data set 1 are:

s²_ˆα = 16.2779,  s²_ˆβ = 0.2465

and for data set 2:

s²_ˆα = 8.5663,  s²_ˆβ = 0.2130

The results for the intercepts and slopes for the two data sets are given in the following table:

           Data set 1            Data set 2
           α         β           α         β
Average    10.7279   0.9046      7.8356    1.2152
SE         1.2759    0.1570      0.9255    0.1460
Lower      7.8416    0.5495      5.7419    0.8850
Upper      13.6142   1.2597      9.9293    1.5454

Note that for data set 2, the standard error for the slope is almost identical to the standard error from the fixed effect equal slopes model above. However, due to the smaller degrees of freedom, 9 instead of 89, the confidence interval is somewhat wider here. This reflects the difference in interpretation: in the fixed effects analysis β estimates the common slope for these specific 10 subjects, whereas here the estimate is of the population average slope (for the population from which these 10 subjects were sampled). This distinction does not alter the estimate itself, but it does change the statistical inference that is made.

Note, by the way, that for estimating the individual lines it does not make a difference whether an overall different slopes model is used or 10 individual ("small") regression models are fitted separately. Although not used here, the observed correlation between the intercepts and the slopes can be found for each data set (cf. the R-TUTORIAL output below):

corr_1 = −0.382,  corr_2 = −0.655
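A minimal sketch of the two-step computation described above, assuming a data frame dat with columns y, x and subject (the R-TUTORIAL below obtains the same quantities from a single lm() fit with a per-subject parameterisation):

# Step 1: one simple regression per subject
slopes <- sapply(split(dat, dat$subject),
                 function(d) coef(lm(y ~ x, data = d))["x"])
# Step 2: treat the subject-wise slope estimates as a simple random sample
K  <- length(slopes)
se <- sd(slopes) / sqrt(K)
mean(slopes) + c(-1, 1) * qt(0.975, df = K - 1) * se   # 95% confidence interval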

9.2.4 Random coefficient analysis

The results of fitting the random coefficient model given by (9-2) and (9-4) to each data set are given in the following table:

           Data set 1            Data set 2
           α         β           α         β
Estimate   10.7279   0.9046      7.8356    1.2152
SE         1.2759    0.1570      0.9255    0.1460
Lower      7.8416    0.5495      5.7419    0.8850
Upper      13.6142   1.2597      9.9293    1.5454

Note that this table is an exact copy of the result table for the two-step analysis above! The parameters of the variance part of the mixed model for data set 1 are estimated at (read off from the R output):

ˆσ_a = 4.031,  ˆσ_b = 0.496,  ˆρ_ab = −0.38,  ˆσ = 0.271

which corresponds to the following variances:

ˆσ²_a = 16.25,  ˆσ²_b = 0.246,  ˆσ² = 0.073

and for data set 2:

ˆσ_a = 1.086,  ˆσ_b = 0.147,  ˆρ_ab = 1.00,  ˆσ = 4.132

which corresponds to the following variances:

ˆσ²_a = 1.18,  ˆσ²_b = 0.022,  ˆσ² = 17.07

Compare with the variances calculated in the two-step procedure: for data set 1 the random coefficient model estimates are slightly smaller, whereas for data set 2 they are considerably smaller. This makes good sense, as the variances in the two-step procedure will also include some additional variation due to the residual error variance (just like the mean squares in a standard hierarchical model). For data set 1 this residual error variance is estimated at a very small value (0.0732), whereas for data set 2 it is 17.07. This illustrates how the random coefficient model provides the proper story about what is going on, and directly distinguishes between the two quite different situations exemplified here.

Note also that for data set 1, the correlation estimate ˆρ_ab = −0.38 is close to the observed correlation calculated in the two-step procedure.

However, for data set 2 the estimated correlation becomes ˆρ_ab = 1!!! This obviously makes no sense! We encounter a situation similar to the negative variance problem discussed previously. The correlation may become meaningless when some of the variances are estimated to be very small, which is the case for the slope variance here. To put it differently, for data set 2 the model we have specified includes components (in the variance) that are not actually present in the data. We already knew this, since the equal slopes model was a reasonable description of these data. In the random coefficient framework the equal slopes model is expressed by

y_i = α + β x_i + a(subject_i) + ε_i    (9-6)

where

a(k) ~ N(0, σ_a²),  ε_i ~ N(0, σ²)    (9-7)

The adequacy of this model can be tested by a residual (REML) likelihood ratio test, cf. enote 5. For data set 2 we obtain

G = −2 l_{REML,1} − (−2 l_{REML,2}) = 0.65

which is non-significant using a χ² distribution with 2 degrees of freedom. For data set 1 the similar test becomes

G = −2 l_{REML,1} − (−2 l_{REML,2}) = 249.9

which is extremely significant.

For data set 2 the conclusions should be based on the equal slopes model given by (9-6) and (9-7), and we obtain the following:

           Data set 2
           α         β
Estimate   7.8356    1.2152
SE         1.0774    0.1446
Lower      5.6544    0.9278
Upper      10.0168   1.5026

We see a minor change in the confidence intervals: believing in equal slopes increases the (estimated) precision (smaller confidence interval) for the slope, whereas the precision of the average intercept decreases.
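As a sketch of how the residual likelihood ratio statistic G used above could be computed directly from two nested lmer fits (both fitted by REML, the lmer default; the model names are placeholders, and the anova() comparison shown in the R-TUTORIAL below performs the same test):

# G = difference in -2 log REML-likelihood between the reduced and the full model
G <- as.numeric(2 * (logLik(different_slopes_fit) - logLik(equal_slopes_fit)))
pchisq(G, df = 2, lower.tail = FALSE)   # p-value from the chi-square(2) reference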

9.3 Example: Consumer preference mapping of carrots

In a consumer study 103 consumers scored their preference of 12 Danish carrot types on a scale from 1 to 7. The carrots were harvested in autumn 1996 and tested in March 1997. A number of background information variables were recorded for each consumer, see the data description in enote 13 for details. Data are available in the carrots.txt file.

The aim of a so-called external preference mapping is to find the sensory drivers of the consumer preference behaviour and to investigate whether these differ between segments of the population. To do this, in addition to the consumer survey, the carrot products are evaluated by a trained panel of tasters, the sensory panel, with respect to a number of sensory (taste, odour and texture) properties. Since usually a high number of (correlated) properties (variables) are used, in this case 14, it is common procedure to use a few, often 2, combined variables that contain as much of the information in the sensory variables as possible. This is achieved by extracting the first two principal components in a principal components analysis (PCA) on the product-by-property panel average data matrix. PCA is a commonly used multivariate technique to explore and/or decompose high dimensional data. We call these two variables sens1 and sens2 and they are given by

sens1_i = Σ_{j=1}^{14} a_j v^i_j   and   sens2_i = Σ_{j=1}^{14} b_j v^i_j

where v^i_1, ..., v^i_14 are the 14 average sensory scores for carrot product i, and the coefficients a_j and b_j defining the two combined sensory variables are as depicted in Figure 9.2. So sens1 is a variable that (primarily) measures bitterness vs. nutty taste, whereas sens2 measures sweetness (and related properties).

Figure 9.2: Loadings plot for the PCA of the sensory variables: scatter plot of the coefficients b_j versus a_j.
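As an illustration of how such combined variables could be constructed, a minimal sketch of the PCA step, assuming a 12 × 14 matrix panel_means of product-by-property panel averages (this matrix name and the centering/scaling choices are placeholders; the carrots.txt file used in the R-TUTORIAL already contains sens1 and sens2, so this step is not needed for the analysis below):

pca <- prcomp(panel_means, center = TRUE, scale. = FALSE)
sens_scores <- pca$x[, 1:2]          # sens1 and sens2: one pair of scores per product
loadings    <- pca$rotation[, 1:2]   # the coefficients a_j and b_j shown in Figure 9.2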

The actual preference mapping is carried out by first fitting regression models for the preference as a function of the sensory variables for each individual consumer, using the 12 observations across the carrot products. Next, the individual regression coefficients are investigated, often in an explorative manner in which a scatter plot is used to look for a possible segmentation of the consumers based on these regression coefficients. Instead of looking for segmentation ("cluster analysis") we investigate whether we see any differences with respect to the background variables in the data, e.g. the gender or homesize (number of persons in the household).

Let y_i be the ith preference score. The natural model for this is a model that expresses randomly varying individual relations to the sensory variables, but with average (expected) values that may depend on the homesize.

Let us consider the factor structure of the setting. The basic setting is a randomized block experiment with 12 treatments (carrot products), the factor prod, and 103 blocks (consumers), the factor cons. Homesize (size) is a factor that partitions the consumers into two groups: those with a homesize of 1 or 2, and those with a larger homesize. So the factor cons is nested within size, or equivalently, size is coarser than cons. This basic structure is depicted in Figure 9.3. The linear effect of the sensory variables is a part of the prod effect, since these covariates are on product level, so they are both coarser than the product effect.
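These structural claims are easy to verify directly on the data (read into a data frame called carrots as in the R-TUTORIAL below); a small sketch:

# Each consumer belongs to exactly one Homesize group (cons is nested within size)
all(rowSums(table(carrots$Consumer, carrots$Homesize) > 0) == 1)
# sens1 and sens2 take a single value per product (product-level covariates)
all(tapply(carrots$sens1, carrots$product, function(v) length(unique(v))) == 1)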

Figure 9.3: The factor structure diagram for the carrots data.

The sensory variables in the model will therefore explain some of the product differences. Including prod in the model as well will enable us to test whether the sensory variables can explain all the product differences. As we do not expect this to be the case, we adopt the point of view that the 12 carrot products are a random sample from the population of carrot products in Denmark, that is, the product effect is considered a random effect. In other words, we consider the deviations of the product variation from what can be explained by the regression on the sensory variables as random variation. Finally, the interactions between homesize and the sensory variables should enter the model as fixed effects, allowing for different average slopes for the two homesizes, leading to the model given by

y_i = α(size_i) + β_1(size_i) · sens1_i + β_2(size_i) · sens2_i + a(cons_i)
      + b_1(cons_i) · sens1_i + b_2(cons_i) · sens2_i + d(prod_i) + ε_i    (9-8)

where

a(k) ~ N(0, σ_a²),  b_1(k) ~ N(0, σ_b1²),  b_2(k) ~ N(0, σ_b2²),  k = 1, ..., 103    (9-9)

and

d(prod_i) ~ N(0, σ_P²),  ε_i ~ N(0, σ²)    (9-10)

To finish the specification of a general random coefficient model, we need the assumption of the possibility of correlations between the random coefficients:

(a(k), b_1(k), b_2(k)) ~ N(0, Σ),   Σ = [ σ_a²    σ_ab1    σ_ab2
                                           σ_ab1   σ_b1²    σ_b1b2
                                           σ_ab2   σ_b1b2   σ_b2² ]    (9-11)

Before studying the fixed effects, the variance part of the model is investigated further. We give details in the R-TUTORIAL section on how we end up simplifying this 8-parameter variance model down to the 5-parameter variance model, where the σ_b1² parameter and the two related correlations can be tested non-significant. The model therefore reduces to:

y_i = α(size_i) + β_1(size_i) · sens1_i + β_2(size_i) · sens2_i + a(cons_i)
      + b_2(cons_i) · sens2_i + d(prod_i) + ε_i    (9-12)

where

(a(k), b_2(k)) ~ N(0, [ σ_a²   σ_ab2 ; σ_ab2   σ_b2² ]),  k = 1, ..., 103    (9-13)

and

d(prod_i) ~ N(0, σ_P²),  ε_i ~ N(0, σ²)    (9-14)

With this variance structure, we investigate the fixed effects. Successively removing insignificant terms, we find that the following final model is an appropriate description of the data:

y_i = α(size_i) + β_2 · sens2_i + a(cons_i) + b_2(cons_i) · sens2_i + d(prod_i) + ε_i    (9-15)

where

(a(k), b_2(k)) ~ N(0, [ σ_a²   σ_ab2 ; σ_ab2   σ_b2² ]),  k = 1, ..., 103    (9-16)

and

d(prod_i) ~ N(0, σ_P²),  ε_i ~ N(0, σ²)    (9-17)

Estimates of the variance parameters are given in the following table:

ˆσ_a²     0.444²
ˆσ_b2²    0.0545²
ˆρ_ab2    0.178
ˆσ_P²     0.1775²
ˆσ²       1.0194²

The conclusions regarding the relation between the preference and the sensory variables are that no significant relation was found to sens1, but indeed so for sens2. The relation does not depend on the homesize and is estimated (with 95% confidence interval) as:

ˆβ_2 = 0.071,  [0.033, 0.107]

So two products with a difference of 10 in the 2nd sensory dimension (this is the span in the data set) are expected to differ in average preference by between 0.33 and 1.07. Sweet products are preferred to non-sweet products, cf. Figure 9.2 above. The expected values for the two homesizes (for an average product) and their difference are estimated at:

ˆα(1) + ˆβ_2 · (average sens2) = 4.91,  [4.73, 5.09]
ˆα(2) + ˆβ_2 · (average sens2) = 4.66,  [4.47, 4.85]
ˆα(1) − ˆα(2) = 0.25,  [0.04, 0.46]

So homes with more persons tend to have a slightly lower preference in general for such carrot products.

9.4 Random coefficient models in perspective

Although the factor structure diagrams, with all the features of finding expected mean squares and degrees of freedom, are only strictly valid for balanced designs and models with no quantitative covariates, they may still be useful as a more informal structure visualization tool for these non-standard situations.

The setting with hierarchical regression data is really an example of what could also be characterized as repeated measures data. A common situation is that repeated measurements on a subject (animal, plant, sample) are taken over time; the data are then also known as longitudinal data. So apart from appearing as natural extensions of fixed regression models, the random coefficient models are one option for analyzing repeated measures data. The simple models can be extended to polynomial models to cope with non-linear structures in the data. Also, additional residual correlation structures can be incorporated. In enotes 11 and 12 a thorough treatment of repeated measures data is given, with a number of different methods, simple as well as more complex approaches.

9.5 R-TUTORIAL: Constructed data

The constructed data are available in the file randcoef.txt.

The simple linear regression analyses of the two responses y1 and y2 in the data set randcoef are obtained using lm:

randcoef <- read.table("randcoef.txt", sep=",", header=TRUE)
randcoef$subject <- factor(randcoef$subject)   # subject must be a factor
model1y1 <- lm(y1 ~ x, data = randcoef)
model1y2 <- lm(y2 ~ x, data = randcoef)

The parameter estimates with corresponding standard errors in the two models are:

coef(summary(model1y1))

              Estimate Std. Error   t value     Pr(>|t|)
(Intercept) 10.7279125  0.8638263 12.419063 7.781938e-22
x            0.9046114  0.1392182  6.497795 3.407867e-09

coef(summary(model1y2))

            Estimate Std. Error  t value     Pr(>|t|)
(Intercept) 7.835596  0.9786407 8.006611 2.455879e-12
x           1.215190  0.1577222 7.704623 1.076728e-11

The raw scatter plots for the data with superimposed regression lines are obtained using the plot and abline functions:

par(mfrow=c(1, 2))
with(randcoef, {
  plot(x, y1, las=1)
  abline(model1y1)
  plot(x, y2, las=1)
  abline(model1y2)
})

[Scatter plots of y1 and y2 against x with the fitted regression lines, cf. Figure 9.1 (left).]

par(mfrow=c(1, 1))

The individual patterns in the data can be seen from the next plot:

par(mfrow=c(1, 2))
with(randcoef, {
  plot(x, y1, las=1)
  for (i in 1:10) lines(x[subject==i], y1[subject==i], lty=i)
  plot(x, y2, las=1)
  for (i in 1:10) lines(x[subject==i], y2[subject==i], lty=i)
})

[Scatter plots of y1 and y2 against x with the individual patterns drawn as connected line segments, cf. Figure 9.1 (middle).]

par(mfrow=c(1, 1))

The function lines connects points with line segments. Notice how the repetitive plotting is solved using a for loop: for each i between 1 and 10 the relevant subset of the data is plotted with a line type that changes as the subject changes. Alternatively we could have written 10 separate calls to lines for each response.

The fixed effects analysis with the two resulting (Type III) ANOVA tables is:

model2y1 <- lm(y1 ~ x*subject, data = randcoef)
model2y2 <- lm(y2 ~ x*subject, data = randcoef)

library(car)  # for Anova
Anova(model2y1, type=3)

enote 9 9.5 R-TUTORIAL: CONSTRUCTED DATA 18 Sum Sq Df F value Pr(>F) (Intercept) 225.781 1 3083.80 < 2.2e-16 *** x 33.225 1 453.80 < 2.2e-16 *** subject 313.958 9 476.46 < 2.2e-16 *** x:subject 182.995 9 277.71 < 2.2e-16 *** Residuals 5.857 80 --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05. 0.1 1 Anova(model2y2, type=3) Anova Table (Type III tests) Response: y2 Sum Sq Df F value Pr(>F) (Intercept) 116.04 1 6.7375 0.01123 * x 69.36 1 4.0274 0.04814 * subject 165.21 9 1.0658 0.39700 x:subject 158.18 9 1.0205 0.43107 Residuals 1377.79 80 --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05. 0.1 1 A plot of the data with individual regression lines based on model2y1 and model2y2 is again produced using a for loop. First we fit the two models in a different parameterisation (to obtain the estimates in a convenient form of one intercept and one slope per subject) model3y1 <- lm(y1 ~ subject - 1 + x:subject, data = randcoef) model3y2 <- lm(y2 ~ subject - 1 + x:subject, data = randcoef) The plots are produced using:

par(mfrow=c(1, 2))
with(randcoef, {
  plot(x, y1, las=1)
  for (i in 1:10) abline(coef(model3y1)[c(i, i+10)], lty=i)
  plot(x, y2, las=1)
  for (i in 1:10) abline(coef(model3y2)[c(i, i+10)], lty=i)
})

[Scatter plots of y1 and y2 against x with the 10 individual fitted regression lines, cf. Figure 9.1 (right).]

par(mfrow=c(1, 1))

Explanation: remember that coef extracts the parameter estimates. Here the first 10 estimates are the intercept estimates and the next 10 are the slope estimates. Thus the component pairs (1, 11), (2, 12), ..., (10, 20) belong to subjects 1, 2, ..., 10, respectively. This is exploited in the for loop in the part [c(i, i+10)], which produces these pairs as i runs from 1 to 10.

The equal slopes model for the second data set with parameter estimates is:

model4y2 <- lm(y2 ~ subject + x, data = randcoef)
coef(summary(model4y2))

              Estimate Std. Error    t value     Pr(>|t|)
(Intercept)  5.7182184   1.535779  3.7233342 3.442800e-04
subject2     4.6966550   1.857858  2.5279952 1.323547e-02
subject3    -0.0298946   1.857858 -0.0160909 9.871979e-01
subject4     3.0224610   1.857858  1.6268529 1.073039e-01
subject5     3.1704193   1.857858  1.7064921 9.140364e-02
subject6     2.7096116   1.857858  1.4584604 1.482338e-01
subject7     0.3073916   1.857858  0.1654549 8.689612e-01
subject8     1.9357258   1.857858  1.0419129 3.002741e-01
subject9     6.2554680   1.857858  3.3670331 1.123757e-03
subject10   -0.8940619   1.857858 -0.4812327 6.315321e-01
x            1.2151903   0.144634  8.4018292 6.476536e-13

The summary of the two-step analysis can be obtained by applying the functions mean and sd (computing the empirical mean and standard deviation of a vector, respectively) to the vector of intercept estimates and to the vector of slope estimates (from the different slopes models). Here it is shown for data set 1; it is done similarly for data set 2:

# averages, standard errors and 95% confidence limits for intercept and slope
ainty1  <- mean(coef(model3y1)[1:10])
sdinty1 <- sd(coef(model3y1)[1:10])/sqrt(10)
uinty1  <- ainty1 + 2.26 * sdinty1
linty1  <- ainty1 - 2.26 * sdinty1
asloy1  <- mean(coef(model3y1)[11:20])
sdsloy1 <- sd(coef(model3y1)[11:20])/sqrt(10)
usloy1  <- asloy1 + 2.26 * sdsloy1
lsloy1  <- asloy1 - 2.26 * sdsloy1

The correlation between the intercepts and the slopes in each of the two data sets is computed using cor:

cor(coef(model3y1)[1:10], coef(model3y1)[11:20])

[1] -0.3822264

cor(coef(model3y2)[1:10], coef(model3y2)[11:20])

[1] -0.6547748

The random coefficients analysis is done with lmer. The different slopes random coefficient model is:

library(lmerTest)
model5y1 <- lmer(y1 ~ x + (1 + x | subject), data = randcoef)
model5y2 <- lmer(y2 ~ x + (1 + x | subject), data = randcoef)

The random part of the model specification, (1 + x | subject), specifies that the regression model 1 + x, i.e. an intercept and a slope for x, should be allowed for each subject. This corresponds to the random part in formula (9-2). The (fixed effects) parameter estimates and their standard errors are obtained from the model summary:

coef(summary(model5y1))

              Estimate Std. Error       df  t value     Pr(>|t|)
(Intercept) 10.7279125  1.2759028 8.997104 8.408095 1.487256e-05
x            0.9046114  0.1569897 9.000192 5.762235 2.720516e-04

coef(summary(model5y2))

            Estimate Std. Error       df  t value     Pr(>|t|)
(Intercept) 7.835596  0.9563569 22.71909 8.193172 3.115731e-08
x           1.215190  0.1511383 27.93575 8.040256 9.517424e-09
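As a side note (not part of the original analysis), the subject-specific intercepts and slopes implied by the random coefficient model, i.e. the BLUP-based analogues of the 10 individual regression lines above, can be extracted with the standard lme4 accessors:

coef(model5y1)$subject    # fixed effects plus predicted random effects, per subject
ranef(model5y1)$subject   # the predicted random deviations a(k) and b(k) themselves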

enote 9 9.5 R-TUTORIAL: CONSTRUCTED DATA 22 subject (Intercept) 4.03052 x 0.49555-0.381 Residual 0.27058 VarCorr(model5y2) Groups Name Std.Dev. Corr subject (Intercept) 1.08582 x 0.14659 1.000 Residual 4.13189 The equal slopes models within the random coefficient framework are specified as model6y1 <- lmer(y1 ~ x + (1 subject), data = randcoef) model6y2 <- lmer(y2 ~ x + (1 subject), data = randcoef) Likelihood ratio tests for reduction from different slopes to equal slopes can be obtained using anova with two lmer objects as arguments (the first argument (model) is less general than the second argument (model)): anova(model6y1, model5y1, refit=false) Data: randcoef Models: object: y1 ~ x + (1 subject)..1: y1 ~ x + (1 + x subject) Df AIC BIC loglik deviance Chisq Chi Df Pr(>Chisq) object 4 409.67 420.09-200.836 401.67..1 6 163.81 179.44-75.903 151.81 249.87 2 < 2.2e-16 *** --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05. 0.1 1 anova(model6y2, model5y2, refit=false) Data: randcoef Models:

enote 9 9.5 R-TUTORIAL: CONSTRUCTED DATA 23 object: y2 ~ x + (1 subject)..1: y2 ~ x + (1 + x subject) Df AIC BIC loglik deviance Chisq Chi Df Pr(>Chisq) object 4 586.63 597.05-289.31 578.63..1 6 589.97 605.61-288.99 577.97 0.6549 2 0.7208 Confidence intervals for the final model for y2 are: pr <- profile(model6y2, which=1:2, signames=false) confint(pr) 2.5 % 97.5 % sd_(intercept) subject 0.643088 3.399202 sigma 3.593941 4.816837 For y1, the profile function for the current version of the lme4 package (version 1.1.13) generates numerous warning messages and ends up not converging so we do not show any results: pr <- profile(model6y1, which=1:2, signames=false) confint(pr) An alternative is to compute simulation based bootstrap confidence intervals: ci <- confint(model5y1, parm=1:4, method="boot", nsim=1000, oldnames=false) Computing bootstrap confidence intervals... ci 2.5 % 97.5 % sd_(intercept) subject 2.1644231 5.9609896 cor_x.(intercept) subject -0.8132804 0.2720441 sd_x subject 0.2670859 0.7390522 sigma 0.2265141 0.3134644

Here we use nsim=1000 simulations or bootstrap samples, but in practice we should use 10 or 100 times as many for a reasonable accuracy. Note also that the confidence limits will vary from run to run as different random numbers are simulated.

The (fixed effects) parameter estimates for the final model for data set 2 are:

coef(summary(model6y2))

            Estimate Std. Error       df  t value     Pr(>|t|)
(Intercept) 7.835596   1.077441 37.97818 7.272413 1.059214e-08
x           1.215190   0.144634 89.00002 8.401829 6.477041e-13

9.6 R-TUTORIAL: Consumer preference mapping of carrots

Data are available in the file carrots.txt:

carrots <- read.table("carrots.txt", header = TRUE, sep = ",")
carrots <- within(carrots, {
  Homesize <- factor(Homesize)
  Consumer <- factor(Consumer)
  product  <- factor(product)
})

Recall that the most general model, (9-8) to (9-11), states that for each level of Consumer the random intercept and the random slopes of sens1 and sens2 are correlated in an arbitrary way (the specification in (9-11)). This model can be specified as follows:

model1 <- lmer(Preference ~ Homesize + sens1 + sens2 + Homesize * sens1 +
                 Homesize * sens2 + (1 | product) + (1 + sens1 + sens2 | Consumer),
               data=carrots)
print(summary(model1), corr=FALSE)

Linear mixed model fit by REML
t-tests use Satterthwaite approximations to degrees of freedom [lmerMod]
Formula: Preference ~ Homesize + sens1 + sens2 + Homesize * sens1 + Homesize *
    sens2 + (1 | product) + (1 + sens1 + sens2 | Consumer)

enote 9 9.6 R-TUTORIAL: CONSUMER PREFERENCE MAPPING OF CARROTS 25 Data: carrots REML criterion at convergence: 3747.9 Scaled residuals: Min 1Q Median 3Q Max -3.6998-0.5493 0.0242 0.6099 2.8641 Random effects: Groups Name Variance Std.Dev. Corr Consumer (Intercept) 0.1977646 0.44471 sens1 0.0002897 0.01702-0.20 sens2 0.0030504 0.05523 0.17 0.93 product (Intercept) 0.0336097 0.18333 Residual 1.0336345 1.01668 Number of obs: 1233, groups: Consumer, 103; product, 12 Fixed effects: Estimate Std. Error df t value Pr(> t ) (Intercept) 4.906679 0.088240 35.300000 55.606 < 2e-16 *** Homesize3-0.240826 0.105645 101.000000-2.280 0.02474 * sens1 0.013543 0.016446 13.100000 0.823 0.42501 sens2 0.061916 0.019307 16.600000 3.207 0.00528 ** Homesize3:sens1-0.006113 0.014822 316.500000-0.412 0.68032 Homesize3:sens2 0.019545 0.019246 103.100000 1.016 0.31224 --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05. 0.1 1 The random part deserves some explanation. The structure (9.11) amounts to the term (1 + sens1 + sens2 Consumer), where for each consumer we fit an intercept and two slopes, one for each of sens1 and sens2. Further, these three terms are allowed to be arbitrarily correlated. In addition there is the random effect for product. There are two relevant sub-models to to consider in order to assess or simplify the random-effects structure of the model. The first sub-model would reduce the general random-effects structure for Consumer from (1 + sens1 + sens2 Consumer) to (1 + sens1 Consumer); the other sub-model would reduce it to (1 + sens2 Consumer). We can assess each of these with likelihood ratio tests as follows here exemplified for

enote 9 9.6 R-TUTORIAL: CONSUMER PREFERENCE MAPPING OF CARROTS 26 the first sub-model: model2 <- lmer(preference ~ Homesize + sens1 + sens2 + Homesize * sens1 + Homesize * sens2 + (1 product) + (1 + sens1 Consumer), data=carrots) anova(model1, model2, refit=false) Data: carrots Models:..1: Preference ~ Homesize + sens1 + sens2 + Homesize * sens1 + Homesize *..1: sens2 + (1 product) + (1 + sens1 Consumer) object: Preference ~ Homesize + sens1 + sens2 + Homesize * sens1 + Homesize * object: sens2 + (1 product) + (1 + sens1 + sens2 Consumer) Df AIC BIC loglik deviance Chisq Chi Df Pr(>Chisq)..1 11 3779.9 3836.2-1878.9 3757.9 object 14 3775.9 3847.5-1874.0 3747.9 9.9728 3 0.0188 * --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05. 0.1 1 This test i significant, so leaving out sens2 is not warrented. Note that this test is on 3 degrees of freedom: the variance for sens2 and the two covariances with the intercept and sens1. The rand function in the lmertest package automates the likelihood ratio tests of randomeffects terms and provides an ANOVA-like summary table: rand(model1) Analysis of Random effects Table: Chi.sq Chi.DF p.value product 16.34 1 5e-05 *** sens1:consumer 2.08 3 0.56 sens2:consumer 9.97 3 0.02 * --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05. 0.1 1 This shows that the sens1 random-effect for consumer is not significant and we can simplify the model.

We can now fit the reduced model and check that the random-effect structure cannot be simplified any further:

model3 <- lmer(Preference ~ Homesize + sens1 + sens2 + Homesize * sens1 +
                 Homesize * sens2 + (1 | product) + (1 + sens2 | Consumer),
               data=carrots)
rand(model3)

Analysis of Random effects Table:
               Chi.sq Chi.DF p.value
product         16.18      1   6e-05 ***
sens2:Consumer   8.05      2    0.02 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Note that it is possible to fit a model where we enforce independence of the random intercepts and slopes for Consumer, i.e. we fix the correlation between these terms to zero, as the following code illustrates. We do not show the results of this model, and we warn against fitting and interpreting such models; the reason is given in the following remark.

lmer(Preference ~ Homesize + sens1 + sens2 + Homesize * sens1 +
       Homesize * sens2 + (1 | product) + (1 | Consumer) + (-1 + sens2 | Consumer),
     data=carrots)

Remark 9.1  Random coefficient correlations

Correlations between random intercepts and slopes should always be retained in the model. The reason is that the model is only invariant to a shift in origin of the covariate if the correlation between the random intercept and slope is included in the model and estimated from the data. For example, if the covariate is temperature and we omit the correlation, then we would obtain different models depending on whether temperature was measured in, say, Kelvin, Celsius or Fahrenheit. Since we want our models to be invariant to arbitrary aspects such as the unit of measurement, we need to include correlations between random intercepts and slopes in our random-coefficient models. It also means that a likelihood ratio test of the correlation parameter is usually not meaningful, since the size of the test statistic, and consequently also the size of the p-value, depends on shifts in the origin of the covariate. In conclusion, the correlation parameters are necessary for the models to make sense, and we should not attempt to fix them at zero or test their significance.

After having reduced the covariance structure in the model, we turn our attention to the mean structure, i.e. the fixed effects. After successively removing insignificant terms, we find that the following model is an appropriate description of the data:

model4 <- lmer(Preference ~ Homesize + sens2 + (1 | product) + (1 + sens2 | Consumer),
               data=carrots)
anova(model4)

Analysis of Variance Table of type III with Satterthwaite
approximation for degrees of freedom
          Sum Sq Mean Sq NumDF   DenDF F.value   Pr(>F)
Homesize  5.8511  5.8511     1 100.973  5.6305 0.019544 *
sens2    18.1646 18.1646     1  12.192 17.4797 0.001232 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Estimates of the variance parameters are obtained with:

enote 9 9.6 R-TUTORIAL: CONSUMER PREFERENCE MAPPING OF CARROTS 29 VarCorr(model4) Groups Name Std.Dev. Corr Consumer (Intercept) 0.444137 sens2 0.054509 0.178 product (Intercept) 0.177492 Residual 1.019402 and their confidence intervals with: confint(model4, parm=1:5, oldnames=false) Computing profile confidence intervals... 2.5 % 97.5 % sd_(intercept) Consumer 0.35780847 0.53480490 cor_sens2.(intercept) Consumer -0.23944383 0.61053240 sd_sens2 Consumer 0.02636724 0.07798007 sd_(intercept) product 0.08707584 0.28523178 sigma 0.97683766 1.06559573 LS-means and the difference of these for Homesize are obtained with (lms_size <- lsmeans::lsmeans(model4, "Homesize")) Homesize lsmean SE df lower.cl upper.cl 1 4.910970 0.08714916 39.88 4.734819 5.087121 3 4.661173 0.09368247 48.89 4.472900 4.849446 Degrees-of-freedom method: satterthwaite Confidence level used: 0.95 confint(pairs(lms_size)) contrast estimate SE df lower.cl upper.cl 1-3 0.249797 0.1052726 100.97 0.04096381 0.4586303 Confidence level used: 0.95

and the confidence interval for the slope of sens2 can be extracted with:

lstrends(model4, specs="1", var="sens2")

 1       sens2.trend         SE    df   lower.CL  upper.CL
 overall   0.0706383 0.01689558 12.19 0.03389019 0.1073864

Results are averaged over the levels of: Homesize
Degrees-of-freedom method: satterthwaite
Confidence level used: 0.95

9.7 Exercises