eNote 3: Case study


Contents

3 Case study
3.1 Introduction
3.2 Initial explorative analysis
3.3 Test of overall effects/model reduction
3.4 Post hoc analysis and summarizing the results
  Estimates of the variance parameters
  Estimates of the fixed parameters
  Comparisons of the fixed parameters
3.5 R-TUTORIAL: Creating report ready tables and figures
  Plot devices
  Plotting with colours
  Report ready tables with xtable
3.6 R-TUTORIAL: Initial explorative analysis
3.7 Test of overall effects/model reduction
3.8 R-TUTORIAL: Post hoc analysis and summarizing the results
3.9 Exercises

3.1 Introduction

This module consists of the first part of a complete analysis of the beech wood data presented as an example in Module 2. The aim is to show that the principles for data analysis and result summary for fixed ANOVA and/or regression models also apply to mixed models. Some readers may also find it helpful to have these principles reviewed. For completeness we repeat here the description and the initial factor structure considerations.

To investigate the effect of drying of beech wood on the humidity percentage, the following experiment was conducted. Each of 20 planks was dried for a certain period of time. Then the humidity percentage was measured at 5 depths and 3 widths for each plank:

depth 1: close to the top
depth 3: between 1 and 5
depth 5: in the center
depth 7: between 5 and 9
depth 9: close to the bottom

width 1: close to the side
width 2: between 1 and 3
width 3: in the center

So there are 3 × 5 = 15 measurements for each plank and altogether 300 observations. The data can be found in planks.txt and is reproduced in the following table.
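The balanced layout described above can be checked with a few lines of R. This is a minimal sketch that rebuilds only the design (no humidity values) with `expand.grid`; the object name `design` is a hypothetical choice and the code does not read planks.txt.

```r
# Hypothetical reconstruction of the balanced layout (design only, no responses).
design <- expand.grid(
  depth = factor(c(1, 3, 5, 7, 9)),  # five depths
  width = factor(1:3),               # three widths
  plank = factor(1:20)               # twenty planks
)

n_obs     <- nrow(design)                       # total number of observations
per_plank <- table(design$plank)                # measurements per plank
per_cell  <- table(design$width, design$depth)  # measurements per width-depth cell
n_cells   <- nlevels(interaction(design$width, design$depth))

n_obs
per_cell
```

Every plank contributes one measurement to each of the width-depth cells, which is what makes the design balanced.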

[Table: humidity percentages for the 20 planks (rows) at each depth within width 1, width 2 and width 3 (columns). The numerical values are not preserved in this transcription.]

In this experiment we have 3 factors apart from the trivial factors I and 0. Let us use the factor names plank, width and depth. The factor plank has 20 levels, width has 3 levels and depth has 5 levels. For the $i$th measurement of humidity, $\text{plank}_i$ denotes the plank on which the measurement was performed. Correspondingly, $\text{width}_i$ and $\text{depth}_i$ denote the width and depth, respectively, of the $i$th measurement.

It is natural to include the interaction between width and depth, corresponding to the product factor width × depth, which in this case has 15 levels. A natural model therefore includes plank as a block factor, while depth and width enter together with their interaction. If $Y_i$ denotes the humidity percentage of the $i$th measurement, the model with fixed block effect can be written as

$$Y_i = \mu + \alpha(\text{width}_i) + \beta(\text{depth}_i) + \gamma(\text{width}_i, \text{depth}_i) + \delta(\text{plank}_i) + \varepsilon_i, \qquad (3\text{-}1)$$

where $i = 1, \ldots, 300$ and the $\varepsilon_i$ are independent and normally distributed random variables. Or, equivalently,

$$Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \delta_k + \varepsilon_{ijk},$$

where $Y_{ijk}$ is the $k$th measurement within the $(i, j)$th combination of the two factors, $i = 1, \ldots, 3$, $j = 1, \ldots, 5$ and $k = 1, \ldots, 20$.

As pointed out in Module 1, the block (plank) effect should be considered a random effect, leading to the mixed model

$$Y_i = \mu + \alpha(\text{width}_i) + \beta(\text{depth}_i) + \gamma(\text{width}_i, \text{depth}_i) + d(\text{plank}_i) + \varepsilon_i, \qquad (3\text{-}2)$$

where $d(\text{plank}_i) \sim N(0, \sigma^2_{\text{plank}})$ and $\varepsilon_i \sim N(0, \sigma^2)$. This model corresponds to the factor structure diagram given in figure 3.1.

[Figure 3.1: The factor structure diagram]

3.2 Initial explorative analysis

Having realized the complete structure of the data, it is time to do initial plotting/explorative analysis. Throughout this module, figures and results are presented without showing R code or raw R output. This can be seen as a standard for reports in the course! Typically, numerous figures that do not enter the final project report should be studied, since this phase is explorative, and the final figures presenting the key results are chosen after the statistical analysis is completed.

[Figure 3.2: Four average humidity profiles. Each panel shows the mean of humidity; the x-axis is width in the left panels and depth in the right panels.]

Plotting various average profiles is usually a helpful tool for data with several factors. In figure 3.2 four of these are presented. In the top left diagram the width humidity patterns for each plank are depicted by plotting the average humidity (the average of the five depths for each width and plank) against the widths. It is immediately clear that there is extensive plank-to-plank variation in the level of humidity. The message about the width effect is less clear. The top right diagram shows the similar plot for the depth effect. Here the message is much clearer: the humidity is high in the center (depth=5) and low at the top (depth=1) and at the bottom (depth=9).

As pointed out, this is the effect seen when the three widths are averaged. It could be that the depth effect is different for widths close to the side of the plank (width=1) than for widths in the center (width=3). In other words, there could be a depth*width interaction effect that we would not find in the plots above. Instead, similar plots are given in the bottom diagrams of figure 3.2 for the widths and depths by averaging over the planks (that is, plotting the 15 average values). The depth structure already seen is recognized. Also, it is seen that there is a clear shift in humidity level from width to width and that the depth humidity pattern seems to be roughly the same for the three widths. However, there are some deviations from parallel patterns, and the uncertainties in these deviations are not visible. A similar increasing-decreasing width pattern, which was not clearly visible in the top diagram, is now seen. This pattern seems to be roughly the same for all depths (with the same precautions as before), and the low humidity levels for the top and bottom depths are clearly seen. Note again that the two bottom plots contain the same information: had there been clearly non-parallel patterns in one figure (an interaction effect), this would also appear in the other figure.

The next step is to start the actual statistical analysis of the data.

3.3 Test of overall effects/model reduction

A statistical analysis of this kind is commonly carried out in several steps, starting with the basic model found from the factor structure considerations. This model usually contains every possible effect there may be in the data. However, it is of interest to simplify things into easily interpretable results, if possible! So the idea is to remove non-significant complex terms from the model before summarizing the results. Carrying out the mixed model analysis corresponding to the model given by (3-2) gives the following ANOVA table of fixed effects:

[Table: ANOVA table of fixed effects with columns Source of variation, Numerator degrees of freedom, Denominator degrees of freedom, F-statistic and P-value, and rows depth, width and depth*width. Most numerical entries are not preserved in this transcription; the P-values for depth and width are reported with a "<" sign.]

We see that the depth*width interaction effect is non-significant. Hence, we remove the interaction term and base the analysis on the model

$$Y_i = \mu + \alpha(\text{width}_i) + \beta(\text{depth}_i) + d(\text{plank}_i) + \varepsilon_i, \qquad (3\text{-}3)$$

where $d(\text{plank}_i) \sim N(0, \sigma^2_{\text{plank}})$ and $\varepsilon_i \sim N(0, \sigma^2)$. This model is illustrated by the factor structure diagram in figure 3.3. Note how the 8 degrees of freedom from the interaction effect have now been added to the error degrees of freedom. The table of fixed effects then becomes:

[Figure 3.3: The factor structure diagram]

[Table: ANOVA table of fixed effects with columns Source of variation, Numerator degrees of freedom, Denominator degrees of freedom, F-statistic and P-value, and rows depth and width. Most numerical entries are not preserved in this transcription; both P-values are reported with a "<" sign.]

Note that the removal of the non-significant interaction effect has only minor effects on the conclusions regarding the depth and width effects: they are both extremely significant, confirming what we explored above. Since there are no more non-significant fixed effects, the model given by (3-3) is the final model to use for summarizing the results.

3.4 Post hoc analysis and summarizing the results

Estimates of the variance parameters

The final model is given by (3-3), since the main effects of both width and depth are clearly significant. Estimates of the two variance parameters, $\hat\sigma^2_{\text{plank}}$ and $\hat\sigma^2$, are computed from this model (the numerical values are not preserved in this transcription). Uncertainties of these estimates are given by:

[Table: 2.5 % and 97.5 % confidence limits for the rows .sig01 (plank) and .sigma (residual). The numerical values are not preserved in this transcription.]

The remaining part of this subsection on post hoc analysis and presentation of results illustrates how the information in a factor can be summarized whenever the factor does not interact with any other factor.

Estimates of the fixed parameters

Estimates of the expected values (LSMEANS) for each level of depth, together with their uncertainties and 95% confidence intervals, are:

[Table: Estimate, SE, Lower and Upper for Depth 1, Depth 3, Depth 5, Depth 7 and Depth 9. The numerical values are not preserved in this transcription.]

and correspondingly for each level of width:

[Table: Estimate, SE, Lower and Upper for Width 1, Width 2 and Width 3. The numerical values are not preserved in this transcription.]

Comparisons of the fixed parameters

A commonly used post hoc analysis is to compare either specific pairs of depths (resp. widths) or all combinations within each factor. For the former, a standard t-test can be used, e.g.

$$t = \frac{\hat\beta(1) - \hat\beta(2)}{SE\left(\hat\beta(1) - \hat\beta(2)\right)}$$

using the error degrees of freedom (274). Or, equivalently, expressed by a 95% confidence interval:

$$\hat\beta(1) - \hat\beta(2) \pm t_{.975,274}\, SE\left(\hat\beta(1) - \hat\beta(2)\right).$$

In this case, the estimates of the fixed effects are raw averages of the data based on the same number of observations for each level (60 per depth level), so the standard error of the difference between two depth levels is given by

$$SE\left(\hat\beta(1) - \hat\beta(2)\right) = \sqrt{2\hat\sigma^2/60}.$$

This means that two depth levels are claimed significantly different if they differ by more than

$$t_{.975,274}\sqrt{2\hat\sigma^2/60}$$

from each other. This is also called the 95% Least Significant Difference (LSD) value.

It would be tempting to do such tests for all combinations of levels within each factor. This is generally NOT an acceptable approach, since the probability of significance-by-chance becomes too large when many tests are performed simultaneously. This is called the multiplicity problem. With five depth levels there are 5 · 4/2 = 10 possible depth pairs to compare. Comparing two specific levels (decided before seeing the data) is not the same as comparing the smallest among five with the largest among five. In a case with no effects one would always expect the latter two to be more different by chance than the former.

There are numerous solutions to properly handle this problem, if all comparisons are indeed made. All of them amount to requiring differences to be larger than required by the usual t-test in order to be claimed significant. One general idea, which can be used whenever numerous tests are performed simultaneously, is the Bonferroni correction: if $k$ tests are performed simultaneously, then use level $\alpha/k$ in each test rather than $\alpha$. For instance, if all depth levels are compared, standard pairwise t-test output can be used, but employing level 0.5% in each test rather than 5%, that is, only claiming those differences significant for which the usual P-value is less than 0.005. This method is known to be somewhat conservative, meaning that it may be too critical; in other words, it may miss some actual differences.

Another solution is to use a distribution other than the t-distribution when comparisons are made. With the so-called Tukey-Kramer method two depth levels would be claimed significantly different if they differ by more than

$$\nu_{.975,J,274}\sqrt{\hat\sigma^2/60}$$

from each other, where $J$ is the number of groups to be compared and $\nu_{.975,J,274}$ is the critical value of the so-called studentized range distribution with $J$ groups and 274 degrees of freedom. The index .975 is chosen to match the t-quantile notation; in the usual parametrization of the studentized range distribution this critical value is the 95% quantile. This distribution takes into account that the two levels compared in a single test come from $J$ groups altogether. Just like the t-distribution, it is tabulated or available in the computer. Note that if $J = 2$, the studentized range distribution corresponds to the t-distribution:

$$\nu_{.975,2,274} = t_{.975,274}\sqrt{2}.$$

The Tukey-adjusted results are:

[Table: Tukey-adjusted depth comparisons with columns Depth difference, Parameter, Estimate, SE, Lower, Upper and P-value. The ten rows are the depth differences 1-3, 1-5, 1-7, 1-9, 3-5, 3-7, 3-9, 5-7, 5-9 and 7-9, with parameters $\beta(1)-\beta(2)$, $\beta(1)-\beta(3)$, $\beta(1)-\beta(4)$, $\beta(1)-\beta(5)$, $\beta(2)-\beta(3)$, $\beta(2)-\beta(4)$, $\beta(2)-\beta(5)$, $\beta(3)-\beta(4)$, $\beta(3)-\beta(5)$ and $\beta(4)-\beta(5)$. The numerical values are not preserved in this transcription; the P-values for $\beta(1)-\beta(2)$, $\beta(1)-\beta(3)$, $\beta(1)-\beta(4)$, $\beta(2)-\beta(5)$, $\beta(3)-\beta(5)$ and $\beta(4)-\beta(5)$ are reported with a "<" sign.]

Note that since the P-values are corrected, that is, based on the more appropriate studentized range distribution, they can be used directly without any additional Bonferroni correction. Similarly for the width effect:

[Table: Tukey-adjusted width comparisons with columns Width difference, Parameter, Estimate, SE, Lower, Upper and P-value, and rows 1-2 ($\alpha(1)-\alpha(2)$), 1-3 ($\alpha(1)-\alpha(3)$) and 2-3 ($\alpha(2)-\alpha(3)$). The numerical values are not preserved in this transcription; the P-values for $\alpha(1)-\alpha(3)$ and $\alpha(2)-\alpha(3)$ are reported with a "<" sign.]

Frequently, the key information of the two tables for each effect is summarized in a single table in which the lsmeans are ordered by size (the estimates themselves are not preserved in this transcription, only the letters):

            Depth 9   Depth 1   Depth 7   Depth 3   Depth 5
Letters     a         a         b         bc        c

The letter subscripts express the 5% significance results of the 10 pairwise comparisons:

Two depths sharing a letter are NOT significantly different.
Two depths NOT sharing a letter are significantly different.

So the pattern already observed in figure 3.2 can now be statistically confirmed: the humidity is clearly lower close to the top and the bottom (with no difference between top and bottom). The center position (depth 5) has the highest estimated humidity; it differs significantly from depth 7 but not from depth 3, and the two in-between positions (depths 3 and 7) do not differ significantly from each other. For the width effect, the summary table becomes particularly simple, since all three differences are significant:

            Width 3   Width 1   Width 2
Letters     a         b         c

For these data, a figure of the raw data, like one of the bottom plots of figure 3.2, together with a statement of the lack of significant width*depth interaction and the two summary tables would probably suffice for most purposes. In later modules we will see how additional plots of the model expectations/details will provide informative figures for interpretation.

Other types of post hoc analysis (than the multiple comparison approach) may be employed, especially when quantitative information about the factor levels is available. In this case we know exactly the positions that correspond to the different widths and depths, and this could be used in the analysis. For instance, it could be investigated whether a quadratic function of the depths could be used to describe the humidity pattern. Apart from the nice direct functional interpretation of the dependence of humidity on depth, it could possibly provide more powerful tests for interaction effects. In fact this would still be a linear model and could be handled by lmer. We will return to such analyses in a later module. Non-linear models (using e.g. exponentials) could also be an option in some cases, but then the model would no longer be a linear model, and additional theory and packages would be needed.

The summary approach above was based on the assumption of no interaction between width and depth, that is, the conclusions regarding widths hold for all the depths, and vice versa. Had there been a significant interaction, we would have to present, say, the depth effects for each of the three widths (and/or vice versa), since the significance tells us that these three conclusions will NOT be the same. In practice, we proceed as above, BUT for the combined width*depth factor with 15 levels rather than for each of them separately. We will see examples of this later.

One important step in the analysis given is missing: an investigation of the validity of the model assumptions! We return to this issue in Module 6, where we then finish the analysis of this data set on the humidity of beech wood planks.
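The idea of describing the depth effect by a quadratic function, mentioned above, can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated (the data frame `sim` and all parameter values are hypothetical, not the planks data), and plank is treated as a fixed block in `lm` so that the example stays within base R. In the mixed-model version the block term would be replaced by a random plank effect in `lmer`.

```r
set.seed(1)

# Simulated balanced data with the same layout as the planks experiment.
sim <- expand.grid(depth = c(1, 3, 5, 7, 9),
                   width = factor(1:3),
                   plank = factor(1:20))
plank_shift <- rnorm(20, sd = 0.5)               # hypothetical plank levels
sim$humidity <- 5 + plank_shift[as.integer(sim$plank)] -
  0.05 * (sim$depth - 5)^2 +                     # hypothetical quadratic depth profile
  rnorm(nrow(sim), sd = 0.3)                     # hypothetical residual noise

# Depth as a factor (4 df) versus a quadratic in numeric depth (2 df).
fit_factor <- lm(humidity ~ width + factor(depth) + plank, data = sim)
fit_quad   <- lm(humidity ~ width + depth + I(depth^2) + plank, data = sim)

# F-test of whether the quadratic description loses anything
# compared with one free level per depth.
lof <- anova(fit_quad, fit_factor)
lof
```

The quadratic model is nested in the factor model, so the comparison uses the 2 degrees of freedom by which the two descriptions of depth differ.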
3.5 R-TUTORIAL: Creating report ready tables and figures

Since reports without raw R code or raw R output are requested both in this course and generally, it is useful to be able to apply some of the tools in R for creating nice tables (and figures) for LaTeX and/or Word-based report writing.

Plot devices

First of all, there are different device functions for saving plots in various formats, e.g. to save a plot as a pdf, write:

    pdf("myplanksinteractionplot.pdf")
    with(planks, interaction.plot(depth, width, humidity, legend = FALSE, col = 2:4))
    dev.off()

Or as a png (you choose the extension of the output file yourself, but it is clearly highly recommended to choose the right extension):

    png("myplanksinteractionplot.png")
    with(planks, interaction.plot(depth, width, humidity, legend = FALSE, col = 2:4))
    dev.off()

And similarly there are bmp and jpeg device functions. Plots can also be exported directly from the Plots window in RStudio.

Plotting with colours

Colours can be specified in several different ways, and various plot functions may have various colour options for colouring different aspects of the plot. The simplest way to specify a colour is with a character string giving the colour name (e.g., "red"). A list of the possible colours can be obtained with the function colours; write:

    colors(distinct = FALSE)

to see all the possible choices. Have a look at this website to see what all these colours look like, or go to the QuickR website.

Even more easily, you can use integers as colour codes. By default R uses a palette of 8 colours:

    palette()
    [1] "black"   "red"     "green3"  "blue"    "cyan"    "magenta" "yellow"
    [8] "gray"

which can then be referred to by the numbers 1-8. Larger numbers cycle modulo 8, meaning that using 9 would give black again.
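The cycling of integer colour codes can be made explicit with a small helper. This is a sketch only: `wrap_colour` is a hypothetical function written for illustration, not part of R, and the colour names returned by `palette()` depend on the R version in use.

```r
pal <- palette()   # current palette; its entries differ between R versions
n   <- length(pal)

# Hypothetical helper: map a positive integer colour code to a palette entry,
# wrapping around when the code exceeds the palette length.
wrap_colour <- function(code, pal) {
  pal[(code - 1) %% length(pal) + 1]
}

wrap_colour(1, pal)       # first palette entry
wrap_colour(n + 1, pal)   # one past the end wraps around to the first entry
```

With the default palette of 8 colours, `n + 1` is 9, which is the case described in the text.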

There are a number of pre-defined palettes that can be used when a larger (and better) collection of colours is needed, e.g. the functions heat.colors, rainbow and hsv. For help, write:

    ?heat.colors

These could then be used e.g. as:

    par(mfrow = c(2, 2))
    with(planks, interaction.plot(width, plank, humidity, legend = FALSE, col = heat.colors(20)))
    with(planks, interaction.plot(depth, plank, humidity, legend = FALSE, col = terrain.colors(20)))
    with(planks, interaction.plot(width, depth, humidity, legend = FALSE, col = topo.colors(5)))
    with(planks, interaction.plot(depth, width, humidity, legend = FALSE, col = cm.colors(3)))
    par(mfrow = c(1, 1))

Or:

    # Rainbow colours: the argument gives the number of colours wanted,
    # e.g. rainbow(10) gives 10 different colours and rainbow(5) gives 5 colours
    par(mfrow = c(2, 2))
    with(planks, interaction.plot(width, plank, humidity, legend = FALSE, col = rainbow(20)))
    with(planks, interaction.plot(depth, plank, humidity, legend = FALSE, col = rainbow(20)))
    with(planks, interaction.plot(width, depth, humidity, legend = FALSE, col = rainbow(5)))
    with(planks, interaction.plot(depth, width, humidity, legend = FALSE, col = rainbow(3)))
    par(mfrow = c(1, 1))

Or:

    par(mfrow = c(2, 2))
    with(planks, interaction.plot(width, plank, humidity, legend = FALSE, col = hsv(1:20/20)))
    with(planks, interaction.plot(depth, plank, humidity, legend = FALSE, col = hsv(1:20/20)))
    with(planks, interaction.plot(width, depth, humidity, legend = FALSE, col = hsv(1:5/5)))
    with(planks, interaction.plot(depth, width, humidity, legend = FALSE, col = hsv(1:3/3)))
    par(mfrow = c(1, 1))

Report ready tables with xtable

Nice tables can be produced by the xtable function of the xtable package. An example:

    library(xtable)
    means <- as.matrix(with(planks, tapply(humidity, width, mean)))
    xtable(means)

    % latex table generated in R by xtable package
    % Fri Sep 18 13:46:
    \begin{table}[ht]
    \centering
    \begin{tabular}{rr}
      \hline
     & x \\
      \hline
    1 & 5.51 \\
      2 & 5.79 \\
      3 & 5.10 \\
       \hline
    \end{tabular}
    \end{table}

When this tex-code is included in your tex-file it will appear in the report as:

         x
    1    5.51
    2    5.79
    3    5.10

Note how the input to xtable was a matrix here. The function is prepared to recognize a number of different R objects, see e.g.:

    methods(xtable)
     [1] xtable.anova*           xtable.aov*
     [3] xtable.aovlist*         xtable.coxph*
     [5] xtable.data.frame*      xtable.glm*
     [7] xtable.lm*              xtable.matrix*
     [9] xtable.prcomp*          xtable.summary.aov*
    [11] xtable.summary.aovlist* xtable.summary.glm*
    [13] xtable.summary.lm*      xtable.summary.prcomp*
    [15] xtable.table*           xtable.ts*
    [17] xtable.zoo*
    see ?methods for accessing help and source code

For instance, ANOVA tables will be recognized. A LaTeX user can then copy these tex-lines into the .tex document of the report. Alternatively, to integrate the R code into the tex-code, use the knitr package to create the pure tex-file from an .Rnw file, which is a kind of tex-file with all the R code integrated into it, with a lot of flexibility in controlling what will be shown/evaluated in the output. This can be used for raw code/results as well as tables and figures.

A Word user may also use xtable through the html print option:

    print(xtable(means), type = "html")

    <!-- html table generated in R by xtable package -->
    <!-- Fri Sep 18 13:46: -->
    <table border=1>
    <tr> <th>  </th> <th> x </th>  </tr>
      <tr> <td align="right"> 1 </td> <td align="right"> 5.51 </td> </tr>
      <tr> <td align="right"> 2 </td> <td align="right"> 5.79 </td> </tr>
      <tr> <td align="right"> 3 </td> <td align="right"> 5.10 </td> </tr>
       </table>

And then print the table directly into a file:

    print(xtable(means), type = "html", file = "myhtmltable.html")

Open the file in a browser and copy-paste to Word.

3.6 R-TUTORIAL: Initial explorative analysis

The data set planks is imported as described in R Module 1. Assume that the data set is called planks in R. The plots in figure 3.2 are produced using the function interaction.plot, which requires three arguments: first the factor that is to be on the x-axis, then the factor that separates the data into distinct graphs, and finally the response variable. An optional parameter legend, which takes either FALSE (F) or TRUE (T), specifies whether or not a legend should be added (relating the graphs to the factor levels).

    planks <- read.table("planks.txt", header = TRUE, sep = ",")
    par(mfrow = c(2, 2))
    with(planks, interaction.plot(width, plank, humidity, legend = FALSE, col = 2:11))
    with(planks, interaction.plot(depth, plank, humidity, legend = FALSE, col = 2:11))
    with(planks, interaction.plot(width, depth, humidity, legend = FALSE, col = 2:11))
    with(planks, interaction.plot(depth, width, humidity, legend = FALSE, col = 2:11))

Notice that the with(...) function around the interaction.plot statements results in evaluation of the statements within a frame where the data set planks is available. This approach avoids having to attach data sets.

To obtain all four plots in a two-by-two setup exactly like in figure 3.2, the statement par(mfrow=c(2,2)) should be issued prior to the above with statements. As already mentioned in R Module 1, the function par is used to set a variety of graphical parameters (try typing ?par for details). The parameter mfrow is a vector of length two, where the first component is the number of rows on the graphical device and the second component is the number of columns. To return to the default use par(mfrow=c(1,1)).

3.7 Test of overall effects/model reduction

In the previous section we did not need to define factors (Module 2) to use interaction.plot, but now we do. Configure the three variables depth, plank and width as factors:

    planks$plank <- factor(planks$plank)
    planks$depth <- factor(planks$depth)
    planks$width <- factor(planks$width)

Analysis of models including random effects can be done using the lmer function in the package lme4. The general model with a fixed-effects structure consisting of the interaction between two factors and random effects assigned to the plank is specified as follows

    library(lmerTest)  # also loads lme4; its lmer() enables the F-tests used below
    model1 <- lmer(humidity ~ depth*width + (1 | plank), data = planks)

Notice that the fixed-effects structure is specified as either depth+width+depth:width or, in the shorter form used here, depth*width; they give the same result. The relevant tests of the fixed-effects structure are obtained by applying anova(model1). In current versions of lmerTest the model must be fitted after the package has been loaded, as above:

    anova(model1)

    Analysis of Variance Table of type III with Satterthwaite
    approximation for degrees of freedom
    [columns: Sum Sq, Mean Sq, NumDF, DenDF, F.value, Pr(>F); rows: depth, width, depth:width.
     Most numerical values are not preserved in this transcription; Pr(>F) is printed as
     < 2.2e-16 for depth and with exponent e-12 for width, both flagged ***.]

Or using xtable:

    xtable(anova(model1))

[Table: Sum Sq, Mean Sq, NumDF, DenDF, F.value and Pr(>F) for depth, width and depth:width. The numerical values are not preserved in this transcription.]

or using Anova from the car package:

    require(car)
    # F-tests for lmer fits in car::Anova rely on the pbkrtest package
    xtable(Anova(model1, test.statistic = "F", type = 3))

[Table: F, Df, Df.res and Pr(>F) for (Intercept), depth, width and depth:width. The numerical values are not preserved in this transcription.]

The interaction is not significant and a reduced model can be formulated:

    model2 <- lmer(humidity ~ depth + width + (1 | plank), data = planks)
    xtable(anova(model2))

[Table: Sum Sq, Mean Sq, NumDF, DenDF, F.value and Pr(>F) for depth and width. The numerical values are not preserved in this transcription.]

Both factors are highly significant and no further reduction is possible.

3.8 R-TUTORIAL: Post hoc analysis and summarizing the results

The so-called likelihood profile based confidence intervals for the two variance parameters are found as:

    summary(model2)$varcor

    [output: Groups, Name and Std.Dev. for plank (Intercept) and Residual; values not preserved]

    m2prof <- profile(model2, which = 1:2)
    xtable(confint(m2prof))

[Table: 2.5 % and 97.5 % limits for .sig01 and .sigma. The numerical values are not preserved in this transcription.]

As in R Module 1 we can use lsmeans to compute the estimated mean levels and their differences:

    require(lsmeans)
    lsmeans::lsmeans(model2, pairwise ~ depth)

    $lsmeans
    [columns: depth, lsmean, SE, df, lower.CL, upper.CL; values not preserved]
    Results are averaged over the levels of: width
    Confidence level used: 0.95

    $contrasts
    [columns: contrast, estimate, SE, df, t.ratio, p.value; values not preserved,
     six of the ten p-values are printed as <.0001]
    Results are averaged over the levels of: width
    P value adjustment: tukey method for comparing a family of 5 estimates

    lsmeans::lsmeans(model2, pairwise ~ width)

    $lsmeans
    [columns: width, lsmean, SE, df, lower.CL, upper.CL; values not preserved]
    Results are averaged over the levels of: depth
    Confidence level used: 0.95

    $contrasts
    [columns: contrast, estimate, SE, df, t.ratio, p.value; values not preserved,
     two of the three p-values are printed as <.0001]
    Results are averaged over the levels of: depth
    P value adjustment: tukey method for comparing a family of 3 estimates

or used together with the xtable function:

    print(xtable(summary(lsmeans::lsmeans(model2, pairwise ~ depth)$lsmeans)))

[Table: depth, lsmean, SE, df, lower.CL and upper.CL. The numerical values are not preserved in this transcription.]

    print(xtable(summary(lsmeans::lsmeans(model2, pairwise ~ width)$lsmeans)))

[Table: width, lsmean, SE, df, lower.CL and upper.CL. The numerical values are not preserved in this transcription.]

The multcomp package also includes the so-called compact letter displays:

    require(multcomp)
    tuk2 <- glht(model2, linfct = mcp(depth = "Tukey"))
    tuk.cld2 <- cld(tuk2)
    tuk.cld2

       1    3    5    7    9
     "a" "bc"  "c"  "b"  "a"
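The Tukey adjustment reported in the output above can be related to the unadjusted and Bonferroni-adjusted comparisons by looking at the critical values themselves. The following is a minimal sketch using only base R; the variable names are hypothetical, and the error degrees of freedom (274) and the number of depth levels (5) are taken from the text. Each value is the multiplier of the standard error of a difference, $\sqrt{2\hat\sigma^2/60}$, that a difference must exceed to be claimed significant at the 5% level.

```r
df_error <- 274                    # error degrees of freedom of the final model
n_depths <- 5                      # number of depth levels
n_pairs  <- choose(n_depths, 2)    # number of pairwise depth comparisons

# Unadjusted t-test (LSD), Bonferroni and Tukey critical values,
# all expressed on the scale of the t-statistic.
t_unadj <- qt(1 - 0.05 / 2, df_error)
t_bonf  <- qt(1 - (0.05 / n_pairs) / 2, df_error)
t_tukey <- qtukey(0.95, n_depths, df_error) / sqrt(2)

# With only two groups the studentized range reduces to the t-distribution.
two_group <- qtukey(0.95, 2, df_error) / sqrt(2)

c(unadjusted = t_unadj, tukey = t_tukey, bonferroni = t_bonf)
```

The ordering of the three values illustrates the point made in the text: both corrections demand larger differences than the plain t-test, and Bonferroni is the more conservative of the two here.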

    ### use sufficiently large upper margin
    old.par <- par(mai = c(1, 1, 1.25, 1), no.readonly = TRUE)
    plot(tuk.cld2, col = 2:6)
    par(old.par)

[Figure: linear predictor plotted against depth, with the compact letter display above each depth level.]

    tuk2 <- glht(model2, linfct = mcp(width = "Tukey"))
    tuk.cld2 <- cld(tuk2)
    tuk.cld2

       1    2    3
     "b"  "c"  "a"

    ### use sufficiently large upper margin
    old.par <- par(mai = c(1, 1, 1.25, 1), no.readonly = TRUE)
    plot(tuk.cld2, col = 2:6)
    par(old.par)

[Figure: linear predictor plotted against width, with the compact letter display above each width level.]

The lmerTest package also offers a post hoc analysis of differences of lsmeans (based on Satterthwaite's DF method) together with some plotting:

    summodel2 <- step(model2, reduce.fixed = FALSE, reduce.random = FALSE)
    ## Tests for random effects
    xtable(summodel2$rand.table)

[Table: Chi.sq, Chi.DF and p.value for plank. The numerical values are not preserved in this transcription.]

    ## Tests for fixed effects
    xtable(summodel2$anova.table)

[Table: Sum Sq, Mean Sq, NumDF, DenDF, F.value and Pr(>F) for depth and width. The numerical values are not preserved in this transcription.]

    ## LSMEANS table
    names(summodel2$lsmeans.table)[4] <- "SE"
    names(summodel2$lsmeans.table)[7] <- "LowCI"
    names(summodel2$lsmeans.table)[8] <- "UppCI"
    xtable(summodel2$lsmeans.table)

[Table: depth, width, Estimate, SE, DF, t-value, LowCI, UppCI and p-value, with five depth rows and three width rows. The numerical values are not preserved in this transcription.]

    ## DIFF LSMEANS table
    xtable(summodel2$diffs.lsmeans.table)
    ## Plots of all LSMEANS and DIFFLSMEANS:
    plot(summodel2)

[Table: differences of LSMEANS with columns Estimate, Standard Error, DF, t-value, Lower CI, Upper CI and p-value; ten rows of depth differences and three rows of width differences. The numerical values are not preserved in this transcription.]

[Figure: plots of the LSMEANS and differences of LSMEANS for depth and width (response: humidity), with a legend for significance levels (NS and P-value thresholds).]

Using the generic plotting of LSMEANS and DIFFLSMEANS from the lmerTest package like this currently has the (unfortunate) feature that it ignores any mfrow setting for multiple plots per page and simply lists the plots on a number of pages, with one plot per page.

3.9 Exercises

Exercise 1 Colour of spinach

Spinach heated to 90 or 100 degrees Celsius was vacuum packed and stored for 0, 1 or 2 weeks before the packs were opened and chill stored in normal atmosphere for 0, 1 or 2 days. Then the colour was measured on a Hunter Lab. Two of the colour coordinates, a and b (measuring respectively something like red and yellow colour), were recorded and are given in the data set below. The variable batch is a blocking variable referring to two batches of spinach. The data is available here and listed below:

[Data listing with columns Batch, temp, weeks, days, a and b; 36 rows (18 for batch A and 18 for batch B). The numerical values are not preserved in this transcription.]

a) Write down all the factors relevant for the analysis, and their levels and mutual structure. Are they crossed or nested, for example? Make the factor structure diagram.

b) Analyse the effect of the different factors on the two colour measurements and summarize the significant effects (lsmeans etc.).

Exercise 2 Sensory evaluation of spinach

In the spinach experiment from exercise 1, sensory evaluations were performed besides the colour measurements. The treatments were still the same, so the factors were heating temperature, original storage (weeks), storage after opening (days), and batch. The products from each treatment combination from each batch were assessed by (some of) 7 assessors, who gave a score (between 0 and 15) for each of 6 different sensory properties (see the list further below). There was one session for each combination of batch and weeks, and at each session the assessors evaluated the same 6 products (6 combinations of days and temperature). Note that not all assessors were present at all sessions.

The results, with one line per evaluation, are given in the order: weeks of storage, days after opening, batch, temperature, session number, assessor number, and the six sensory properties hay flavour 1, hay flavour 2, hay taste, spinach flavour 1, spinach flavour 2, spinach taste. The data is available here and listed partly below:

[Data listing: 252 lines in total, running from weeks 0, days 0, batch A to weeks 2, days 2, batch B. The remaining values are not preserved in this transcription.]

a) Write down the factors relevant for the analysis, and their levels and mutual structure. [You should include a production factor corresponding to the combinations of temperature, weeks, days, and batch.]

b) Specify which effects you want to include in the model. Pay particular attention to which interactions you want in the model. [Include at least some of the interactions between assessor and treatment factors.] Which effects are random and which are fixed?

c) Perform the analysis for one of the sensory properties and draw conclusions.


More information

Statistics Lab #7 ANOVA Part 2 & ANCOVA

Statistics Lab #7 ANOVA Part 2 & ANCOVA Statistics Lab #7 ANOVA Part 2 & ANCOVA PSYCH 710 7 Initialize R Initialize R by entering the following commands at the prompt. You must type the commands exactly as shown. options(contrasts=c("contr.sum","contr.poly")

More information

One Factor Experiments

One Factor Experiments One Factor Experiments 20-1 Overview Computation of Effects Estimating Experimental Errors Allocation of Variation ANOVA Table and F-Test Visual Diagnostic Tests Confidence Intervals For Effects Unequal

More information

Example 5.25: (page 228) Screenshots from JMP. These examples assume post-hoc analysis using a Protected LSD or Protected Welch strategy.

Example 5.25: (page 228) Screenshots from JMP. These examples assume post-hoc analysis using a Protected LSD or Protected Welch strategy. JMP Output from Chapter 5 Factorial Analysis through JMP Example 5.25: (page 228) Screenshots from JMP. These examples assume post-hoc analysis using a Protected LSD or Protected Welch strategy. Fitting

More information

Model Selection and Inference

Model Selection and Inference Model Selection and Inference Merlise Clyde January 29, 2017 Last Class Model for brain weight as a function of body weight In the model with both response and predictor log transformed, are dinosaurs

More information

Regression Analysis and Linear Regression Models

Regression Analysis and Linear Regression Models Regression Analysis and Linear Regression Models University of Trento - FBK 2 March, 2015 (UNITN-FBK) Regression Analysis and Linear Regression Models 2 March, 2015 1 / 33 Relationship between numerical

More information

General Factorial Models

General Factorial Models In Chapter 8 in Oehlert STAT:5201 Week 9 - Lecture 2 1 / 34 It is possible to have many factors in a factorial experiment. In DDD we saw an example of a 3-factor study with ball size, height, and surface

More information

Chapter 3. Bootstrap. 3.1 Introduction. 3.2 The general idea

Chapter 3. Bootstrap. 3.1 Introduction. 3.2 The general idea Chapter 3 Bootstrap 3.1 Introduction The estimation of parameters in probability distributions is a basic problem in statistics that one tends to encounter already during the very first course on the subject.

More information

Chemical Reaction dataset ( https://stat.wvu.edu/~cjelsema/data/chemicalreaction.txt )

Chemical Reaction dataset ( https://stat.wvu.edu/~cjelsema/data/chemicalreaction.txt ) JMP Output from Chapter 9 Factorial Analysis through JMP Chemical Reaction dataset ( https://stat.wvu.edu/~cjelsema/data/chemicalreaction.txt ) Fitting the Model and checking conditions Analyze > Fit Model

More information

Pair-Wise Multiple Comparisons (Simulation)

Pair-Wise Multiple Comparisons (Simulation) Chapter 580 Pair-Wise Multiple Comparisons (Simulation) Introduction This procedure uses simulation analyze the power and significance level of three pair-wise multiple-comparison procedures: Tukey-Kramer,

More information

Statistical Bioinformatics (Biomedical Big Data) Notes 2: Installing and Using R

Statistical Bioinformatics (Biomedical Big Data) Notes 2: Installing and Using R Statistical Bioinformatics (Biomedical Big Data) Notes 2: Installing and Using R In this course we will be using R (for Windows) for most of our work. These notes are to help students install R and then

More information

Lab 5 - Risk Analysis, Robustness, and Power

Lab 5 - Risk Analysis, Robustness, and Power Type equation here.biology 458 Biometry Lab 5 - Risk Analysis, Robustness, and Power I. Risk Analysis The process of statistical hypothesis testing involves estimating the probability of making errors

More information

Yelp Star Rating System Reviewed: Are Star Ratings inline with textual reviews?

Yelp Star Rating System Reviewed: Are Star Ratings inline with textual reviews? Yelp Star Rating System Reviewed: Are Star Ratings inline with textual reviews? Eduardo Magalhaes Barbosa 17 de novembro de 2015 1 Introduction Star classification features are ubiquitous in apps world,

More information

General Factorial Models

General Factorial Models In Chapter 8 in Oehlert STAT:5201 Week 9 - Lecture 1 1 / 31 It is possible to have many factors in a factorial experiment. We saw some three-way factorials earlier in the DDD book (HW 1 with 3 factors:

More information

Resources for statistical assistance. Quantitative covariates and regression analysis. Methods for predicting continuous outcomes.

Resources for statistical assistance. Quantitative covariates and regression analysis. Methods for predicting continuous outcomes. Resources for statistical assistance Quantitative covariates and regression analysis Carolyn Taylor Applied Statistics and Data Science Group (ASDa) Department of Statistics, UBC January 24, 2017 Department

More information

36-402/608 HW #1 Solutions 1/21/2010

36-402/608 HW #1 Solutions 1/21/2010 36-402/608 HW #1 Solutions 1/21/2010 1. t-test (20 points) Use fullbumpus.r to set up the data from fullbumpus.txt (both at Blackboard/Assignments). For this problem, analyze the full dataset together

More information

A Knitr Demo. Charles J. Geyer. February 8, 2017

A Knitr Demo. Charles J. Geyer. February 8, 2017 A Knitr Demo Charles J. Geyer February 8, 2017 1 Licence This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License http://creativecommons.org/licenses/by-sa/4.0/.

More information

NCSS Statistical Software

NCSS Statistical Software Chapter 245 Introduction This procedure generates R control charts for variables. The format of the control charts is fully customizable. The data for the subgroups can be in a single column or in multiple

More information

Section 4 General Factorial Tutorials

Section 4 General Factorial Tutorials Section 4 General Factorial Tutorials General Factorial Part One: Categorical Introduction Design-Ease software version 6 offers a General Factorial option on the Factorial tab. If you completed the One

More information

Tutorial for the SensMixed application

Tutorial for the SensMixed application Tutorial for the SensMixed application Alexandra Kuznetsova 1. The SensMixed package - an overview The SensMixed package is an R package for analysing Sensory and Consumer data in a mixed model framework

More information

E-Campus Inferential Statistics - Part 2

E-Campus Inferential Statistics - Part 2 E-Campus Inferential Statistics - Part 2 Group Members: James Jones Question 4-Isthere a significant difference in the mean prices of the stores? New Textbook Prices New Price Descriptives 95% Confidence

More information

For our example, we will look at the following factors and factor levels.

For our example, we will look at the following factors and factor levels. In order to review the calculations that are used to generate the Analysis of Variance, we will use the statapult example. By adjusting various settings on the statapult, you are able to throw the ball

More information

The theory of the linear model 41. Theorem 2.5. Under the strong assumptions A3 and A5 and the hypothesis that

The theory of the linear model 41. Theorem 2.5. Under the strong assumptions A3 and A5 and the hypothesis that The theory of the linear model 41 Theorem 2.5. Under the strong assumptions A3 and A5 and the hypothesis that E(Y X) =X 0 b 0 0 the F-test statistic follows an F-distribution with (p p 0, n p) degrees

More information

Organizing data in R. Fitting Mixed-Effects Models Using the lme4 Package in R. R packages. Accessing documentation. The Dyestuff data set

Organizing data in R. Fitting Mixed-Effects Models Using the lme4 Package in R. R packages. Accessing documentation. The Dyestuff data set Fitting Mixed-Effects Models Using the lme4 Package in R Deepayan Sarkar Fred Hutchinson Cancer Research Center 18 September 2008 Organizing data in R Standard rectangular data sets (columns are variables,

More information

Stat 411/511 MULTIPLE COMPARISONS. Charlotte Wickham. stat511.cwick.co.nz. Nov

Stat 411/511 MULTIPLE COMPARISONS. Charlotte Wickham. stat511.cwick.co.nz. Nov Stat 411/511 MULTIPLE COMPARISONS Nov 16 2015 Charlotte Wickham stat511.cwick.co.nz Thanksgiving week No lab material next week 11/24 & 11/25. Labs as usual this week. Lectures as usual Mon & Weds next

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression Rebecca C. Steorts, Duke University STA 325, Chapter 3 ISL 1 / 49 Agenda How to extend beyond a SLR Multiple Linear Regression (MLR) Relationship Between the Response and Predictors

More information

Lab #9: ANOVA and TUKEY tests

Lab #9: ANOVA and TUKEY tests Lab #9: ANOVA and TUKEY tests Objectives: 1. Column manipulation in SAS 2. Analysis of variance 3. Tukey test 4. Least Significant Difference test 5. Analysis of variance with PROC GLM 6. Levene test for

More information

We have seen that as n increases, the length of our confidence interval decreases, the confidence interval will be more narrow.

We have seen that as n increases, the length of our confidence interval decreases, the confidence interval will be more narrow. {Confidence Intervals for Population Means} Now we will discuss a few loose ends. Before moving into our final discussion of confidence intervals for one population mean, let s review a few important results

More information

Multi-Factored Experiments

Multi-Factored Experiments Design and Analysis of Multi-Factored Experiments Advanced Designs -Hard to Change Factors- Split-Plot Design and Analysis L. M. Lye DOE Course 1 Hard-to-Change Factors Assume that a factor can be varied,

More information

8. MINITAB COMMANDS WEEK-BY-WEEK

8. MINITAB COMMANDS WEEK-BY-WEEK 8. MINITAB COMMANDS WEEK-BY-WEEK In this section of the Study Guide, we give brief information about the Minitab commands that are needed to apply the statistical methods in each week s study. They are

More information

Demo yeast mutant analysis

Demo yeast mutant analysis Demo yeast mutant analysis Jean-Yves Sgro February 20, 2018 Contents 1 Analysis of yeast growth data 1 1.1 Set working directory........................................ 1 1.2 List all files in directory.......................................

More information

lme for SAS PROC MIXED Users

lme for SAS PROC MIXED Users lme for SAS PROC MIXED Users Douglas M. Bates Department of Statistics University of Wisconsin Madison José C. Pinheiro Bell Laboratories Lucent Technologies 1 Introduction The lme function from the nlme

More information

Tutorial for the SensMixed application

Tutorial for the SensMixed application Tutorial for the SensMixed application Alexandra Kuznetsova, Per Bruun Brockhoff 1. The SensMixed package - an overview The SensMixed package is an R package for analysing Sensory and Consumer data in

More information

The linear mixed model: modeling hierarchical and longitudinal data

The linear mixed model: modeling hierarchical and longitudinal data The linear mixed model: modeling hierarchical and longitudinal data Analysis of Experimental Data AED The linear mixed model: modeling hierarchical and longitudinal data 1 of 44 Contents 1 Modeling Hierarchical

More information

We show that the composite function h, h(x) = g(f(x)) is a reduction h: A m C.

We show that the composite function h, h(x) = g(f(x)) is a reduction h: A m C. 219 Lemma J For all languages A, B, C the following hold i. A m A, (reflexive) ii. if A m B and B m C, then A m C, (transitive) iii. if A m B and B is Turing-recognizable, then so is A, and iv. if A m

More information

D-Optimal Designs. Chapter 888. Introduction. D-Optimal Design Overview

D-Optimal Designs. Chapter 888. Introduction. D-Optimal Design Overview Chapter 888 Introduction This procedure generates D-optimal designs for multi-factor experiments with both quantitative and qualitative factors. The factors can have a mixed number of levels. For example,

More information

Logical operators: R provides an extensive list of logical operators. These include

Logical operators: R provides an extensive list of logical operators. These include meat.r: Explanation of code Goals of code: Analyzing a subset of data Creating data frames with specified X values Calculating confidence and prediction intervals Lists and matrices Only printing a few

More information

Cpk: What is its Capability? By: Rick Haynes, Master Black Belt Smarter Solutions, Inc.

Cpk: What is its Capability? By: Rick Haynes, Master Black Belt Smarter Solutions, Inc. C: What is its Capability? By: Rick Haynes, Master Black Belt Smarter Solutions, Inc. C is one of many capability metrics that are available. When capability metrics are used, organizations typically provide

More information

Multiple Linear Regression: Global tests and Multiple Testing

Multiple Linear Regression: Global tests and Multiple Testing Multiple Linear Regression: Global tests and Multiple Testing Author: Nicholas G Reich, Jeff Goldsmith This material is part of the statsteachr project Made available under the Creative Commons Attribution-ShareAlike

More information

A (very) brief introduction to R

A (very) brief introduction to R A (very) brief introduction to R You typically start R at the command line prompt in a command line interface (CLI) mode. It is not a graphical user interface (GUI) although there are some efforts to produce

More information

SPSS INSTRUCTION CHAPTER 9

SPSS INSTRUCTION CHAPTER 9 SPSS INSTRUCTION CHAPTER 9 Chapter 9 does no more than introduce the repeated-measures ANOVA, the MANOVA, and the ANCOVA, and discriminant analysis. But, you can likely envision how complicated it can

More information

pairwise.t.test(dataset$measurement, dataset$group, p.adj = bonferroni ) TukeyHSD(aov(dataset$measurement~dataset$group))

pairwise.t.test(dataset$measurement, dataset$group, p.adj = bonferroni ) TukeyHSD(aov(dataset$measurement~dataset$group)) Tutorial 9: Comparing Three or More Groups One-way (single-factor) ANOVA (analysis of variance) Used to compare means of 3 or more groups based on a single explanatory (independent) variable, or factor.

More information

Table of Contents (As covered from textbook)

Table of Contents (As covered from textbook) Table of Contents (As covered from textbook) Ch 1 Data and Decisions Ch 2 Displaying and Describing Categorical Data Ch 3 Displaying and Describing Quantitative Data Ch 4 Correlation and Linear Regression

More information

Multiple Comparisons of Treatments vs. a Control (Simulation)

Multiple Comparisons of Treatments vs. a Control (Simulation) Chapter 585 Multiple Comparisons of Treatments vs. a Control (Simulation) Introduction This procedure uses simulation to analyze the power and significance level of two multiple-comparison procedures that

More information

EXST 7014, Lab 1: Review of R Programming Basics and Simple Linear Regression

EXST 7014, Lab 1: Review of R Programming Basics and Simple Linear Regression EXST 7014, Lab 1: Review of R Programming Basics and Simple Linear Regression OBJECTIVES 1. Prepare a scatter plot of the dependent variable on the independent variable 2. Do a simple linear regression

More information

BIO 360: Vertebrate Physiology Lab 9: Graphing in Excel. Lab 9: Graphing: how, why, when, and what does it mean? Due 3/26

BIO 360: Vertebrate Physiology Lab 9: Graphing in Excel. Lab 9: Graphing: how, why, when, and what does it mean? Due 3/26 Lab 9: Graphing: how, why, when, and what does it mean? Due 3/26 INTRODUCTION Graphs are one of the most important aspects of data analysis and presentation of your of data. They are visual representations

More information

STATS PAD USER MANUAL

STATS PAD USER MANUAL STATS PAD USER MANUAL For Version 2.0 Manual Version 2.0 1 Table of Contents Basic Navigation! 3 Settings! 7 Entering Data! 7 Sharing Data! 8 Managing Files! 10 Running Tests! 11 Interpreting Output! 11

More information

Regression Lab 1. The data set cholesterol.txt available on your thumb drive contains the following variables:

Regression Lab 1. The data set cholesterol.txt available on your thumb drive contains the following variables: Regression Lab The data set cholesterol.txt available on your thumb drive contains the following variables: Field Descriptions ID: Subject ID sex: Sex: 0 = male, = female age: Age in years chol: Serum

More information

SAS data statements and data: /*Factor A: angle Factor B: geometry Factor C: speed*/

SAS data statements and data: /*Factor A: angle Factor B: geometry Factor C: speed*/ STAT:5201 Applied Statistic II (Factorial with 3 factors as 2 3 design) Three-way ANOVA (Factorial with three factors) with replication Factor A: angle (low=0/high=1) Factor B: geometry (shape A=0/shape

More information

Lecture 13: Model selection and regularization

Lecture 13: Model selection and regularization Lecture 13: Model selection and regularization Reading: Sections 6.1-6.2.1 STATS 202: Data mining and analysis October 23, 2017 1 / 17 What do we know so far In linear regression, adding predictors always

More information

BART STAT8810, Fall 2017

BART STAT8810, Fall 2017 BART STAT8810, Fall 2017 M.T. Pratola November 1, 2017 Today BART: Bayesian Additive Regression Trees BART: Bayesian Additive Regression Trees Additive model generalizes the single-tree regression model:

More information

The Kenton Study. (Applied Linear Statistical Models, 5th ed., pp , Kutner et al., 2005) Page 1 of 5

The Kenton Study. (Applied Linear Statistical Models, 5th ed., pp , Kutner et al., 2005) Page 1 of 5 The Kenton Study The Kenton Food Company wished to test four different package designs for a new breakfast cereal. Twenty stores, with approximately equal sales volumes, were selected as the experimental

More information

An Experiment in Visual Clustering Using Star Glyph Displays

An Experiment in Visual Clustering Using Star Glyph Displays An Experiment in Visual Clustering Using Star Glyph Displays by Hanna Kazhamiaka A Research Paper presented to the University of Waterloo in partial fulfillment of the requirements for the degree of Master

More information

Regression on SAT Scores of 374 High Schools and K-means on Clustering Schools

Regression on SAT Scores of 374 High Schools and K-means on Clustering Schools Regression on SAT Scores of 374 High Schools and K-means on Clustering Schools Abstract In this project, we study 374 public high schools in New York City. The project seeks to use regression techniques

More information

Chapter 6: DESCRIPTIVE STATISTICS

Chapter 6: DESCRIPTIVE STATISTICS Chapter 6: DESCRIPTIVE STATISTICS Random Sampling Numerical Summaries Stem-n-Leaf plots Histograms, and Box plots Time Sequence Plots Normal Probability Plots Sections 6-1 to 6-5, and 6-7 Random Sampling

More information

Meet MINITAB. Student Release 14. for Windows

Meet MINITAB. Student Release 14. for Windows Meet MINITAB Student Release 14 for Windows 2003, 2004 by Minitab Inc. All rights reserved. MINITAB and the MINITAB logo are registered trademarks of Minitab Inc. All other marks referenced remain the

More information

Experiment 1 CH Fall 2004 INTRODUCTION TO SPREADSHEETS

Experiment 1 CH Fall 2004 INTRODUCTION TO SPREADSHEETS Experiment 1 CH 222 - Fall 2004 INTRODUCTION TO SPREADSHEETS Introduction Spreadsheets are valuable tools utilized in a variety of fields. They can be used for tasks as simple as adding or subtracting

More information

Multiple Regression White paper

Multiple Regression White paper +44 (0) 333 666 7366 Multiple Regression White paper A tool to determine the impact in analysing the effectiveness of advertising spend. Multiple Regression In order to establish if the advertising mechanisms

More information

2010 by Minitab, Inc. All rights reserved. Release Minitab, the Minitab logo, Quality Companion by Minitab and Quality Trainer by Minitab are

2010 by Minitab, Inc. All rights reserved. Release Minitab, the Minitab logo, Quality Companion by Minitab and Quality Trainer by Minitab are 2010 by Minitab, Inc. All rights reserved. Release 16.1.0 Minitab, the Minitab logo, Quality Companion by Minitab and Quality Trainer by Minitab are registered trademarks of Minitab, Inc. in the United

More information

Section 3.2: Multiple Linear Regression II. Jared S. Murray The University of Texas at Austin McCombs School of Business

Section 3.2: Multiple Linear Regression II. Jared S. Murray The University of Texas at Austin McCombs School of Business Section 3.2: Multiple Linear Regression II Jared S. Murray The University of Texas at Austin McCombs School of Business 1 Multiple Linear Regression: Inference and Understanding We can answer new questions

More information

Source df SS MS F A a-1 [A] [T] SS A. / MS S/A S/A (a)(n-1) [AS] [A] SS S/A. / MS BxS/A A x B (a-1)(b-1) [AB] [A] [B] + [T] SS AxB

Source df SS MS F A a-1 [A] [T] SS A. / MS S/A S/A (a)(n-1) [AS] [A] SS S/A. / MS BxS/A A x B (a-1)(b-1) [AB] [A] [B] + [T] SS AxB Keppel, G. Design and Analysis: Chapter 17: The Mixed Two-Factor Within-Subjects Design: The Overall Analysis and the Analysis of Main Effects and Simple Effects Keppel describes an Ax(BxS) design, which

More information

Introduction. About this Document. What is SPSS. ohow to get SPSS. oopening Data

Introduction. About this Document. What is SPSS. ohow to get SPSS. oopening Data Introduction About this Document This manual was written by members of the Statistical Consulting Program as an introduction to SPSS 12.0. It is designed to assist new users in familiarizing themselves

More information

An introduction to SPSS

An introduction to SPSS An introduction to SPSS To open the SPSS software using U of Iowa Virtual Desktop... Go to https://virtualdesktop.uiowa.edu and choose SPSS 24. Contents NOTE: Save data files in a drive that is accessible

More information

STATISTICS FOR PSYCHOLOGISTS

STATISTICS FOR PSYCHOLOGISTS STATISTICS FOR PSYCHOLOGISTS SECTION: JAMOVI CHAPTER: USING THE SOFTWARE Section Abstract: This section provides step-by-step instructions on how to obtain basic statistical output using JAMOVI, both visually

More information

Prepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order.

Prepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order. Chapter 2 2.1 Descriptive Statistics A stem-and-leaf graph, also called a stemplot, allows for a nice overview of quantitative data without losing information on individual observations. It can be a good

More information

HydroOffice Diagrams

HydroOffice Diagrams Hydro Office Software for Water Sciences HydroOffice Diagrams User Manual for Ternary 1.0, Piper 2.0 and Durov 1.0 tool HydroOffice.org Citation: Gregor M. 2013. HydroOffice Diagrams user manual for Ternary1.0,

More information

Factorial ANOVA. Skipping... Page 1 of 18

Factorial ANOVA. Skipping... Page 1 of 18 Factorial ANOVA The potato data: Batches of potatoes randomly assigned to to be stored at either cool or warm temperature, infected with one of three bacterial types. Then wait a set period. The dependent

More information

Journal of Statistical Software

Journal of Statistical Software JSS Journal of Statistical Software December 2017, Volume 82, Issue 13. doi: 10.18637/jss.v082.i13 lmertest Package: Tests in Linear Mixed Effects Models Alexandra Kuznetsova Technical University of Denmark

More information

Fractional. Design of Experiments. Overview. Scenario

Fractional. Design of Experiments. Overview. Scenario Design of Experiments Overview We are going to learn about DOEs. Specifically, you ll learn what a DOE is, as well as, what a key concept known as Confounding is all about. Finally, you ll learn what the

More information

Exercise: Graphing and Least Squares Fitting in Quattro Pro

Exercise: Graphing and Least Squares Fitting in Quattro Pro Chapter 5 Exercise: Graphing and Least Squares Fitting in Quattro Pro 5.1 Purpose The purpose of this experiment is to become familiar with using Quattro Pro to produce graphs and analyze graphical data.

More information

6. Advanced Topics in Computability

6. Advanced Topics in Computability 227 6. Advanced Topics in Computability The Church-Turing thesis gives a universally acceptable definition of algorithm Another fundamental concept in computer science is information No equally comprehensive

More information

Problem Set 3. MATH 778C, Spring 2009, Austin Mohr (with John Boozer) April 15, 2009

Problem Set 3. MATH 778C, Spring 2009, Austin Mohr (with John Boozer) April 15, 2009 Problem Set 3 MATH 778C, Spring 2009, Austin Mohr (with John Boozer) April 15, 2009 1. Show directly that P 1 (s) P 1 (t) for all t s. Proof. Given G, let H s be a subgraph of G on s vertices such that

More information

AA BB CC DD EE. Introduction to Graphics in R

AA BB CC DD EE. Introduction to Graphics in R Introduction to Graphics in R Cori Mar 7/10/18 ### Reading in the data dat

More information

Stat 5303 (Oehlert): Response Surfaces 1

Stat 5303 (Oehlert): Response Surfaces 1 Stat 5303 (Oehlert): Response Surfaces 1 > data

More information

Introduction to Mixed Models: Multivariate Regression

Introduction to Mixed Models: Multivariate Regression Introduction to Mixed Models: Multivariate Regression EPSY 905: Multivariate Analysis Spring 2016 Lecture #9 March 30, 2016 EPSY 905: Multivariate Regression via Path Analysis Today s Lecture Multivariate

More information

R Programming: Worksheet 6

R Programming: Worksheet 6 R Programming: Worksheet 6 Today we ll study a few useful functions we haven t come across yet: all(), any(), `%in%`, match(), pmax(), pmin(), unique() We ll also apply our knowledge to the bootstrap.

More information

Descriptive Statistics, Standard Deviation and Standard Error

Descriptive Statistics, Standard Deviation and Standard Error AP Biology Calculations: Descriptive Statistics, Standard Deviation and Standard Error SBI4UP The Scientific Method & Experimental Design Scientific method is used to explore observations and answer questions.

More information

Salary 9 mo : 9 month salary for faculty member for 2004

Salary 9 mo : 9 month salary for faculty member for 2004 22s:52 Applied Linear Regression DeCook Fall 2008 Lab 3 Friday October 3. The data Set In 2004, a study was done to examine if gender, after controlling for other variables, was a significant predictor

More information

Today. Lecture 4: Last time. The EM algorithm. We examine clustering in a little more detail; we went over it a somewhat quickly last time

Today. Lecture 4: Last time. The EM algorithm. We examine clustering in a little more detail; we went over it a somewhat quickly last time Today Lecture 4: We examine clustering in a little more detail; we went over it a somewhat quickly last time The CAD data will return and give us an opportunity to work with curves (!) We then examine

More information

Introduction to R. Introduction to Econometrics W

Introduction to R. Introduction to Econometrics W Introduction to R Introduction to Econometrics W3412 Begin Download R from the Comprehensive R Archive Network (CRAN) by choosing a location close to you. Students are also recommended to download RStudio,

More information

Approximation Algorithms for Wavelength Assignment

Approximation Algorithms for Wavelength Assignment Approximation Algorithms for Wavelength Assignment Vijay Kumar Atri Rudra Abstract Winkler and Zhang introduced the FIBER MINIMIZATION problem in [3]. They showed that the problem is NP-complete but left

More information

A Multiple-Line Fitting Algorithm Without Initialization Yan Guo

A Multiple-Line Fitting Algorithm Without Initialization Yan Guo A Multiple-Line Fitting Algorithm Without Initialization Yan Guo Abstract: The commonest way to fit multiple lines is to use methods incorporate the EM algorithm. However, the EM algorithm dose not guarantee

More information

Things you ll know (or know better to watch out for!) when you leave in December: 1. What you can and cannot infer from graphs.

Things you ll know (or know better to watch out for!) when you leave in December: 1. What you can and cannot infer from graphs. 1 2 Things you ll know (or know better to watch out for!) when you leave in December: 1. What you can and cannot infer from graphs. 2. How to construct (in your head!) and interpret confidence intervals.

More information

Additional Issues: Random effects diagnostics, multiple comparisons

Additional Issues: Random effects diagnostics, multiple comparisons : Random diagnostics, multiple Austin F. Frank, T. Florian April 30, 2009 The dative dataset Original analysis in Bresnan et al (2007) Data obtained from languager (Baayen 2008) Data describing the realization

More information

THIS IS NOT REPRESNTATIVE OF CURRENT CLASS MATERIAL. STOR 455 Midterm 1 September 28, 2010

THIS IS NOT REPRESNTATIVE OF CURRENT CLASS MATERIAL. STOR 455 Midterm 1 September 28, 2010 THIS IS NOT REPRESNTATIVE OF CURRENT CLASS MATERIAL STOR 455 Midterm September 8, INSTRUCTIONS: BOTH THE EXAM AND THE BUBBLE SHEET WILL BE COLLECTED. YOU MUST PRINT YOUR NAME AND SIGN THE HONOR PLEDGE

More information

STAT 2607 REVIEW PROBLEMS Word problems must be answered in words of the problem.

STAT 2607 REVIEW PROBLEMS Word problems must be answered in words of the problem. STAT 2607 REVIEW PROBLEMS 1 REMINDER: On the final exam 1. Word problems must be answered in words of the problem. 2. "Test" means that you must carry out a formal hypothesis testing procedure with H0,

More information

Package simr. April 30, 2018

Package simr. April 30, 2018 Type Package Package simr April 30, 2018 Title Power Analysis for Generalised Linear Mixed Models by Simulation Calculate power for generalised linear mixed models, using simulation. Designed to work with

More information

Modelling Proportions and Count Data

Modelling Proportions and Count Data Modelling Proportions and Count Data Rick White May 4, 2016 Outline Analysis of Count Data Binary Data Analysis Categorical Data Analysis Generalized Linear Models Questions Types of Data Continuous data:

More information

Sweave Dynamic Interaction of R and L A TEX

Sweave Dynamic Interaction of R and L A TEX Sweave Dynamic Interaction of R and L A TEX Nora Umbach Dezember 2009 Why would I need Sweave? Creating reports that can be updated automatically Statistic exercises Manuals with embedded examples (like

More information

Lab 1 Introduction to R

Lab 1 Introduction to R Lab 1 Introduction to R Date: August 23, 2011 Assignment and Report Due Date: August 30, 2011 Goal: The purpose of this lab is to get R running on your machines and to get you familiar with the basics

More information

Refactoring the xtable Package

Refactoring the xtable Package David J Scott 1 Daniel Geals 1 Paul Murrell 1 1 Department of Statistics, The University of Auckland July 10, 2015 Outline 1 Introduction 2 Analysis 3 Testing Outline 1 Introduction 2 Analysis 3 Testing

More information

Lab 1: Elementary image operations

Lab 1: Elementary image operations CSC, KTH DD2423 Image Analysis and Computer Vision : Lab 1: Elementary image operations The goal of the labs is to gain practice with some of the fundamental techniques in image processing on grey-level

More information

Exploring and Understanding Data Using R.

Exploring and Understanding Data Using R. Exploring and Understanding Data Using R. Loading the data into an R data frame: variable

More information