eNote 3 Case study


Contents

3 Case study
    3.1 Introduction
    3.2 Initial explorative analysis
    3.3 Test of overall effects/model reduction
    3.4 Post hoc analysis and summarizing the results
        Estimates of the variance parameters
        Estimates of the fixed parameters
        Comparisons of the fixed parameters
    3.5 R-TUTORIAL: Creating report ready tables and figures
        Plot devices
        Plotting with colours
        Report ready tables with xtable
    3.6 R-TUTORIAL: Initial explorative analysis
    3.7 Test of overall effects/model reduction
    3.8 R-TUTORIAL: Post hoc analysis and summarizing the results
    Exercises

3.1 Introduction

This module consists of the first part of a complete analysis of the beech wood data presented as an example in Module 2. The aim is to show that the principles for data analysis and result summary for fixed ANOVA and/or regression models also apply to mixed models. Some readers may also find it helpful to have these principles reviewed. For completeness we repeat here the description and the initial factor structure considerations.

To investigate the effect of drying of beech wood on the humidity percentage, the following experiment was conducted. Each of 20 planks was dried for a certain period of time. Then the humidity percentage was measured at 5 depths and 3 widths for each plank:

    depth 1: close to the top
    depth 3: between 1 and 5
    depth 5: in the center
    depth 7: between 5 and 9
    depth 9: close to the bottom

    width 1: close to the side
    width 2: between 1 and 3
    width 3: in the center

So there are 3 × 5 = 15 measurements for each plank and 300 observations altogether. The data can be found in planks.txt and is reproduced in the following table.

[Table: humidity percentage for each of the 20 planks (rows), by depth within Width 1, Width 2 and Width 3 (columns); values not reproduced.]

In this experiment we have 3 factors apart from the trivial factors I and 0. Let us use the factor names plank, width and depth. The factor plank has 20 levels, width has 3 levels and depth has 5 levels. For the $i$th measurement of humidity, $\text{plank}_i$ denotes the plank on which this measurement was performed. Correspondingly, $\text{width}_i$ and $\text{depth}_i$ denote the width and depth, respectively, of the $i$th measurement. It would be natural to include the interaction between width and depth, corresponding to the product factor width × depth. The product factor has in this case 15 levels.

A natural model would include plank as a block factor, while depth and width enter together with their interaction. If $Y_i$ denotes the humidity percentage corresponding to the $i$th measurement, the model with fixed block effect can be written as:

$$Y_i = \mu + \alpha(\text{width}_i) + \beta(\text{depth}_i) + \gamma(\text{width}_i, \text{depth}_i) + \delta(\text{plank}_i) + \varepsilon_i, \qquad (3\text{-}1)$$

where $i = 1, \ldots, 300$ and where the $\varepsilon_i$'s are independent and normally distributed random variables. Or similarly:

$$Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \delta_k + \varepsilon_{ijk}$$
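As a quick check of the factor structure described above, the balanced design can be laid out in R. This is only a sketch: the data frame `design` is built from scratch for illustration and is not the planks.txt data.

```r
# Hypothetical layout of the balanced design (not the actual planks.txt data)
design <- expand.grid(
  depth = factor(c(1, 3, 5, 7, 9)),
  width = factor(1:3),
  plank = factor(1:20)
)

nrow(design)                                      # 300 observations in total
nlevels(interaction(design$width, design$depth))  # 15 levels of the product factor
table(design$plank)                               # measurements per plank
```

Every plank appears with each of the 15 width-by-depth combinations exactly once, which is what makes plank a natural block factor here.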

[Figure 3.1: The factor structure diagram]

In the indexed form of the model, $Y_{ijk}$ is the $k$th measurement within the $(i, j)$th combination of the two factors, $i = 1, \ldots, 3$, $j = 1, \ldots, 5$ and $k = 1, \ldots, 20$. As pointed out in Module 1, the block (plank) effect should be considered as a random effect, leading to the mixed model:

$$Y_i = \mu + \alpha(\text{width}_i) + \beta(\text{depth}_i) + \gamma(\text{width}_i, \text{depth}_i) + d(\text{plank}_i) + \varepsilon_i, \qquad (3\text{-}2)$$

where $d(\text{plank}_i) \sim N(0, \sigma^2_{\text{Plank}})$ and $\varepsilon_i \sim N(0, \sigma^2)$. This model corresponds to the factor structure diagram given in Figure 3.1.

3.2 Initial explorative analysis

Having realized the complete structure of the data, it is time to do initial plotting/explorative analysis. Throughout this module, figures and results are presented without showing R code or raw R output. This can be seen as a standard for reports in the course! Typically, numerous figures not entering a final project report should be studied, since this phase is explorative, and final figures to present the key results are chosen after the statistical analyses are completed.

The plotting of various average profiles is usually a helpful tool for data with several factors. In Figure 3.2 four of these are presented. In the top left diagram the width humidity patterns for each plank are depicted by plotting the average humidity (taking the average of the five depths for each width and plank) against the widths. It is immediately clear that there is extensive plank-to-plank variation in the level of humidity. The message about the width effect is less clear. In the top right the similar plot for the depth effect is seen. Here the message is much clearer: the humidity is high in the center (depth=5) and low at the top (depth=1) and at the bottom (depth=9).

[Figure 3.2: Four average humidity profiles. Each panel shows the mean of humidity against width or depth.]

As pointed out, this is the effect seen when the three widths are averaged. It could be that the depth effect is different for widths close to the side of the plank (width=1) than for widths in the center (width=3). In other words, there could be a depth*width interaction effect that we would not find in the plots above. Instead, similar plots are given in the bottom diagrams of Figure 3.2 for the widths and depths by averaging over the planks (that is, plotting the 15 average values). The depth structure already seen is recognized. Also, it is seen that there is a clear shift in humidity level from width to width and that the depth humidity pattern seems to be roughly the same for the three widths. However, there are some deviations from parallel patterns, and the uncertainties in these deviations are not visible. A similar increasing-decreasing width pattern, which was not clearly visible in the top diagram, is now seen. This pattern seems to be roughly the same for all depths (with the same precautions as before), and the low humidity levels for the top and bottom depths are clearly seen. Note again that the two bottom plots contain the same information: had there been clearly non-parallel patterns in one figure (an interaction effect), this would also appear in the other figure.

The next step is to start the actual statistical analysis of the data.

3.3 Test of overall effects/model reduction

A statistical analysis of this kind is commonly carried out in several steps, starting with the basic model found from the factor structure considerations. This model usually contains every possible effect there may be in the data. However, it is of interest to simplify things into easily interpretable results, if possible! So the idea is to remove non-significant higher-order terms, such as interactions, from the model before summarizing the results. Carrying out the mixed model analysis corresponding to the model given by (3-2) gives the following ANOVA table of fixed effects:

[Table: columns Source of variation, Numerator degrees of freedom, Denominator degrees of freedom, F-statistic and P-value; rows depth, width and depth*width; values not reproduced.]

We see that the depth*width interaction effect is non-significant. Hence, we remove the interaction term and do the analysis based on the model:

$$Y_i = \mu + \alpha(\text{width}_i) + \beta(\text{depth}_i) + d(\text{plank}_i) + \varepsilon_i, \qquad (3\text{-}3)$$

where $d(\text{plank}_i) \sim N(0, \sigma^2_{\text{Plank}})$ and $\varepsilon_i \sim N(0, \sigma^2)$. This model is illustrated by the factor structure diagram in Figure 3.3. Note how the 8 degrees of freedom from the interaction effect have now been added to the error degrees of freedom. The table of fixed effects then becomes:

[Table: same columns as above; rows depth and width; the P-value for width is reported as <0.0001; remaining values not reproduced.]
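The degrees-of-freedom bookkeeping behind this note can be written out explicitly. The split below assumes the usual decomposition for a balanced design with 300 observations, 5 depths, 3 widths and 20 planks; the resulting 274 agrees with the error degrees of freedom quoted in the post hoc comparisons.

```latex
\begin{aligned}
\text{model (3-2):}\quad 299 &= \underbrace{4}_{\text{depth}} + \underbrace{2}_{\text{width}}
  + \underbrace{8}_{\text{depth*width}} + \underbrace{19}_{\text{plank}} + \underbrace{266}_{\text{error}} \\
\text{model (3-3):}\quad 299 &= \underbrace{4}_{\text{depth}} + \underbrace{2}_{\text{width}}
  + \underbrace{19}_{\text{plank}} + \underbrace{266 + 8}_{\text{error} \,=\, 274}
\end{aligned}
```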

[Figure 3.3: The factor structure diagram]

Note that the removal of the non-significant interaction effect only has minor effects on the conclusions regarding the depth and width effects: they are both extremely significant, confirming what we explored above. Since there are no more non-significant fixed effects, the model given by (3-3) is the final model to use for summarizing the results.

3.4 Post hoc analysis and summarizing the results

Estimates of the variance parameters

The final model is given by (3-3), since the main effects of both width and depth are clearly significant. The two variance parameters are estimated by $\hat\sigma^2_{\text{Planks}}$ and $\hat\sigma^2$ [values not reproduced]. Uncertainties of these estimates on the standard deviation scale are given as 95% profile likelihood confidence limits:

[Table: columns 2.5 % and 97.5 %; rows Planks and Residual; values not reproduced.]

The remaining part of this subsection on post hoc analysis and presentation of results illustrates how the information in factors can be summarized whenever the factor does not interact with any other factor.

Estimates of the fixed parameters

Estimates of the expected values (LSMEANS) for each level of depth, together with their uncertainties and 95% confidence intervals, are:

[Table: columns Estimate, SE, Lower and Upper; one row per depth level; values not reproduced.]

and correspondingly for each level of width:

[Table: columns Estimate, SE, Lower and Upper; one row per width level; values not reproduced.]

Comparisons of the fixed parameters

A commonly used post hoc analysis is to compare either specific pairs of depths (resp. widths) or all combinations within each factor. For the former, a standard t-test can be used, e.g.

$$t = \frac{\hat\beta(1) - \hat\beta(2)}{SE\big(\hat\beta(1) - \hat\beta(2)\big)}$$

using the error degrees of freedom (274). Or equivalently expressed by a 95% confidence interval:

$$\hat\beta(1) - \hat\beta(2) \pm t_{.975,274}\, SE\big(\hat\beta(1) - \hat\beta(2)\big)$$

In this case, the estimates of the fixed effects are raw averages of the data based on the same number of observations for each level (300/5 = 60 observations per depth level), so the standard error of the difference between two depth levels is given by

$$SE\big(\hat\beta(1) - \hat\beta(2)\big) = \sqrt{2\hat\sigma^2/60}$$

This means that two depth levels are claimed significantly different if they differ by more than

$$t_{.975,274}\sqrt{2\hat\sigma^2/60}$$

from each other. This is also called the 95% Least Significant Difference (LSD) value.

It would be tempting to do such tests for all combinations of levels within each factor. This is generally NOT an acceptable approach, since the probability of significance-by-chance becomes too large when many tests are performed simultaneously. This is called the multiplicity problem. With five depth levels there are 5 · 4/2 = 10 possible depth pairs to compare. Comparing two specific levels (decided before seeing the data) is not the same as comparing the smallest among five with the largest among five. In a case with no effects one would always expect the latter two to be more different by chance than the former.

There are numerous solutions to properly handle this problem, if all comparisons are indeed made. All of them amount to requiring differences to be larger than required by the usual t-test before they are claimed significant. One general idea, which can be used whenever numerous tests are performed simultaneously, is the Bonferroni correction: if k tests are performed simultaneously, then use level α/k in each test rather than α. For instance, if all depth levels are compared, standard pairwise t-test output can be used, but employing level 0.5% in each test rather than 5%, that is, only claiming those differences significant for which the usual P-value is less than 0.005. This method is known to be somewhat conservative, meaning that it may be too critical, or in other words: it may miss some actual differences.
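The LSD value and its Bonferroni-adjusted counterpart can be computed directly in R. The sketch below uses a made-up residual variance (`sigma2_hat` is a placeholder, not the estimate from the planks data), so the resulting numbers are illustrative only.

```r
# Placeholder residual variance; substitute the estimate from the fitted model
sigma2_hat  <- 0.5
n_per_level <- 60    # 300 observations / 5 depth levels
df_error    <- 274   # error degrees of freedom of the reduced model

# Standard error of a difference between two depth level averages
se_diff <- sqrt(2 * sigma2_hat / n_per_level)

# 95% least significant difference (unadjusted pairwise t-test)
lsd <- qt(0.975, df_error) * se_diff

# Bonferroni: k = 10 pairwise comparisons, each tested at level 0.05 / k
k          <- choose(5, 2)
alpha_bonf <- 0.05 / k
lsd_bonf   <- qt(1 - alpha_bonf / 2, df_error) * se_diff

c(LSD = lsd, Bonferroni = lsd_bonf)
```

The Bonferroni threshold comes out larger than the plain LSD, which is the sense in which the correction demands bigger differences before they are claimed significant.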
Another solution is to use a distribution other than the t-distribution when the comparisons are made. With the so-called Tukey-Kramer method, two depth levels would be claimed significantly different if they differ by more than

$$\nu_{.975,J,274}\sqrt{\hat\sigma^2/60}$$

from each other, where $J$ is the number of groups to be compared and $\nu_{.975,J,274}$ is the 97.5%-quantile of the so-called studentized range distribution with $J$ groups. This distribution takes into account that the two levels compared in a single test come from $J$ groups altogether. This distribution is, just like the t-distribution, tabulated or available in the computer. Note that if $J = 2$, then the studentized range distribution corresponds to the t-distribution:

$$\nu_{.975,2,274} = t_{.975,274}\sqrt{2}$$

The Tukey-adjusted results are:

[Table: columns Depth difference, Parameter, Estimate, SE, Lower, Upper and P-value; one row for each of the 10 differences β(1) − β(2), β(1) − β(3), β(1) − β(4), β(1) − β(5), β(2) − β(3), β(2) − β(4), β(2) − β(5), β(3) − β(4), β(3) − β(5), β(4) − β(5); values not reproduced.]

Note that since the P-values are corrected, that is, based on the more proper studentized range distribution, they can be used directly without any additional Bonferroni correction. Similarly for the width effect:

[Table: columns Width difference, Parameter, Estimate, SE, Lower, Upper and P-value; rows α(1) − α(2), α(1) − α(3) and α(2) − α(3); values not reproduced.]

Frequently, the key information of the two tables for each effect is summarized in a single table in which the lsmeans are ordered by size. The estimates are not reproduced here; only the letter subscripts are shown:

                Depth 9   Depth 1   Depth 7   Depth 3   Depth 5
    Estimate       a         a         b         bc        c
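The Tukey-Kramer threshold described above can be sketched in R with the base function `qtukey`. As before, `sigma2_hat` is a placeholder value, not the estimate from the planks data. Note that `qtukey` is parametrized by the overall coverage probability, so the 5%-level critical value is obtained as `qtukey(0.95, J, df)`; for J = 2 this equals √2 times the 97.5% t-quantile, which is the relation quoted above.

```r
sigma2_hat  <- 0.5   # placeholder residual variance (illustrative only)
n_per_level <- 60
df_error    <- 274
J           <- 5     # number of depth levels being compared

# Tukey-Kramer minimum significant difference at the 5% level
q_crit    <- qtukey(0.95, nmeans = J, df = df_error)
msd_tukey <- q_crit * sqrt(sigma2_hat / n_per_level)

# Unadjusted LSD for comparison
lsd <- qt(0.975, df_error) * sqrt(2 * sigma2_hat / n_per_level)

# With two groups the studentized range reduces to the t-distribution
# (qtukey is only accurate to about four decimals, hence the tolerance)
same_for_two_groups <- isTRUE(all.equal(
  qtukey(0.95, nmeans = 2, df = df_error),
  sqrt(2) * qt(0.975, df_error),
  tolerance = 1e-3
))

c(LSD = lsd, Tukey = msd_tukey)
```

With five groups the Tukey threshold exceeds the LSD but is typically smaller than the Bonferroni threshold for all ten pairs, which is why it is often preferred for all-pairs comparisons.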

The letter subscripts in the depth summary table express the 5% significance results of the 10 pairwise comparisons:

    Two depths sharing a subscript are NOT significantly different
    Two depths NOT sharing a subscript are significantly different

So the pattern already observed in Figure 3.2 can now be statistically confirmed: there is a clearly lower humidity close to the top and the bottom (and no difference between top and bottom). Also there is an indication that the center position has significantly higher humidity than the in-between positions (between which no difference is seen). For the width effect, the summary table becomes particularly simple, since all three differences are significant (estimates not reproduced):

                Width 3   Width 1   Width 2
    Estimate       a         b         c

For these data, a figure of the raw data, like one of the bottom plots of Figure 3.2, together with a statement of the lack of significant width*depth interaction and the two summary tables would probably suffice for most purposes. In later modules we will see how additional plots of the model expectations/details will provide informative figures for interpretation.

Other types of post hoc analysis (than the multiple comparison approach) may be employed, especially when quantitative information about the factor levels is available. In this case we know exactly the positions that correspond to the different widths and depths, and this could be used in the analysis. For instance, it could be investigated whether a quadratic function of the depths could be used to describe the humidity pattern. Apart from the nice direct functional interpretation of the dependence of humidity on depth, it could possibly provide more powerful tests for interaction effects. In fact this would still be a linear model, and could be handled by lmer from the lme4-package. We will return to such analyses in a later module. Non-linear models (using e.g. exponentials etc.) could also be an option in some cases, but then the model will no longer be a linear model, and additional theory and packages would be needed.

The summary approach above was based on the assumption of no interaction between width and depth, that is, the conclusions regarding widths hold for all the depths, and vice versa. Had there been a significant interaction, we would have to present, say, the depth effects for each of the three widths (and/or vice versa), since the significance tells us that these three conclusions will NOT be the same. In practice, we proceed as above, BUT for the combined width*depth factor with 15 levels rather than for each of them separately. We will see examples of this later.

One important step in the analysis given is missing: an investigation of the validity of the model assumptions! We return to this issue in Module 6, where we then finish the analysis of this data set on the humidity of beech wood planks.

3.5 R-TUTORIAL: Creating report ready tables and figures

Since reports without raw R code or raw R output are requested both in this course and in general, it is useful to be able to apply some of the tools given in R to create nice tables (and figures) for LaTeX and/or Microsoft Word-based report writing.

Plot devices

First of all, there are different device functions for saving plots in various formats. E.g., to save a plot as a pdf, write:

    pdf("myplanksinteractionplot.pdf")
    with(planks, interaction.plot(depth, width, humidity, col=2:4))
    dev.off()

Note that dev.off() lets R know that no further graphics commands will follow. It turns off the graphics device and saves the figure to the designated file. Or as a png (you choose the extension of the output file yourself, but it is clearly recommended to choose an extension that corresponds to the device function, here pdf or png):

    png("myplanksinteractionplot.png")
    with(planks, interaction.plot(depth, width, humidity, col=2:4))
    dev.off()

Similarly, there are bmp, jpeg and other device functions. Plots can also be exported directly from the Plots window in RStudio.
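A self-contained sketch of the same device workflow, using simulated data in place of planks.txt (the data frame `sim` and its values are made up for illustration). Writing to a temporary file and checking that it exists afterwards is a simple way to confirm that the device was closed properly.

```r
set.seed(1)
# Simulated stand-in for the planks data (values are not real measurements)
sim <- expand.grid(depth = c(1, 3, 5, 7, 9), width = 1:3, plank = 1:20)
sim$humidity <- 5 + rnorm(nrow(sim))

out_file <- tempfile(fileext = ".pdf")
pdf(out_file)        # open the pdf device
with(sim, interaction.plot(depth, width, humidity, col = 2:4))
dev.off()            # close the device; the file is complete only after this call

file.exists(out_file)
```

Forgetting dev.off() is a common reason for empty or unreadable output files, since plotting commands keep going to the still-open device.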

Plotting with colours

Colours can be specified in several different ways, and various plot functions may have various colour options for colouring different aspects of the plot. The simplest way to specify a colour is with a character string giving the colour name (e.g., "red"). A list of the possible colours can be obtained with the function colors; write:

    colors(distinct = FALSE)

to see all the possible choices. Have a look at this website to see what all these colours look like, or go to: the QuickR website. Even more easily, you can use integers as colour codes. As a default R uses a palette of 8 colours:

    palette()
    [1] "black"   "red"     "green3"  "blue"    "cyan"    "magenta" "yellow"
    [8] "gray"

which can then be referred to by the numbers 1-8. (This is the palette of R versions before 4.0.0; later versions use a different set of 8 default colours.) The codes cycle modulo 8, meaning that using 9 would give "black" again. There are a number of pre-defined palettes that can be used when a larger (and better) collection of colours is needed, e.g. the functions rainbow and hsv. For more, write:

    ?heat.colors

which then could be used e.g. as (plots not shown):

    par(mfrow=c(2,2))
    with(planks, {
      interaction.plot(width, plank, humidity, legend=FALSE, col=heat.colors(20))
      interaction.plot(depth, plank, humidity, legend=FALSE, col=terrain.colors(20))
      interaction.plot(width, depth, humidity, col=topo.colors(5))
      interaction.plot(depth, width, humidity, col=cm.colors(3))
    })
    par(mfrow=c(1,1))
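The recycling of integer colour codes mentioned above can be checked without plotting anything. This small sketch uses base R only; the helper `code_to_colour` is introduced here for illustration and is not part of R.

```r
# Map a positive integer colour code to the palette entry it recycles to
code_to_colour <- function(code, pal = palette()) {
  pal[(code - 1) %% length(pal) + 1]
}

code_to_colour(1)   # first palette entry
code_to_colour(9)   # same entry again when the palette has 8 colours
```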

Alternatively, with the rainbow palette:

    # The argument gives the number of colours wanted,
    # e.g. rainbow(10) gives 10 different colours and rainbow(5) gives 5 colours
    with(planks, interaction.plot(width, depth, humidity, col=rainbow(5)))

Or:

    with(planks, interaction.plot(width, depth, humidity, col=hsv(1:5/5)))

Report ready tables with xtable

Nice tables can be produced by the xtable function from the xtable-package. An example:

    library(xtable)
    means <- as.matrix(with(planks, tapply(humidity, width, mean)))
    xtable(means)
    % latex table generated in R by xtable package
    % Wed Sep 27 11:51:
    \begin{table}[ht]
    \centering
    \begin{tabular}{rr}
      \hline
     & x \\
      \hline
    1 & 5.51 \\
      2 & 5.79 \\
      3 & 5.10 \\
       \hline
    \end{tabular}
    \end{table}

When this tex-code is included in your tex-file, it will appear in the report as in the following table:

           x
    1   5.51
    2   5.79
    3   5.10

Note how the input to xtable was a matrix here. The function is prepared to recognize a number of different R-objects, see e.g.:

    methods(xtable)
     [1] xtable.anova*            xtable.aov*
     [3] xtable.aovlist*          xtable.coxph*
     [5] xtable.data.frame*       xtable.glm*
     [7] xtable.gmsar*            xtable.lagimpact*
     [9] xtable.lm*               xtable.matrix*
    [11] xtable.prcomp*           xtable.sarlm*
    [13] xtable.sarlm.pred*       xtable.spautolm*
    [15] xtable.sphet*            xtable.splm*
    [17] xtable.stsls*            xtable.summary.aov*
    [19] xtable.summary.aovlist*  xtable.summary.glm*
    [21] xtable.summary.gmsar*    xtable.summary.lm*
    [23] xtable.summary.prcomp*   xtable.summary.sarlm*
    [25] xtable.summary.spautolm* xtable.summary.sphet*
    [27] xtable.summary.splm*     xtable.summary.stsls*
    [29] xtable.table*            xtable.ts*
    [31] xtable.zoo*
    see ?methods for accessing help and source code

For instance, ANOVA tables will be recognized. So a LaTeX user can then copy these tex-lines into the .tex document of the report. Or, to integrate the R code into the LaTeX code, use the knitr R-package to create the pure tex-file from an .Rnw file, which is a kind of LaTeX file with all the R code integrated into it, with a lot of flexibility in controlling what will be shown/evaluated etc. in the output. This can be used for raw code, results, tables and figures alike.

A Microsoft Word user may also use xtable through the html print option:

    print(xtable(means), type = "html")
    <!-- html table generated in R by xtable package -->
    <!-- Wed Sep 27 11:51: >

    <table border=1>
    <tr> <th> </th> <th> x </th> </tr>
    <tr> <td align="right"> 1 </td> <td align="right"> 5.51 </td> </tr>
    <tr> <td align="right"> 2 </td> <td align="right"> 5.79 </td> </tr>
    <tr> <td align="right"> 3 </td> <td align="right"> 5.10 </td> </tr>
    </table>

And then print the table directly into a file:

    print(xtable(means), type = "html", file = "myhtmltable.html")

Open the file in a browser and copy-paste to Word.

3.6 R-TUTORIAL: Initial explorative analysis

The data set planks is imported as described in eNote 1. Assume that the data set is called planks in R. The plots in Figure 3.2 are produced using the function interaction.plot(), which requires three arguments: first the factor that is to be on the x-axis, then the factor that separates the data into distinct graphs, and finally the response variable. An optional parameter legend, which takes either FALSE or TRUE, specifies whether or not a legend should be added (relating the graphs to the factor levels). The code that produced this figure was:

    par(mar = c(3.5, 3.5, 1, 1),  # smaller margin on top and right
        mgp = c(2.4, 0.7, 0),     # position of axis labels, tick labels and axis
        las = 1)
    planks <- read.table("planks.txt", header = TRUE, sep = ",")
    ylim <- c(3, 9)               # common y-axis range for all four panels
    par(mfrow=c(2,2))
    with(planks, {
      interaction.plot(width, plank, humidity, ylim=ylim, legend=FALSE,
                       bty="n", col=2:11, xtick = TRUE)
      interaction.plot(depth, plank, humidity, ylim=ylim, legend=FALSE,
                       bty="n", col=2:11, xtick = TRUE)
      interaction.plot(width, depth, humidity, ylim=ylim,
                       bty="n", col=2:11, xtick = TRUE)
      interaction.plot(depth, width, humidity, ylim=ylim,
                       bty="n", col=2:11, xtick = TRUE)
    })
    par(mfrow=c(1,1))

Notice that the with(...) function around the interaction.plot statements results in evaluation of the statements within a frame where the variables of the data set planks are available. This approach avoids having to attach data sets.

The function par is used to set a variety of graphical parameters (try typing ?par for details). The parameter mfrow is a vector of length two, where the first component is the number of rows on the graphical device and the second component is the number of columns. To return to the default, use par(mfrow=c(1, 1)).

3.7 Test of overall effects/model reduction

In the previous section we did not need to define the variables as factors in R to use interaction.plot, but in the following we do. Configure the three variables depth, plank and width as factors:

    planks$plank <- factor(planks$plank)
    planks$depth <- factor(planks$depth)
    planks$width <- factor(planks$width)

Analysis of models including random effects can be done using the lmer function from the R-package lme4. The general model, with a fixed-effects structure consisting of the interaction between two factors and a random effect assigned to the plank, is specified as follows:

    require(lme4)
    model1 <- lmer(humidity ~ depth*width + (1 | plank), data = planks)

Notice that the fixed-effects structure can be specified either as depth + width + depth:width or, more briefly, as depth*width as used here; they give the same result. The relevant tests of the fixed-effects structure are obtained by applying anova(model1) after making sure the lmerTest-package is available:

    require(lmerTest)
    anova(model1)
    Analysis of Variance Table of type III with Satterthwaite
    approximation for degrees of freedom
                Sum Sq Mean Sq NumDF DenDF F.value    Pr(>F)
    depth                                          < 2.2e-16 ***
    width                                             [...]e-12 ***
    depth:width
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

[The remaining numeric entries of the output are not reproduced.]

lmerTest automatically loads lme4, so we could have just run require(lmerTest) from the beginning instead. Using Anova from the car-package we obtain:

    require(car)
    Anova(model1, test.statistic = "F", type = 3)
    Analysis of Deviance Table (Type III Wald F tests with Kenward-Roger df)

    Response: humidity
                F Df Df.res  Pr(>F)
    (Intercept)             < 2e-16 ***
    depth                   < 2e-16 ***
    width                           *
    depth:width
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The interaction is not significant and a reduced model can be formulated:

    model2 <- lmer(humidity ~ depth + width + (1 | plank), data = planks)
    anova(model2)
    Analysis of Variance Table of type III with Satterthwaite
    approximation for degrees of freedom
          Sum Sq Mean Sq NumDF DenDF F.value    Pr(>F)
    depth                                    < 2.2e-16 ***
    width                                       [...]e-12 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Both factors are highly significant and no further reduction is possible.

3.8 R-TUTORIAL: Post hoc analysis and summarizing the results

Estimates of the variance parameters are found with:

    VarCorr(model2)
    Groups   Name        Std.Dev.
    plank    (Intercept)
    Residual

Note that the estimates are given on the standard-deviation scale, not the variance scale. The so-called profile likelihood based confidence intervals for the two variance parameters are found with:

    m2prof <- profile(model2, which=1:2, signames=FALSE)
    confint(m2prof)
                         2.5 % 97.5 %
    sd_(Intercept)|plank
    sigma

The profile function by default profiles the likelihood for all model parameters, but since profiling is time-consuming and since we are only interested in the profile likelihood confidence intervals for the two variance parameters, we set the which=1:2 option.

As in eNote 1 we can use lsmeans to compute the estimated mean levels and their differences:

    require(lsmeans)
    lsmeans::lsmeans(model2, ~ depth)
     depth lsmean SE df lower.CL upper.CL
     [rows not reproduced]

    Results are averaged over the levels of: width
    Degrees-of-freedom method: satterthwaite
    Confidence level used: 0.95

    lsmeans::lsmeans(model2, pairwise ~ width)
    $lsmeans
     width lsmean SE df lower.CL upper.CL
     [rows not reproduced]

    Results are averaged over the levels of: depth
    Degrees-of-freedom method: satterthwaite
    Confidence level used: 0.95

    $contrasts
     contrast estimate SE df t.ratio p.value
     [rows not reproduced]

    Results are averaged over the levels of: depth
    P value adjustment: tukey method for comparing a family of 3 estimates

Observe that writing pairwise ~ generates all pairwise differences of the expected (LS) means.
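The lsmeans package has since been superseded by emmeans, which accepts the same formula interface. The sketch below is an assumption about how the calls above translate, not part of the original analysis: it uses simulated data (the data frame `sim` and all its values are made up) and only runs if lme4 and emmeans are installed.

```r
if (requireNamespace("lme4", quietly = TRUE) &&
    requireNamespace("emmeans", quietly = TRUE)) {
  set.seed(1)
  # Simulated balanced design with a random plank effect (not the real data)
  sim <- expand.grid(depth = factor(c(1, 3, 5, 7, 9)),
                     width = factor(1:3),
                     plank = factor(1:20))
  plank_effect <- rnorm(20, sd = 0.8)
  sim$humidity <- 5 + plank_effect[as.integer(sim$plank)] +
    rnorm(nrow(sim), sd = 0.4)

  fit <- lme4::lmer(humidity ~ depth + width + (1 | plank), data = sim)

  # Estimated marginal means and Tukey-adjusted pairwise differences for width
  em <- emmeans::emmeans(fit, pairwise ~ width)
  print(em$emmeans)
  print(em$contrasts)
}
```

The degrees-of-freedom method reported by emmeans depends on which helper packages are installed, so the df column may differ from the Satterthwaite values shown above.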

The multcomp package also includes the so-called compact letter displays:

    require(multcomp)
    tuk2 <- glht(model2, linfct = mcp(depth = "Tukey"))
    tuk.cld2 <- cld(tuk2)
    tuk.cld2   # Display the CLD
     "a" "bc"  "c"  "b"  "a"

    # Plot the compact letter display:
    old.par <- par(no.readonly=TRUE)  # Save current graphics parameters
    par(mai=c(1,1,1.25,1))            # Use sufficiently large upper margin
    plot(tuk.cld2, col=2:6)

[Figure: compact letter display plot of the linear predictor against depth, with the letters a, bc, c, b, a shown above the five depth levels.]

    par(old.par)   # reset graphics parameters

The lmerTest-package has a rand function which produces an ANOVA-like table of χ²-tests of the random effects in a mixed model:

    rand(model2)
    Analysis of Random effects Table:
          Chi.sq Chi.DF p.value
    plank                <2e-16 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Exercises

Exercise 1 Colour of spinach

Spinach heated to 90 or 100 degrees Celsius was vacuum packed and stored for 0, 1 or 2 weeks before the packs were opened and chill stored in normal atmosphere for 0, 1 or 2 days. Then the colour was measured on a Hunter Lab. Two of the colour coordinates, a and b (measuring respectively something like red and yellow colour), were recorded and are given in the data set below. The variable batch is a blocking variable referring to two batches of spinach. The data is available in the file spinage.txt and listed here:

[Data listing with columns Batch, temp, weeks, days, a and b, for batches A and B; values not reproduced.]

a) Write down all the factors relevant for the analysis, and their levels and mutual structure. Are they crossed or nested, for example? Make the factor structure diagram.

b) Analyse the effect of the different factors on the two colour measurements and summarize the significant effects (using lsmeans etc.).

Exercise 2 Sensory evaluation of spinach

In the spinach experiment from Exercise 1, sensory evaluations were performed in addition to the colour measurements. The treatments were still the same, so the factors were heating temperature, original storage (weeks), storage after opening (days), and batch. The products from each treatment combination from each batch were assessed by (some of) 7 assessors who gave a score (between 0 and 15) for each of 6 different sensory properties (listed below). There was one session for each combination of batch and weeks, and at each session the assessors evaluated the same 6 products (6 combinations of days and temperature). Note that not all assessors were present at all sessions.

The results, with one line per evaluation, are given in the order: weeks of storage, days after opening, batch, temperature, session number, assessor number, and the six sensory properties hay flavour 1, hay flavour 2, hay taste, spinach flavour 1, spinach flavour 2, spinach taste. The data is available in the file spinagesens.txt and listed partly below:

[Partial data listing, 252 lines in total; values not reproduced.]

a) Write down the factors relevant for the analysis, and their levels and mutual structure. [You should include a production factor corresponding to the combinations of temperature, weeks, days, and batch.]

b) Specify which effects you want to include in the model. Pay particular attention to which interactions you want in the model. [Include at least some of the interactions between assessor and treatment factors.] Which effects are random and which are fixed?

c) Perform the analysis for one of the sensory properties and draw conclusions.


More information

General Factorial Models

General Factorial Models In Chapter 8 in Oehlert STAT:5201 Week 9 - Lecture 2 1 / 34 It is possible to have many factors in a factorial experiment. In DDD we saw an example of a 3-factor study with ball size, height, and surface

More information

Organizing data in R. Fitting Mixed-Effects Models Using the lme4 Package in R. R packages. Accessing documentation. The Dyestuff data set

Organizing data in R. Fitting Mixed-Effects Models Using the lme4 Package in R. R packages. Accessing documentation. The Dyestuff data set Fitting Mixed-Effects Models Using the lme4 Package in R Deepayan Sarkar Fred Hutchinson Cancer Research Center 18 September 2008 Organizing data in R Standard rectangular data sets (columns are variables,

More information

Model Selection and Inference

Model Selection and Inference Model Selection and Inference Merlise Clyde January 29, 2017 Last Class Model for brain weight as a function of body weight In the model with both response and predictor log transformed, are dinosaurs

More information

Statistical Bioinformatics (Biomedical Big Data) Notes 2: Installing and Using R

Statistical Bioinformatics (Biomedical Big Data) Notes 2: Installing and Using R Statistical Bioinformatics (Biomedical Big Data) Notes 2: Installing and Using R In this course we will be using R (for Windows) for most of our work. These notes are to help students install R and then

More information

Resources for statistical assistance. Quantitative covariates and regression analysis. Methods for predicting continuous outcomes.

Resources for statistical assistance. Quantitative covariates and regression analysis. Methods for predicting continuous outcomes. Resources for statistical assistance Quantitative covariates and regression analysis Carolyn Taylor Applied Statistics and Data Science Group (ASDa) Department of Statistics, UBC January 24, 2017 Department

More information

Chapter 3. Bootstrap. 3.1 Introduction. 3.2 The general idea

Chapter 3. Bootstrap. 3.1 Introduction. 3.2 The general idea Chapter 3 Bootstrap 3.1 Introduction The estimation of parameters in probability distributions is a basic problem in statistics that one tends to encounter already during the very first course on the subject.

More information

General Factorial Models

General Factorial Models In Chapter 8 in Oehlert STAT:5201 Week 9 - Lecture 1 1 / 31 It is possible to have many factors in a factorial experiment. We saw some three-way factorials earlier in the DDD book (HW 1 with 3 factors:

More information

Example 5.25: (page 228) Screenshots from JMP. These examples assume post-hoc analysis using a Protected LSD or Protected Welch strategy.

Example 5.25: (page 228) Screenshots from JMP. These examples assume post-hoc analysis using a Protected LSD or Protected Welch strategy. JMP Output from Chapter 5 Factorial Analysis through JMP Example 5.25: (page 228) Screenshots from JMP. These examples assume post-hoc analysis using a Protected LSD or Protected Welch strategy. Fitting

More information

Regression Analysis and Linear Regression Models

Regression Analysis and Linear Regression Models Regression Analysis and Linear Regression Models University of Trento - FBK 2 March, 2015 (UNITN-FBK) Regression Analysis and Linear Regression Models 2 March, 2015 1 / 33 Relationship between numerical

More information

The linear mixed model: modeling hierarchical and longitudinal data

The linear mixed model: modeling hierarchical and longitudinal data The linear mixed model: modeling hierarchical and longitudinal data Analysis of Experimental Data AED The linear mixed model: modeling hierarchical and longitudinal data 1 of 44 Contents 1 Modeling Hierarchical

More information

Demo yeast mutant analysis

Demo yeast mutant analysis Demo yeast mutant analysis Jean-Yves Sgro February 20, 2018 Contents 1 Analysis of yeast growth data 1 1.1 Set working directory........................................ 1 1.2 List all files in directory.......................................

More information

Tutorial for the SensMixed application

Tutorial for the SensMixed application Tutorial for the SensMixed application Alexandra Kuznetsova 1. The SensMixed package - an overview The SensMixed package is an R package for analysing Sensory and Consumer data in a mixed model framework

More information

NCSS Statistical Software

NCSS Statistical Software Chapter 245 Introduction This procedure generates R control charts for variables. The format of the control charts is fully customizable. The data for the subgroups can be in a single column or in multiple

More information

Pair-Wise Multiple Comparisons (Simulation)

Pair-Wise Multiple Comparisons (Simulation) Chapter 580 Pair-Wise Multiple Comparisons (Simulation) Introduction This procedure uses simulation analyze the power and significance level of three pair-wise multiple-comparison procedures: Tukey-Kramer,

More information

For our example, we will look at the following factors and factor levels.

For our example, we will look at the following factors and factor levels. In order to review the calculations that are used to generate the Analysis of Variance, we will use the statapult example. By adjusting various settings on the statapult, you are able to throw the ball

More information

Yelp Star Rating System Reviewed: Are Star Ratings inline with textual reviews?

Yelp Star Rating System Reviewed: Are Star Ratings inline with textual reviews? Yelp Star Rating System Reviewed: Are Star Ratings inline with textual reviews? Eduardo Magalhaes Barbosa 17 de novembro de 2015 1 Introduction Star classification features are ubiquitous in apps world,

More information

A Knitr Demo. Charles J. Geyer. February 8, 2017

A Knitr Demo. Charles J. Geyer. February 8, 2017 A Knitr Demo Charles J. Geyer February 8, 2017 1 Licence This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License http://creativecommons.org/licenses/by-sa/4.0/.

More information

The theory of the linear model 41. Theorem 2.5. Under the strong assumptions A3 and A5 and the hypothesis that

The theory of the linear model 41. Theorem 2.5. Under the strong assumptions A3 and A5 and the hypothesis that The theory of the linear model 41 Theorem 2.5. Under the strong assumptions A3 and A5 and the hypothesis that E(Y X) =X 0 b 0 0 the F-test statistic follows an F-distribution with (p p 0, n p) degrees

More information

Chemical Reaction dataset ( https://stat.wvu.edu/~cjelsema/data/chemicalreaction.txt )

Chemical Reaction dataset ( https://stat.wvu.edu/~cjelsema/data/chemicalreaction.txt ) JMP Output from Chapter 9 Factorial Analysis through JMP Chemical Reaction dataset ( https://stat.wvu.edu/~cjelsema/data/chemicalreaction.txt ) Fitting the Model and checking conditions Analyze > Fit Model

More information

Tutorial for the SensMixed application

Tutorial for the SensMixed application Tutorial for the SensMixed application Alexandra Kuznetsova, Per Bruun Brockhoff 1. The SensMixed package - an overview The SensMixed package is an R package for analysing Sensory and Consumer data in

More information

Section 4 General Factorial Tutorials

Section 4 General Factorial Tutorials Section 4 General Factorial Tutorials General Factorial Part One: Categorical Introduction Design-Ease software version 6 offers a General Factorial option on the Factorial tab. If you completed the One

More information

8. MINITAB COMMANDS WEEK-BY-WEEK

8. MINITAB COMMANDS WEEK-BY-WEEK 8. MINITAB COMMANDS WEEK-BY-WEEK In this section of the Study Guide, we give brief information about the Minitab commands that are needed to apply the statistical methods in each week s study. They are

More information

We have seen that as n increases, the length of our confidence interval decreases, the confidence interval will be more narrow.

We have seen that as n increases, the length of our confidence interval decreases, the confidence interval will be more narrow. {Confidence Intervals for Population Means} Now we will discuss a few loose ends. Before moving into our final discussion of confidence intervals for one population mean, let s review a few important results

More information

Multi-Factored Experiments

Multi-Factored Experiments Design and Analysis of Multi-Factored Experiments Advanced Designs -Hard to Change Factors- Split-Plot Design and Analysis L. M. Lye DOE Course 1 Hard-to-Change Factors Assume that a factor can be varied,

More information

lme for SAS PROC MIXED Users

lme for SAS PROC MIXED Users lme for SAS PROC MIXED Users Douglas M. Bates Department of Statistics University of Wisconsin Madison José C. Pinheiro Bell Laboratories Lucent Technologies 1 Introduction The lme function from the nlme

More information

36-402/608 HW #1 Solutions 1/21/2010

36-402/608 HW #1 Solutions 1/21/2010 36-402/608 HW #1 Solutions 1/21/2010 1. t-test (20 points) Use fullbumpus.r to set up the data from fullbumpus.txt (both at Blackboard/Assignments). For this problem, analyze the full dataset together

More information

A (very) brief introduction to R

A (very) brief introduction to R A (very) brief introduction to R You typically start R at the command line prompt in a command line interface (CLI) mode. It is not a graphical user interface (GUI) although there are some efforts to produce

More information

Package simr. April 30, 2018

Package simr. April 30, 2018 Type Package Package simr April 30, 2018 Title Power Analysis for Generalised Linear Mixed Models by Simulation Calculate power for generalised linear mixed models, using simulation. Designed to work with

More information

Table of Contents (As covered from textbook)

Table of Contents (As covered from textbook) Table of Contents (As covered from textbook) Ch 1 Data and Decisions Ch 2 Displaying and Describing Categorical Data Ch 3 Displaying and Describing Quantitative Data Ch 4 Correlation and Linear Regression

More information

pairwise.t.test(dataset$measurement, dataset$group, p.adj = bonferroni ) TukeyHSD(aov(dataset$measurement~dataset$group))

pairwise.t.test(dataset$measurement, dataset$group, p.adj = bonferroni ) TukeyHSD(aov(dataset$measurement~dataset$group)) Tutorial 9: Comparing Three or More Groups One-way (single-factor) ANOVA (analysis of variance) Used to compare means of 3 or more groups based on a single explanatory (independent) variable, or factor.

More information

Lab #9: ANOVA and TUKEY tests

Lab #9: ANOVA and TUKEY tests Lab #9: ANOVA and TUKEY tests Objectives: 1. Column manipulation in SAS 2. Analysis of variance 3. Tukey test 4. Least Significant Difference test 5. Analysis of variance with PROC GLM 6. Levene test for

More information

BIO 360: Vertebrate Physiology Lab 9: Graphing in Excel. Lab 9: Graphing: how, why, when, and what does it mean? Due 3/26

BIO 360: Vertebrate Physiology Lab 9: Graphing in Excel. Lab 9: Graphing: how, why, when, and what does it mean? Due 3/26 Lab 9: Graphing: how, why, when, and what does it mean? Due 3/26 INTRODUCTION Graphs are one of the most important aspects of data analysis and presentation of your of data. They are visual representations

More information

Journal of Statistical Software

Journal of Statistical Software JSS Journal of Statistical Software December 2017, Volume 82, Issue 13. doi: 10.18637/jss.v082.i13 lmertest Package: Tests in Linear Mixed Effects Models Alexandra Kuznetsova Technical University of Denmark

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression Rebecca C. Steorts, Duke University STA 325, Chapter 3 ISL 1 / 49 Agenda How to extend beyond a SLR Multiple Linear Regression (MLR) Relationship Between the Response and Predictors

More information

Stat 411/511 MULTIPLE COMPARISONS. Charlotte Wickham. stat511.cwick.co.nz. Nov

Stat 411/511 MULTIPLE COMPARISONS. Charlotte Wickham. stat511.cwick.co.nz. Nov Stat 411/511 MULTIPLE COMPARISONS Nov 16 2015 Charlotte Wickham stat511.cwick.co.nz Thanksgiving week No lab material next week 11/24 & 11/25. Labs as usual this week. Lectures as usual Mon & Weds next

More information

BART STAT8810, Fall 2017

BART STAT8810, Fall 2017 BART STAT8810, Fall 2017 M.T. Pratola November 1, 2017 Today BART: Bayesian Additive Regression Trees BART: Bayesian Additive Regression Trees Additive model generalizes the single-tree regression model:

More information

We show that the composite function h, h(x) = g(f(x)) is a reduction h: A m C.

We show that the composite function h, h(x) = g(f(x)) is a reduction h: A m C. 219 Lemma J For all languages A, B, C the following hold i. A m A, (reflexive) ii. if A m B and B m C, then A m C, (transitive) iii. if A m B and B is Turing-recognizable, then so is A, and iv. if A m

More information

Descriptive Statistics, Standard Deviation and Standard Error

Descriptive Statistics, Standard Deviation and Standard Error AP Biology Calculations: Descriptive Statistics, Standard Deviation and Standard Error SBI4UP The Scientific Method & Experimental Design Scientific method is used to explore observations and answer questions.

More information

Logical operators: R provides an extensive list of logical operators. These include

Logical operators: R provides an extensive list of logical operators. These include meat.r: Explanation of code Goals of code: Analyzing a subset of data Creating data frames with specified X values Calculating confidence and prediction intervals Lists and matrices Only printing a few

More information

Lecture 13: Model selection and regularization

Lecture 13: Model selection and regularization Lecture 13: Model selection and regularization Reading: Sections 6.1-6.2.1 STATS 202: Data mining and analysis October 23, 2017 1 / 17 What do we know so far In linear regression, adding predictors always

More information

E-Campus Inferential Statistics - Part 2

E-Campus Inferential Statistics - Part 2 E-Campus Inferential Statistics - Part 2 Group Members: James Jones Question 4-Isthere a significant difference in the mean prices of the stores? New Textbook Prices New Price Descriptives 95% Confidence

More information

Stat 8053, Fall 2013: Additive Models

Stat 8053, Fall 2013: Additive Models Stat 853, Fall 213: Additive Models We will only use the package mgcv for fitting additive and later generalized additive models. The best reference is S. N. Wood (26), Generalized Additive Models, An

More information

EXST 7014, Lab 1: Review of R Programming Basics and Simple Linear Regression

EXST 7014, Lab 1: Review of R Programming Basics and Simple Linear Regression EXST 7014, Lab 1: Review of R Programming Basics and Simple Linear Regression OBJECTIVES 1. Prepare a scatter plot of the dependent variable on the independent variable 2. Do a simple linear regression

More information

An Experiment in Visual Clustering Using Star Glyph Displays

An Experiment in Visual Clustering Using Star Glyph Displays An Experiment in Visual Clustering Using Star Glyph Displays by Hanna Kazhamiaka A Research Paper presented to the University of Waterloo in partial fulfillment of the requirements for the degree of Master

More information

Week 5: Multiple Linear Regression II

Week 5: Multiple Linear Regression II Week 5: Multiple Linear Regression II Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 c 2017 PERRAILLON ARR 1 Outline Adjusted R

More information

Introduction to Mixed Models: Multivariate Regression

Introduction to Mixed Models: Multivariate Regression Introduction to Mixed Models: Multivariate Regression EPSY 905: Multivariate Analysis Spring 2016 Lecture #9 March 30, 2016 EPSY 905: Multivariate Regression via Path Analysis Today s Lecture Multivariate

More information

Cpk: What is its Capability? By: Rick Haynes, Master Black Belt Smarter Solutions, Inc.

Cpk: What is its Capability? By: Rick Haynes, Master Black Belt Smarter Solutions, Inc. C: What is its Capability? By: Rick Haynes, Master Black Belt Smarter Solutions, Inc. C is one of many capability metrics that are available. When capability metrics are used, organizations typically provide

More information

STATS PAD USER MANUAL

STATS PAD USER MANUAL STATS PAD USER MANUAL For Version 2.0 Manual Version 2.0 1 Table of Contents Basic Navigation! 3 Settings! 7 Entering Data! 7 Sharing Data! 8 Managing Files! 10 Running Tests! 11 Interpreting Output! 11

More information

AA BB CC DD EE. Introduction to Graphics in R

AA BB CC DD EE. Introduction to Graphics in R Introduction to Graphics in R Cori Mar 7/10/18 ### Reading in the data dat

More information

Chapter 6: DESCRIPTIVE STATISTICS

Chapter 6: DESCRIPTIVE STATISTICS Chapter 6: DESCRIPTIVE STATISTICS Random Sampling Numerical Summaries Stem-n-Leaf plots Histograms, and Box plots Time Sequence Plots Normal Probability Plots Sections 6-1 to 6-5, and 6-7 Random Sampling

More information

Regression on SAT Scores of 374 High Schools and K-means on Clustering Schools

Regression on SAT Scores of 374 High Schools and K-means on Clustering Schools Regression on SAT Scores of 374 High Schools and K-means on Clustering Schools Abstract In this project, we study 374 public high schools in New York City. The project seeks to use regression techniques

More information

D-Optimal Designs. Chapter 888. Introduction. D-Optimal Design Overview

D-Optimal Designs. Chapter 888. Introduction. D-Optimal Design Overview Chapter 888 Introduction This procedure generates D-optimal designs for multi-factor experiments with both quantitative and qualitative factors. The factors can have a mixed number of levels. For example,

More information

Modelling Proportions and Count Data

Modelling Proportions and Count Data Modelling Proportions and Count Data Rick White May 4, 2016 Outline Analysis of Count Data Binary Data Analysis Categorical Data Analysis Generalized Linear Models Questions Types of Data Continuous data:

More information

Experiment 1 CH Fall 2004 INTRODUCTION TO SPREADSHEETS

Experiment 1 CH Fall 2004 INTRODUCTION TO SPREADSHEETS Experiment 1 CH 222 - Fall 2004 INTRODUCTION TO SPREADSHEETS Introduction Spreadsheets are valuable tools utilized in a variety of fields. They can be used for tasks as simple as adding or subtracting

More information

Quick Start with CASSY Lab. Bi-05-05

Quick Start with CASSY Lab. Bi-05-05 Quick Start with CASSY Lab Bi-05-05 About this manual This manual helps you getting started with the CASSY system. The manual does provide you the information you need to start quickly a simple CASSY experiment

More information

Meet MINITAB. Student Release 14. for Windows

Meet MINITAB. Student Release 14. for Windows Meet MINITAB Student Release 14 for Windows 2003, 2004 by Minitab Inc. All rights reserved. MINITAB and the MINITAB logo are registered trademarks of Minitab Inc. All other marks referenced remain the

More information

Modelling Proportions and Count Data

Modelling Proportions and Count Data Modelling Proportions and Count Data Rick White May 5, 2015 Outline Analysis of Count Data Binary Data Analysis Categorical Data Analysis Generalized Linear Models Questions Types of Data Continuous data:

More information

Introduction. About this Document. What is SPSS. ohow to get SPSS. oopening Data

Introduction. About this Document. What is SPSS. ohow to get SPSS. oopening Data Introduction About this Document This manual was written by members of the Statistical Consulting Program as an introduction to SPSS 12.0. It is designed to assist new users in familiarizing themselves

More information

2010 by Minitab, Inc. All rights reserved. Release Minitab, the Minitab logo, Quality Companion by Minitab and Quality Trainer by Minitab are

2010 by Minitab, Inc. All rights reserved. Release Minitab, the Minitab logo, Quality Companion by Minitab and Quality Trainer by Minitab are 2010 by Minitab, Inc. All rights reserved. Release 16.1.0 Minitab, the Minitab logo, Quality Companion by Minitab and Quality Trainer by Minitab are registered trademarks of Minitab, Inc. in the United

More information

STATISTICS FOR PSYCHOLOGISTS

STATISTICS FOR PSYCHOLOGISTS STATISTICS FOR PSYCHOLOGISTS SECTION: JAMOVI CHAPTER: USING THE SOFTWARE Section Abstract: This section provides step-by-step instructions on how to obtain basic statistical output using JAMOVI, both visually

More information

Prepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order.

Prepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order. Chapter 2 2.1 Descriptive Statistics A stem-and-leaf graph, also called a stemplot, allows for a nice overview of quantitative data without losing information on individual observations. It can be a good

More information

Regression Lab 1. The data set cholesterol.txt available on your thumb drive contains the following variables:

Regression Lab 1. The data set cholesterol.txt available on your thumb drive contains the following variables: Regression Lab The data set cholesterol.txt available on your thumb drive contains the following variables: Field Descriptions ID: Subject ID sex: Sex: 0 = male, = female age: Age in years chol: Serum

More information

Multiple Comparisons of Treatments vs. a Control (Simulation)

Multiple Comparisons of Treatments vs. a Control (Simulation) Chapter 585 Multiple Comparisons of Treatments vs. a Control (Simulation) Introduction This procedure uses simulation to analyze the power and significance level of two multiple-comparison procedures that

More information

SPSS INSTRUCTION CHAPTER 9

SPSS INSTRUCTION CHAPTER 9 SPSS INSTRUCTION CHAPTER 9 Chapter 9 does no more than introduce the repeated-measures ANOVA, the MANOVA, and the ANCOVA, and discriminant analysis. But, you can likely envision how complicated it can

More information

HydroOffice Diagrams

HydroOffice Diagrams Hydro Office Software for Water Sciences HydroOffice Diagrams User Manual for Ternary 1.0, Piper 2.0 and Durov 1.0 tool HydroOffice.org Citation: Gregor M. 2013. HydroOffice Diagrams user manual for Ternary1.0,

More information

Fractional. Design of Experiments. Overview. Scenario

Fractional. Design of Experiments. Overview. Scenario Design of Experiments Overview We are going to learn about DOEs. Specifically, you ll learn what a DOE is, as well as, what a key concept known as Confounding is all about. Finally, you ll learn what the

More information

Inference in mixed models in R - beyond the usual asymptotic likelihood ratio test

Inference in mixed models in R - beyond the usual asymptotic likelihood ratio test 1 / 42 Inference in mixed models in R - beyond the usual asymptotic likelihood ratio test Søren Højsgaard 1 Ulrich Halekoh 2 1 Department of Mathematical Sciences Aalborg University, Denmark sorenh@math.aau.dk

More information

Multiple Linear Regression: Global tests and Multiple Testing

Multiple Linear Regression: Global tests and Multiple Testing Multiple Linear Regression: Global tests and Multiple Testing Author: Nicholas G Reich, Jeff Goldsmith This material is part of the statsteachr project Made available under the Creative Commons Attribution-ShareAlike

More information

Exercise: Graphing and Least Squares Fitting in Quattro Pro

Exercise: Graphing and Least Squares Fitting in Quattro Pro Chapter 5 Exercise: Graphing and Least Squares Fitting in Quattro Pro 5.1 Purpose The purpose of this experiment is to become familiar with using Quattro Pro to produce graphs and analyze graphical data.

More information

SAS data statements and data: /*Factor A: angle Factor B: geometry Factor C: speed*/

SAS data statements and data: /*Factor A: angle Factor B: geometry Factor C: speed*/ STAT:5201 Applied Statistic II (Factorial with 3 factors as 2 3 design) Three-way ANOVA (Factorial with three factors) with replication Factor A: angle (low=0/high=1) Factor B: geometry (shape A=0/shape

More information

A Multiple-Line Fitting Algorithm Without Initialization Yan Guo

A Multiple-Line Fitting Algorithm Without Initialization Yan Guo A Multiple-Line Fitting Algorithm Without Initialization Yan Guo Abstract: The commonest way to fit multiple lines is to use methods incorporate the EM algorithm. However, the EM algorithm dose not guarantee

More information

Source df SS MS F A a-1 [A] [T] SS A. / MS S/A S/A (a)(n-1) [AS] [A] SS S/A. / MS BxS/A A x B (a-1)(b-1) [AB] [A] [B] + [T] SS AxB

Source df SS MS F A a-1 [A] [T] SS A. / MS S/A S/A (a)(n-1) [AS] [A] SS S/A. / MS BxS/A A x B (a-1)(b-1) [AB] [A] [B] + [T] SS AxB Keppel, G. Design and Analysis: Chapter 17: The Mixed Two-Factor Within-Subjects Design: The Overall Analysis and the Analysis of Main Effects and Simple Effects Keppel describes an Ax(BxS) design, which

More information

THE GROUP OF SYMMETRIES OF THE TOWER OF HANOI GRAPH

THE GROUP OF SYMMETRIES OF THE TOWER OF HANOI GRAPH THE GROUP OF SYMMETRIES OF THE TOWER OF HANOI GRAPH SOEUN PARK arxiv:0809.1179v1 [math.co] 7 Sep 2008 Abstract. The Tower of Hanoi problem with k pegs and n disks has been much studied via its associated

More information

Things you ll know (or know better to watch out for!) when you leave in December: 1. What you can and cannot infer from graphs.

Things you ll know (or know better to watch out for!) when you leave in December: 1. What you can and cannot infer from graphs. 1 2 Things you ll know (or know better to watch out for!) when you leave in December: 1. What you can and cannot infer from graphs. 2. How to construct (in your head!) and interpret confidence intervals.

More information

An introduction to SPSS

An introduction to SPSS An introduction to SPSS To open the SPSS software using U of Iowa Virtual Desktop... Go to https://virtualdesktop.uiowa.edu and choose SPSS 24. Contents NOTE: Save data files in a drive that is accessible

More information

Correctly Compute Complex Samples Statistics

Correctly Compute Complex Samples Statistics SPSS Complex Samples 15.0 Specifications Correctly Compute Complex Samples Statistics When you conduct sample surveys, use a statistics package dedicated to producing correct estimates for complex sample

More information

MODELING FOR RESIDUAL STRESS, SURFACE ROUGHNESS AND TOOL WEAR USING AN ADAPTIVE NEURO FUZZY INFERENCE SYSTEM

MODELING FOR RESIDUAL STRESS, SURFACE ROUGHNESS AND TOOL WEAR USING AN ADAPTIVE NEURO FUZZY INFERENCE SYSTEM CHAPTER-7 MODELING FOR RESIDUAL STRESS, SURFACE ROUGHNESS AND TOOL WEAR USING AN ADAPTIVE NEURO FUZZY INFERENCE SYSTEM 7.1 Introduction To improve the overall efficiency of turning, it is necessary to

More information

Introduction to Excel Workshop

Introduction to Excel Workshop Introduction to Excel Workshop Empirical Reasoning Center September 9, 2016 1 Important Terminology 1. Rows are identified by numbers. 2. Columns are identified by letters. 3. Cells are identified by the

More information

Additional Issues: Random effects diagnostics, multiple comparisons

Additional Issues: Random effects diagnostics, multiple comparisons : Random diagnostics, multiple Austin F. Frank, T. Florian April 30, 2009 The dative dataset Original analysis in Bresnan et al (2007) Data obtained from languager (Baayen 2008) Data describing the realization

More information

Section 3.2: Multiple Linear Regression II. Jared S. Murray The University of Texas at Austin McCombs School of Business

Section 3.2: Multiple Linear Regression II. Jared S. Murray The University of Texas at Austin McCombs School of Business Section 3.2: Multiple Linear Regression II Jared S. Murray The University of Texas at Austin McCombs School of Business 1 Multiple Linear Regression: Inference and Understanding We can answer new questions

More information

BIOMETRICS INFORMATION

BIOMETRICS INFORMATION BIOMETRICS INFORMATION (You re 95% likely to need this information) PAMPHLET NO. # 57 DATE: September 5, 1997 SUBJECT: Interpreting Main Effects when a Two-way Interaction is Present Interpreting the analysis

More information

Factorial ANOVA. Skipping... Page 1 of 18

Factorial ANOVA. Skipping... Page 1 of 18 Factorial ANOVA The potato data: Batches of potatoes randomly assigned to to be stored at either cool or warm temperature, infected with one of three bacterial types. Then wait a set period. The dependent

More information

Learner Expectations UNIT 1: GRAPICAL AND NUMERIC REPRESENTATIONS OF DATA. Sept. Fathom Lab: Distributions and Best Methods of Display

Learner Expectations UNIT 1: GRAPICAL AND NUMERIC REPRESENTATIONS OF DATA. Sept. Fathom Lab: Distributions and Best Methods of Display CURRICULUM MAP TEMPLATE Priority Standards = Approximately 70% Supporting Standards = Approximately 20% Additional Standards = Approximately 10% HONORS PROBABILITY AND STATISTICS Essential Questions &

More information

Sweave Dynamic Interaction of R and L A TEX

Sweave Dynamic Interaction of R and L A TEX Sweave Dynamic Interaction of R and L A TEX Nora Umbach Dezember 2009 Why would I need Sweave? Creating reports that can be updated automatically Statistic exercises Manuals with embedded examples (like

More information

Stat 5303 (Oehlert): Unreplicated 2-Series Factorials 1

Stat 5303 (Oehlert): Unreplicated 2-Series Factorials 1 Stat 5303 (Oehlert): Unreplicated 2-Series Factorials 1 Cmd> a

More information

Categorical explanatory variables

Categorical explanatory variables Hutcheson, G. D. (2011). Tutorial: Categorical Explanatory Variables. Journal of Modelling in Management. 6, 2: 225 236. NOTE: this is a slightly updated version of this paper which is distributed to correct

More information

Chapter 6: Linear Model Selection and Regularization

Chapter 6: Linear Model Selection and Regularization Chapter 6: Linear Model Selection and Regularization As p (the number of predictors) comes close to or exceeds n (the sample size) standard linear regression is faced with problems. The variance of the

More information

Multiple Regression White paper

Multiple Regression White paper +44 (0) 333 666 7366 Multiple Regression White paper A tool to determine the impact in analysing the effectiveness of advertising spend. Multiple Regression In order to establish if the advertising mechanisms

More information

Creating a Basic Chart in Excel 2007

Creating a Basic Chart in Excel 2007 Creating a Basic Chart in Excel 2007 A chart is a pictorial representation of the data you enter in a worksheet. Often, a chart can be a more descriptive way of representing your data. As a result, those

More information