
margarine
Name:
2017-04-24

Contents

margarine
  data analysis
  ANOVA F test for equality of means
  multiple comparisons

margarine

references:

- Peck, 1/e, 17.20
- ANOVA table, Peck, chapter 17, table 17.2, p.14
- saturated fat, Wikipedia
- myristic acid, a saturated fatty acid, Wikipedia
- monounsaturated fat, Wikipedia
- polyunsaturated fat, Wikipedia

data analysis

Import the data. The measurement is the physiologically active polyunsaturated fatty acids (= PAPUFA, in percent) for each sample of margarine.

    BlueBonnet <- read.table("bluebonnet.txt", header=FALSE, sep=" ")
    Chiffon <- read.table("chiffon.txt", header=FALSE, sep=" ")
    Fleischmanns <- read.table("fleischmanns.txt", header=FALSE, sep=" ")
    Imperial <- read.table("imperial.txt", header=FALSE, sep=" ")
    Mazola <- read.table("mazola.txt", header=FALSE, sep=" ")
    Parkay <- read.table("parkay.txt", header=FALSE, sep=" ")
    # stack all measurements into one vector, with a matching brand label for each value
    val <- unname(unlist(c(BlueBonnet, Chiffon, Fleischmanns, Imperial, Mazola, Parkay)))
    label <- c(rep("BlueBonnet", length(BlueBonnet)),
               rep("Chiffon", length(Chiffon)),
               rep("Fleischmanns", length(Fleischmanns)),
               rep("Imperial", length(Imperial)),
               rep("Mazola", length(Mazola)),
               rep("Parkay", length(Parkay)))
    data <- data.frame(label, val)
    head(data)
    ##        label  val
    ## 1 BlueBonnet 13.5
    ## 2 BlueBonnet 13.4
    ## 3 BlueBonnet 14.1
    ## 4 BlueBonnet 14.3
    ## 5    Chiffon 13.2
    ## 6    Chiffon 12.7

The data has two columns and 26 rows.

Equal standard deviations?

    margarine.sd <- aggregate(val ~ label, data=data, sd)
    barplot(margarine.sd$val, col=terrain.colors(6), names.arg=margarine.sd$label,
            las=1, cex.names=0.8, ylab="standard deviation of PAPUFA")

[Figure: bar plot of the standard deviation of PAPUFA for each brand (BlueBonnet, Chiffon, Fleischmanns, Imperial, Mazola, Parkay); vertical axis from 0.0 to 0.6]

    # rows are in brand order: 3 = Fleischmanns, 4 = Imperial
    largest.ratio <- margarine.sd$val[3] / margarine.sd$val[4]
    largest.ratio
    ## [1] 1.820931

Boxplots.

    par(mar=c(4, 7, 2, 2))  # widen the left margin so the brand names fit
    boxplot(val ~ label, data=data, horizontal=TRUE, las=1,
            col=terrain.colors(6), xlab="PAPUFA (percent)")

[Figure: horizontal boxplots of PAPUFA (percent), axis from 13 to 18, one box per brand]

ANOVA F test for equality of means

H0: all the means are the same
Ha: not all the means are the same

Construct a linear model and call anova on that model.

    margarine.lm <- lm(val ~ label, data=data)
    options(show.signif.stars = FALSE)
    anova(margarine.lm)
    ## Analysis of Variance Table
    ##
    ## Response: val
    ##           Df Sum Sq Mean Sq F value    Pr(>F)
    ## label      5 108.19  21.637  79.264 1.737e-12
    ## Residuals 20   5.46   0.273

There are g = 6 groups and n = 26 values, so the test statistic is F = 79.264 with g - 1 = 5 and n - g = 20 degrees of freedom.

Confirm that the p-value of this statistic is as reported in the anova display.

    1 - pf(79.264, df1=5, df2=20)
    ## [1] 1.736833e-12

Illustration

Here is an illustration relating these statistics.
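The draw.f helper called in the code that follows is not defined anywhere in this handout. A minimal stand-in with the same argument list might look like the sketch below. It assumes base graphics and is only a guess at the intent (density curve, observed F marked, p-value annotated), not the author's implementation.

```r
# Hypothetical stand-in for draw.f(); the handout does not show the real one.
draw.f <- function(x.max, y.max, f.val, f.df1, f.df2, f.p.value, title) {
  # F density on a grid from 0 to x.max
  x <- seq(0, x.max, length.out = 1000)
  y <- df(x, df1 = f.df1, df2 = f.df2)
  plot(x, y, type = "l", ylim = c(0, y.max),
       xlab = "x", ylab = "Density", main = title)
  # mark the observed test statistic
  abline(v = f.val, lty = 2, col = "red")
  # annotate with the distribution, the statistic, and the reported p-value
  legend("top", bty = "n", legend = c(
    sprintf("F(df1 = %g, df2 = %g)", f.df1, f.df2),
    sprintf("F = %.3f", f.val),
    sprintf("p-value = %.4g", f.p.value)))
  # return the upper-tail area so the reported p-value can be cross-checked
  invisible(pf(f.val, df1 = f.df1, df2 = f.df2, lower.tail = FALSE))
}
```

The returned value is the upper-tail area beyond f.val, so it can be compared with the p-value passed in.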

    x.max <- 100
    y.max <- 0.8
    f.val <- 79.264
    g <- 6
    n <- nrow(data)
    f.df1 <- g - 1
    f.df2 <- n - g
    f.p.value <- 1.737e-12
    title <- "F Test"
    draw.f(x.max, y.max, f.val, f.df1, f.df2, f.p.value, title)

[Figure: "F Test". Density curve of F(df1 = 5, df2 = 20) for x from 0 to 100, with F = 79.264 marked and p-value = 1.737e-12.]

Conclusion. State the formal conclusion of the hypothesis test and explain how you reached that conclusion.

    p.value <- f.p.value
    alpha <- 0.05
    reject.h0 <- p.value <= alpha
    reject.h0
    ## [1] TRUE

State the conclusion in context.

multiple comparisons

R's TukeyHSD procedure (= Tukey Honest Significant Differences) implements the Tukey-Kramer Multiple Comparison Procedure discussed in our text.
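As a rough sketch of what the Tukey-Kramer procedure computes, the half-width of each interval can be rebuilt from the ANOVA table with base R's qtukey. The helper name tukey.half.width is hypothetical, and the group sizes of 4 in the usage line are an assumption for illustration: head(data) shows four BlueBonnet rows, but the other brands' counts are not printed in this handout.

```r
# Hypothetical helper (not part of the handout): half-width of one
# Tukey-Kramer interval for the difference between two group means.
tukey.half.width <- function(mse, df.resid, g, n.i, n.j, conf.level = 0.95) {
  # studentized range quantile for g means and df.resid error degrees of freedom
  q <- qtukey(conf.level, nmeans = g, df = df.resid)
  (q / sqrt(2)) * sqrt(mse * (1 / n.i + 1 / n.j))
}

# Mean Sq (0.273) and Residuals Df (20) are read off the anova table; g = 6 brands.
# The sample sizes n.i = n.j = 4 are ASSUMED here.
hw <- tukey.half.width(mse = 0.273, df.resid = 20, g = 6, n.i = 4, n.j = 4)
hw
```

If the assumed sizes are right, hw should be close to (upr - lwr) / 2 for the matching row of the TukeyHSD output.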

    TukeyHSD(aov(margarine.lm))
    ##   Tukey multiple comparisons of means
    ##     95% family-wise confidence level
    ##
    ## Fit: aov(formula = margarine.lm)
    ##
    ## $label
    ##                           diff        lwr        upr     p adj
    ## Chiffon-BlueBonnet      -0.725 -1.8862516  0.4362516 0.3963073
    ## Fleischmanns-BlueBonnet  4.275  3.1137484  5.4362516 0.0000000
    ## Imperial-BlueBonnet      0.275 -0.8862516  1.4362516 0.9736311
    ## Mazola-BlueBonnet        3.315  2.2133400  4.4166600 0.0000001
    ## Parkay-BlueBonnet       -1.025 -2.1266600  0.0766600 0.0775326
    ## Fleischmanns-Chiffon     5.000  3.8387484  6.1612516 0.0000000
    ## Imperial-Chiffon         1.000 -0.1612516  2.1612516 0.1176619
    ## Mazola-Chiffon           4.040  2.9383400  5.1416600 0.0000000
    ## Parkay-Chiffon          -0.300 -1.4016600  0.8016600 0.9526463
    ## Imperial-Fleischmanns   -4.000 -5.1612516 -2.8387484 0.0000000
    ## Mazola-Fleischmanns     -0.960 -2.0616600  0.1416600 0.1107597
    ## Parkay-Fleischmanns     -5.300 -6.4016600 -4.1983400 0.0000000
    ## Mazola-Imperial          3.040  1.9383400  4.1416600 0.0000004
    ## Parkay-Imperial         -1.300 -2.4016600 -0.1983400 0.0150616
    ## Parkay-Mazola           -4.340 -5.3786550 -3.3013450 0.0000000

    par.orig <- par(mar=c(2, 12, 0.5, 0.5), las = 1, mgp = c(2.9, 0.7, 0))
    plot(TukeyHSD(aov(margarine.lm)), las=1, col="forestgreen")

[Figure: confidence intervals for the 15 pairwise differences listed above, one per row, on a horizontal axis from -6 to 6]

Conclusion. Interpret these results.

Underscoring pattern.

    means <- aggregate(data[, 2], list(data$label), mean)  # calculate the means
    names(means) <- c("margarine", "x.bar")
    means[order(means$x.bar), ]                            # order by mean
    ##      margarine  x.bar
    ## 6       Parkay 12.800
    ## 2      Chiffon 13.100
    ## 1   BlueBonnet 13.825
    ## 4     Imperial 14.100
    ## 5       Mazola 17.140
    ## 3 Fleischmanns 18.100

Groups. Groups consist of means which have not yet been shown to be distinct.

[P-C-B] [C-B-I] [B-I] [M-F]
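One way to start on the underscoring pattern is to list the pairs whose adjusted p-value is not below 0.05. This is a minimal sketch, with the p adj column copied by hand from the TukeyHSD output above; the names p.adj and not.distinct are hypothetical.

```r
# Adjusted p-values copied from the TukeyHSD output in this handout.
# With the fitted model in the workspace, the same vector could instead be
# read from the result: TukeyHSD(aov(margarine.lm))$label[, "p adj"]
p.adj <- c(
  "Chiffon-BlueBonnet"      = 0.3963073,
  "Fleischmanns-BlueBonnet" = 0.0000000,
  "Imperial-BlueBonnet"     = 0.9736311,
  "Mazola-BlueBonnet"       = 0.0000001,
  "Parkay-BlueBonnet"       = 0.0775326,
  "Fleischmanns-Chiffon"    = 0.0000000,
  "Imperial-Chiffon"        = 0.1176619,
  "Mazola-Chiffon"          = 0.0000000,
  "Parkay-Chiffon"          = 0.9526463,
  "Imperial-Fleischmanns"   = 0.0000000,
  "Mazola-Fleischmanns"     = 0.1107597,
  "Parkay-Fleischmanns"     = 0.0000000,
  "Mazola-Imperial"         = 0.0000004,
  "Parkay-Imperial"         = 0.0150616,
  "Parkay-Mazola"           = 0.0000000
)

alpha <- 0.05
# pairs that have not been shown to differ at the 5% family-wise level
not.distinct <- names(p.adj)[p.adj >= alpha]
not.distinct
```

Every pair inside a bracketed group should appear in not.distinct. Parkay-Imperial is absent, for example, which is why P and I never share a group.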