Sta$s$cs & Experimental Design with R. Barbara Kitchenham Keele University

Similar documents
Sta$s$cs & Experimental Design with R. Barbara Kitchenham Keele University

One Factor Experiments

Statistics Lab #7 ANOVA Part 2 & ANCOVA

Recall the expression for the minimum significant difference (w) used in the Tukey fixed-range method for means separation:


Mul$-objec$ve Visual Odometry Hsiang-Jen (Johnny) Chien and Reinhard Kle=e

Stat 5303 (Oehlert): Unbalanced Factorial Examples 1

General Factorial Models

General Factorial Models

More on Experimental Designs

NCSS Statistical Software. Design Generator

Set up of the data is similar to the Randomized Block Design situation. A. Chang 1. 1) Setting up the data sheet

Statistical Bioinformatics (Biomedical Big Data) Notes 2: Installing and Using R

Compare Linear Regression Lines for the HP-67

Mixed Effects Models. Biljana Jonoska Stojkova Applied Statistics and Data Science Group (ASDa) Department of Statistics, UBC.

Analysis of variance - ANOVA

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY

STA 4273H: Sta-s-cal Machine Learning

PSY 9556B (Feb 5) Latent Growth Modeling

Week 5: Multiple Linear Regression II

SAS/STAT 13.1 User s Guide. The NESTED Procedure

For our example, we will look at the following factors and factor levels.

Lecture Notes #4: Randomized Block, Latin Square, and Factorials4-1

Resources for statistical assistance. Quantitative covariates and regression analysis. Methods for predicting continuous outcomes.

Table Of Contents. Table Of Contents

Introduction to Mixed Models: Multivariate Regression

Correctly Compute Complex Samples Statistics

Topic:- DU_J18_MA_STATS_Topic01

Repeated Measures Part 4: Blood Flow data

Statistical Pattern Recognition

STATISTICS (STAT) Statistics (STAT) 1

Week 4: Simple Linear Regression III

Quantitative - One Population

An introduction to SPSS

Search Engines. Informa1on Retrieval in Prac1ce. Annota1ons by Michael L. Nelson

610 R12 Prof Colleen F. Moore Analysis of variance for Unbalanced Between Groups designs in R For Psychology 610 University of Wisconsin--Madison

Source df SS MS F A a-1 [A] [T] SS A. / MS S/A S/A (a)(n-1) [AS] [A] SS S/A. / MS BxS/A A x B (a-1)(b-1) [AB] [A] [B] + [T] SS AxB

ST512. Fall Quarter, Exam 1. Directions: Answer questions as directed. Please show work. For true/false questions, circle either true or false.

SAS data statements and data: /*Factor A: angle Factor B: geometry Factor C: speed*/

Apply. A. Michelle Lawing Ecosystem Science and Management Texas A&M University College Sta,on, TX

Group Sta*s*cs in MEG/EEG

Minimum Redundancy and Maximum Relevance Feature Selec4on. Hang Xiao

Cross- Valida+on & ROC curve. Anna Helena Reali Costa PCS 5024

Two-Stage Least Squares

Design of Experiments

Study Guide. Module 1. Key Terms

Multivariate Analysis Multivariate Calibration part 2

DATA MINING AND MACHINE LEARNING. Lecture 6: Data preprocessing and model selection Lecturer: Simone Scardapane

Lab #9: ANOVA and TUKEY tests

SPSS INSTRUCTION CHAPTER 9

Week 4: Simple Linear Regression II

Data Analysis and Solver Plugins for KSpread USER S MANUAL. Tomasz Maliszewski

The NESTED Procedure (Chapter)

A new graphic for one-way ANOVA

Compare Linear Regression Lines for the HP-41C

Machine Learning Crash Course: Part I

Subset Selection in Multiple Regression

book 2014/5/6 15:21 page v #3 List of figures List of tables Preface to the second edition Preface to the first edition

THIS IS NOT REPRESNTATIVE OF CURRENT CLASS MATERIAL. STOR 455 Midterm 1 September 28, 2010

Product Catalog. AcaStat. Software

R-Square Coeff Var Root MSE y Mean

Factorial ANOVA. Skipping... Page 1 of 18

Stat 500 lab notes c Philip M. Dixon, Week 10: Autocorrelated errors

Introduction to Statistical Analyses in SAS

Statistical Good Practice Guidelines. 1. Introduction. Contents. SSC home Using Excel for Statistics - Tips and Warnings

Evaluating the Numerical Accuracy of Analyse-it for Microsoft Excel

nag anova random (g04bbc)

Factorial ANOVA with SAS

Goals of the Lecture. SOC6078 Advanced Statistics: 9. Generalized Additive Models. Limitations of the Multiple Nonparametric Models (2)

Correctly Compute Complex Samples Statistics

Bootstrapped and Means Trimmed One Way ANOVA and Multiple Comparisons in R

Regression on the trees data with R

Applied Regression Modeling: A Business Approach

Spatial Patterns Point Pattern Analysis Geographic Patterns in Areal Data

7/15/2016 ARE YOUR ANALYSES TOO WHY IS YOUR ANALYSIS PARAMETRIC? PARAMETRIC? That s not Normal!

Centering and Interactions: The Training Data

Lab 5 - Risk Analysis, Robustness, and Power

CSCI 599 Class Presenta/on. Zach Levine. Markov Chain Monte Carlo (MCMC) HMM Parameter Es/mates

The linear mixed model: modeling hierarchical and longitudinal data

THE L.L. THURSTONE PSYCHOMETRIC LABORATORY UNIVERSITY OF NORTH CAROLINA. Forrest W. Young & Carla M. Bann

BIOMETRICS INFORMATION

introduc)on to stats in R and Rbrul

Splines and penalized regression

Statistical Pattern Recognition

Parallel line analysis and relative potency in SoftMax Pro 7 Software

Chemical Reaction dataset ( )

Regression. Dr. G. Bharadwaja Kumar VIT Chennai

Use JSL to Scrape Data from the Web and Predict Football Wins! William Baum Graduate Sta/s/cs Student University of New Hampshire

Generalized least squares (GLS) estimates of the level-2 coefficients,

range: [1,20] units: 1 unique values: 20 missing.: 0/20 percentiles: 10% 25% 50% 75% 90%

Week 6, Week 7 and Week 8 Analyses of Variance

The following procedures and commands, are covered in this part: Command Purpose Page

Regression Analysis and Linear Regression Models

Section 2.3: Simple Linear Regression: Predictions and Inference

Organizing data in R. Fitting Mixed-Effects Models Using the lme4 Package in R. R packages. Accessing documentation. The Dyestuff data set

Ronald H. Heck 1 EDEP 606 (F2015): Multivariate Methods rev. November 16, 2015 The University of Hawai i at Mānoa

Practical 4: Mixed effect models

Statistical Pattern Recognition

In this computer exercise we will work with the analysis of variance in R. We ll take a look at the following topics:

STATS PAD USER MANUAL

Transcription:

Sta$s$cs & Experimental Design with R Barbara Kitchenham Keele University 1

Analysis of Variance Mul$ple groups with Normally distributed data 2

Experimental Design LIST Factors you may be able to control BLOCK Factors under your control Some factors could be used to restrict scope of experiment E.G. Restrict to Post graduate students MEASURE Factors that cant be controlled Possible co- variates RANDOMLY Assign units to treatments within blocks 3

ANOVA Basic Terminology ANOVA stands for Analysis of Variance Consider the problem of deciding whether tes$ng method A is beter method B You recruit 20 testers (subjects/par$cipants) Randomly assign 10 to standard method (called a control) Randomly assign 10 to the new method Give them a tes$ng problem & measure outcome (e.g. number of defects detected) The two treatments together are referred to as a factor with two levels Number of defects is called dependent variable Method is called the independent variable Takes on two values A or B When you have equal number of par$cipants in each treatment condi$on Balanced design Otherwise unbalanced This is called a one- way between - groups ANOVA 4

Basic Experimental Designs One- way ANOVA means par$cipants classified in one dimension i.e. treatment There can be many treatments Treatments can be independent E.g. Tes$ng methods A, B, C, etc. Treatment may be related Based on the extent of a treatment E.g. Extent of training one day, two days, or 5 days 5

More Complex Designs Consider a tes$ng experiment comparing three methods Want to assess how well the methods work with programs of different complexity Assume three methods and three levels of complexity: easy, average, hard This experiment has two factors Tes$ng method and complexity For each tes$ng method we want to inves$gate each complexity condi$on Also interested in the effect of complexity level on the outcome of each method Which is called the interac;on between the factors For a balanced design we would need the number of par$cipants to be a mul$ple 9 product of number of condi$ons in each factor This design is called a 3 by 3 Factorial experiment 6

Within- subject Designs Alterna$vely suppose we have three tes$ng methods and tes$ng problems all of average complexity If each par$cipant tried out each method 20 par$cipants result in 60 observa$ons 20 for each tes$ng method In this case we can treat the individual par$cipants as a blocking factor Analysing the data to remove the effect of difference among par$cipants Hopefully reducing the variance used for our tests This give us a within- subjects design 7

Basic On- way ANOVA Model Fixed effects model x ij is i- th member of group j A is an overall average effect common to all observa$ons E j is a fixed or constant difference from A due to the jth popula$on common to all members of j e ij is a random error ~N(0,σ 2 ) H0 is all E j are zero and popula$on mean = A 8

Model parameters Assuming Independent of E j 9

Par$$oning Sums of Squares SSW: SSB: 10

Ra$onal for F test Distribu$on of ra$o of two chi- squared variables is known and called F distribu$on So distribu$on of ra$o of two sample variances (i.e. s 12 /s 22 ) follows the F distribu$on If distribu$on of measured values is Normal in each group and H0 true Ra$o of [SBG/(k- 1)]/[SWG/(N- k)] F with degrees of freedom k- 1 and N- k respec$vely 11

One- Way ANOVA Table Source of Varia;on Sum of Squares Degrees of Freedom Mean Square F- ra;o Between Groups Within Groups Total SSB ν=k- 1 MSB=SSB/ν MSB/MSW SSW ν=n- k MSW=SSW/ν SS 12

ANOVA for COCOMO Produc$vity with Mode as main factor Source of Varia;on Sum of Squares Degrees of Freedom Mean Square F- ra;o Between Groups 1.197 2 0.598 13.33 *** (p=1.62e- 05) Within Groups 2.693 60 0.0499 Total 3.89 62 0.0627 13

QQPlot of Produc$vity data analysis Studentized Residuals(fit) -1 0 1 2 3 4-2 -1 0 1 2 t Quantiles 14

QQPlot of ANOVA based on Log(Produc$vity) Studentized Residuals(fit2) -2-1 0 1 2-2 -1 0 1 2 t Quantiles 15

Standard ANOVA designs Blocked designs Blocking is used for controllable nuisance parameters Simplest design is randomised blocks design Has treatment factor (T) with k- levels Blocking Factor B Each Block has an observa$on for each treatment E.g. Block are student grades Match k- tuples of students based on grade Randomly assign one subject per block to each of k treatments Interac$on between blocks & treatments ignored 16

ANOVA Design for Randomised Blocks Treatments Blocks T1 T2 T3 B1 S1 S2 S3 B2 S4 S5 S6 B3 S7 S8 S9 Source SS df MS F Treatments SS Between Treatments k- 1 MST= SST/ df(t) MMST/ ME Blocks SS Between Blocks j- 1 MSB= SSB/ df(b) Error SS Within Treatments and Blocks (k- 1) (j- 1) ME= SSE/ df(e) 17

La$n- Square Two- way Blocking Example would be Par$cipants each try a set of different treatments Individual par$cipants are one block Order that par$cipants are assigned to each treatment is other block Order Subjects First Second Third S1 T1 T2 T3 S2 T2 T3 T1 S3 T3 T1 T2 18

Factorial Design Factor B Factor A Level 1 Level 2 Level 3 Level 1 P1,P2,P3 P4,P5,P6 P7,P8,P9 Level 2 P10,P11,P12 P13,P14,P15 P16,P17,P19 Source SS df MS F Factor A SS Between Factor k- 1 MSA= SSA/df(A) MSA/MSE A levels Factor B SS Factor B levels j- 1 MSB= SSB/df(B) MSB/MSE Interac$on SS Due to Interac$on between A and B (k- 1) (j- 1) MSAB= SSAB/df(AB) MSAB/MSE Error SS Within cells k j (n- 1) MSE= SSE/df(E) 19

Factor Analysis Example Use a subset of the COCOMO data base Select 6 projects from each Mode category Such that 3 project in each Mode category Have high requirements vola$lity Have normal requirements vola$lity One factor with 3 levels and one factor with two levels Balanced 2*3 Factor Analysis 20

Log(Produc$vity) Analysis Interaction between Mode and Requirement Volatility QQ Plot for 2-way factorial model mean of log(productivity) -2.0-1.5-1.0-0.5 0.0 rvolcat n h Studentized Residuals(fit4) -1 0 1 2 3 4 5 E O SD Modecat -2-1 0 1 2 t Quantiles 21

Influence Plot for Log(Produc$vity) Studentized Residuals -1 0 1 2 3 4 5 6 10 0.330 0.331 0.332 0.333 0.334 0.335 0.336 Hat-Values 22

Full COCOMO Dataset Interaction between Mode and Requirements Volatility mean of log(productivity) -3.0-2.5-2.0-1.5-1.0 rvolcat l n h vh Studentized Residuals(fit) -2-1 0 1 2-2 -1 0 1 2 E O SD Modecat t Quantiles 23

AOV Order dependency For full data set factors are not balanced Analysis differs depending on which factor entered first Mean Log(Produc$vity) with number of project in each in parenthesis Mode Requirements Vola$lity L N H VH E - 1.5554 (1) - 1.9730 (11) - 2.404 (11) - 3.0700 (5) O - 0.7644 (2) - 0.7511 (15) - 1.9205 (4) - 2.0554 (2) SD - 1.1595 (2) - 1.2211 (7) - 2.2785 (3) NA (0) Term Fitng Requirements Mode Residuals First Vola$lity MS Mode 4.2*** 10.318 ** 0.395 MS Req Vol 7.496 *** 5.373 *** 0.395 df 3 2 57 24

Random Effects and Mixed Effects Random effects model (n observa$ons in each group) where α j ~N(0,σ a2 ) Compared with fixed effects α j are random variables not fixed quan$$es to be es$mated Null hypothesis α j = 0 is the same Under H1, expected value of MSBG= nσ a2 +σ 2 Differences between models if H0 is false Owen used to assess different ways of measuring something So main purpose of analysis is to es$mate σ a 2 Rarely used in SE except for meta- analysis Mixed effects model includes some fixed and some random factors In such models, the F tests may differ from the equivalent fixed effects model Mixed and Random effects not handled in basic R configura$on 25

Different types of model Is the produc$vity of different plaxorms different? Obtain produc$vity measures from projects produced on the different plaxorms Fixed effects Are two methods of measuring func$on points equivalent Find 20 FP counters and 10 projects Assign 2 counters to each project Let each counter use both methods on their assigned project Mixed effects Project effect - fixed Method fixed Person effect - random With- in person error term Between method error term Important to use the correct tests Between method error term must be used to compare methods 26

Impact of Model type on 2- way Factorial Mean Squares A B AB Error Fixed Effects Random Effects Mixed Model: A fixed, B Random 27

SE Example Test Case Priori$za$on Design: 18 techniques 16 different test case priori$sa$on techniques 2 control techniques Ran experiments in groups of 4 techniques 8 C programs Generated 29 different versions with a random number of non- interfering faults From available set of regression tests for program Extracted 50 different test sets per program version for each method Each experiment could generate 4 8 29 50=46400 observa$ons Although not all combina$ons possible 28

Example of ANOVA table Source SS df MS F Program 3472054 7 49615.6 1358 Techn 97408 3 32469.2 88.9 Program*Techn 182322 21 8682.0 23.77 Error 9490507 259086 365.22 Is this analysis valid? 29

Model Each observa$on is based on Program - Fixed Treatment - Fixed Interac$on between Treatment and Program Within each program the version used Random effect Within each version test case used for each method Random effect 30

ANOVA Problems F- test requires the ra$o two chi- squared variables Variance of a Normal variable is chi- squared Also assume the variances are equal for each group Affects of non normality and heteroscedastcity Worse if sample sizes differ F test is not robust for heavy- tailed or skewed distribu$ons 31

MANOVA Analysis of variance generalised to mul$ple outcome variables Consider analysing Dura$on, KDSI & Effort (awer log transforma$on) within Mode Need to setup a data matrix containing only y variables Then use manova(y~modecat) Need library(mass) 32

MANOVA Results Modecat Log(Effort) Log(Dur) Log(AKDSI) E 5.8093 2.9453 3.48624 SD 4.7885 2.5510 3.3134 O 3.6552 2.4936 2.5862 F=8.27 with 6 and 118 degrees of freedom p=1.744e- 07 R command summary.aov(fit) Shows ANOVA for each variable separately Only Effort significant at p<0.05 Require Mul$variate Normality Homogeneity of variance- covariance matrices 33

Mahalanobis Distance With p 1 mul$variate random vector x with mean variance- covariance matrix S Mahalobis d 2 is distance between x and squared Chi- squared with p degrees of freedom Check normality by a qqplot of chi- squared Points should be close to lines with slope 1 and intercept 0 34

qqplot of d 2 Assessing Multivariate Normality Mahalanobis D2 0 2 4 6 8 10 63 0 2 4 6 8 10 12 qchisq(ppoints(n), df = p) 35

Robust two- way analyses Trimmed means can be used in a two- way factorial design Can cope with lack of balance Same results irrespec$ve of order Needs a reasonably large number of units in each cell Command is t2way(j,k,w,tr=p) W is a list with J K entries Might need to use p=.1 rather than.2 if small numbers of observa$ons per cell Recoded rvol categories so Normal & Low counted as one category High and Very high together counted as one category 36

Construc$ng List Variable w[[1]] contains the values for factor A level 1 and factor B level 1 w[[2]] w[[j ]] contain the values for factor A level 1 and factor B levels 2 to J w[[j+1]] w[[2j]] contains values for factor A level 2 and factor B levels 1 J w[[k(j- 1) +1]] w[[kj]] contains values for factor A level K and factor B levels 1 to J 37

Produc$vity per Cell Rvolcat Mode Organic Semi- Embedded detached N or L 0.5378 (17) 0.3137 (9) 0.1871 (12) H or VH 0.1507 (6) 0.2 231(3) 0.0866 (16) 0.0 0.4 0.8 0.0 0.4 0.8 0.0 0.4 0.8 0.0 0.4 0.8 0.0 0.4 0.8 0.0 0.4 0.8 38

Trimmed means results Effect due to Requirement Vola$lity significant (p=0.05) Effect due to Mode significant (p=0.001) Interac$on significant (p=0.014) Different results if log(produc$vity) Mode (p=0.002), Rvol(p=0.031), Interac$on (p=0.27) Similar results if log(produc$vity) & trim=0 Mode (p=0.002), Rvol (p=0.029), Interac$on (p=0.383) 39

Log(Produc$vity) -5-4 -3-2 -1 0 1-5 -4-3 -2-1 0 1-5 -4-3 -2-1 0 1-5 -4-3 -2-1 0 1-5 -4-3 -2-1 0 1-5 -4-3 -2-1 0 1 40

Non- Parametric Analysis Akritas, Arnold & Brunner method Works for unbalanced Factorial design Same results irrespec$ve of order Func$on: bdm2way(j,k,x) J=number of levels in Factor A K= number of levels in factor B Based on w as a list variable (same as for trimmed means) Reports the rela$ve effect size 41

COCOMO Example Produc$vity for factors Requirements vola$lity (two levels) Mode category E,SD,O Requirements vola$lity effects (p=0.059) Mode effects (p=0.205) Interac$on effects (p=0.624) Rela$ve effect size Requirements Vola$lity Mode Embedded Semi- Detached Organic Normal 0.4140 0.6693 0.7988 High 0.2202 0.3360 0.3995 42

Addi$onal facili$es Trimmed means Available for three- way designs Randomised effects Linear contrasts for complex designs MANOVA Not all techniques available in standard R configura$on With a good transforma$on available Can transform data and use tr=0 For facili$es not available in standard R 43

Conclusions ANOVA can easily get too complex to understand Always choose the simplest design possible Preferably one that is fully specified in a sta$s$cal text book Main problems are mixed designs with mul$ple levels and error terms ANOVA is reliant on normal distribu$ons but Possible to use trimmed means for Robust analyses However, may be beter to transform data Non- parametric methods for designs as complex as two- way factorial designs available in WRS library Allow for unbalanced designs ANCOVA covered by regression analysis MANOVA facili$es available Standard R facili$es Trimmed means 44