UPPSALA UNIVERSITY
Department of Mathematics
Måns Thulin
Analysis of regression and variance, Fall 2011

COMPUTER EXERCISE 2: One-way ANOVA

In this computer exercise we will work with the analysis of variance in R. We'll take a look at the following topics:

- Inference about variances
- One-way ANOVA
- Diagnostics
- Pairwise comparisons
- Power of tests and the choice of sample size
- Comparison of tests for equal variances

1 Inference about variances

Assume that we have two normally distributed samples and that we want to test whether the variances of the two underlying populations are equal. This is done with an F-test (see the textbook). In R the test is carried out with the var.test function; we give an example with simulated data.

    x <- rnorm(50, mean = 0, sd = 2)
    y <- rnorm(30, mean = 1, sd = 1)
    var.test(x, y)

Try changing the means and variances of the two samples! What happens?

Inference about variances in the more general case, where we have more than two samples, is done using Bartlett's test or Levene's test, which we will study later in the exercise.

2 One-way ANOVA

We will study the etch rate data example from the book (described on pages 61-62): make sure that you understand what the different quantities in the experiment are. The data are given in the etchrun.dat file (which you find on the course page on the student portal). To begin with, we examine the data by graphical methods:

    etch <- read.table("etchrun.dat", col.names = c("rate", "power"))
    plot(rate ~ power, etch, lwd = 4, col = "blue"); grid()
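Because the power settings are coded as numbers in the data file, R reads them as a numeric variable. The following is a minimal sketch of how this affects a model fit. It uses invented stand-in data (the data frame etch_demo and its values are hypothetical, not the contents of etchrun.dat).

```r
# Hypothetical stand-in data; the values are invented for illustration only.
set.seed(1)
etch_demo <- data.frame(
  rate  = rnorm(20, mean = rep(c(550, 590, 625, 700), each = 5), sd = 20),
  power = rep(c(160, 180, 200, 220), each = 5)
)

is.factor(etch_demo$power)   # FALSE: power was stored as a numeric column

# With a numeric predictor, lm fits a straight line (1 degree of freedom).
df_numeric <- anova(lm(rate ~ power, etch_demo))$Df[1]

# With a factor, lm fits one mean per level (4 levels - 1 = 3 degrees of freedom).
df_factor <- anova(lm(rate ~ factor(power), etch_demo))$Df[1]

c(numeric = df_numeric, factor = df_factor)
```

Only the second fit corresponds to a one-way ANOVA with four treatment levels.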
In order to do the analysis, we must tell R that power is a factor, as this is not obvious from the data file. To do this, we use the command factor as follows:

    etch$power <- factor(etch$power, labels = c("160", "180", "200", "220"))

Together with the previous plot, a boxplot can be used to get some initial ideas about the data and some indication of whether the assumption of equal variances is reasonable.

    boxplot(rate ~ power, etch)

2.1 Estimation

To perform the one-way ANOVA we can use the commands lm and anova. As before, lm fits a linear model; anova gives a summary table of the result of the ANOVA.

    m1 <- lm(rate ~ power, etch)
    anova(m1)

Alternatively, we can use aov instead of lm:

    m2 <- aov(rate ~ power, etch)
    anova(m2)

(The result is the same, but the data in m1 and m2 are stored in somewhat different ways.)

What are the null and alternative hypotheses in ANOVA? What conclusions regarding the hypotheses can be drawn from the summary tables?

2.2 Diagnostics

Behind the ANOVA lies the assumption that the errors are normal with homogeneous variance. As in the regression case, the normality assumption can be checked using a QQ-plot (qqnorm) or the Shapiro-Wilk test (shapiro.test). We can also plot the residuals in some form, for instance against the fitted values:

    plot(fitted(m1), resid(m1)); grid()

What does the figure tell us about constant variance?

To further investigate the equality of variances we can use Bartlett's test. It is found on page 79 of the textbook. The test statistic $\chi_0^2$ presented there uses logarithms with base 10, so the constant 2.3026 (approximately ln 10) appears in the statistic. A somewhat simpler way of writing the statistic is to use natural logarithms:

$$\chi_0^2 = \frac{q}{c}, \qquad q = (N - a)\ln(MS_E) - \sum_{i=1}^{a} (n_i - 1)\ln S_i^2,$$

where $c$ and

$$S_i^2 = \frac{1}{n_i - 1}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2$$

are defined as in the textbook. In R, we perform Bartlett's test as follows:

    bartlett.test(rate ~ power, etch)
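The natural-logarithm form of Bartlett's statistic can be checked numerically. The sketch below uses simulated stand-in data (not the etch data) and assumes the usual definition of the correction factor, c = 1 + (sum of 1/(n_i - 1) minus 1/(N - a)) / (3(a - 1)); the variable names are chosen here for illustration.

```r
# Simulated stand-in data (not the etch data): a = 4 groups, 5 observations each.
set.seed(2)
y <- rnorm(20, mean = rep(c(550, 590, 625, 700), each = 5), sd = 20)
g <- factor(rep(c("160", "180", "200", "220"), each = 5))

n_i  <- tapply(y, g, length)              # group sizes n_i
s2_i <- tapply(y, g, var)                 # group sample variances S_i^2
N    <- sum(n_i)                          # total number of observations
a    <- nlevels(g)                        # number of factor levels
mse  <- sum((n_i - 1) * s2_i) / (N - a)   # pooled variance, equal to MS_E

# Bartlett's statistic with natural logarithms
q      <- (N - a) * log(mse) - sum((n_i - 1) * log(s2_i))
cc     <- 1 + (sum(1 / (n_i - 1)) - 1 / (N - a)) / (3 * (a - 1))
chi2_0 <- q / cc
p_value <- pchisq(chi2_0, df = a - 1, lower.tail = FALSE)

# Compare with the built-in function
bt <- bartlett.test(y ~ g)
c(by_hand = chi2_0, built_in = unname(bt$statistic))
```

The two values should agree, and the p-value is the upper tail of a chi-square distribution with a - 1 degrees of freedom.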
What conclusions can be drawn from the given p-value?

As an alternative, we can use Levene's test. The idea behind the test is to compute $d_{ij} = |y_{ij} - \tilde{y}_i|$, where $\tilde{y}_i$ is the median for the ith population, and to use these as the response variable in a new one-way ANOVA. The use of the median instead of the mean makes the test more robust, i.e. less sensitive to outliers and departures from normality (we investigate this further at the end of this computer exercise). An R function for Levene's test is found in the car package, which is loaded by writing library(car). The function is called leveneTest (levene.test in older versions of car); see the help file for instructions on how to use it.

2.3 A note on data import

The data in etchrun.dat were stored in a way that was convenient for us to import into R: all we needed to do after importing them was to mark one of the variables as a factor. At times the data can be stored in ways that make them a little harder to import. In the file etching.dat the data for each level of the factor are stored in separate columns (compare etchrun.dat and etching.dat to see the difference). Using the stack command we can store the data in a more suitable way in R:

    etch2 <- read.table("etching.dat", col.names = c("160", "180", "200", "220"))
    etch3 <- stack(etch2)
    names(etch3) <- c("rate", "power")
    anova(lm(rate ~ power, etch3))

You should get the same answer as before.

2.4 Pairwise comparisons

Pairwise comparisons of means can be done using Tukey's HSD (page 92). The R function for this requires an object created with aov; in our case the model called m2:

    cm2 <- TukeyHSD(m2, "power"); cm2
    plot(cm2)

How should the output from the function be interpreted? What conclusions can be drawn?

EXERCISE. In an experiment the effect of shelf height on the sales of a particular dog food (Arf Dog Food) was studied. During a period of 8 days the daily sales (in hundreds of dollars) for three shelf heights (knee, waist and eye height) were recorded.
The shelf height was changed randomly three times daily. The data are stored in the file shelf.dat (column 1: knee height, column 2: waist height, column 3: eye height). Assume a one-way ANOVA model with one factor at three levels (the different heights). Analyse the effect of shelf height on sales using the methods we've used above.

3 Power of tests and sample size

Next we will consider the problem of determining a suitable number of replicates; see section 3.7 (page 101) in the textbook for a brief introduction to the topic. The book
uses the quantity $\Phi^2$, which is used to read so-called OC curves. With R the calculation can be done directly, without having to use the figures in the appendix of the book. To do this, we must calculate a quantity corresponding to the $\Phi^2$ used in the book:

$$f^2 = \frac{1}{\sigma^2}\sum_{i=1}^{k} p_i (\mu_i - \mu)^2,$$

where $k$ is the number of levels of the factor, $p_i = 1/k$ for balanced experiments, and a suitable value for $\sigma$ is chosen. Then $f^2 = \Phi^2/n$. $f$ is sometimes called the effect size.

Let us take a closer look at Example 3.10 (page 102) using R. The significance level is $\alpha = 0.01$, the required power is $1 - \beta = 0.90$ and the standard deviation $\sigma = 25$ Å/min is assumed. We could easily sum the four terms by hand, but we will use a more general approach and show how to use the command apply (which in our example below operates on the matrix elm):

    library(pwr)
    k0 <- 4                                   # number of factor levels
    levs <- c(575, 600, 650, 675)             # assumed level means
    elm <- matrix((levs - 625)^2 / 25^2)      # (mu_i - mu)^2 / sigma^2, one column
    f0 <- sqrt(1/k0 * apply(elm, 2, sum))     # effect size f
    pwr.anova.test(k = k0, f = f0, sig.level = 0.01, power = 0.9)

How many replicates are needed to get the required test power?

An alternative approach for selecting a sample size is described in the textbook. It involves the greatest difference $D$ between the means of the different levels. We get $f^2 = D^2/(2k\sigma^2)$ and can call pwr.anova.test as before.

EXERCISE. Go through that procedure using R, i.e. choose $D = 75$ and so on. How many replicates are required?

4 Comparison of Bartlett's test and Levene's test

Earlier in the computer exercise we claimed that Levene's test is more robust than Bartlett's test. This is not shown in the textbook, but we can investigate it using computer simulations. If you find this interesting, take a look at the R script bart.r in the student portal. Go through the code presented there, and then do the exercises below.

EXERCISE. Change the variances so that at least one sample has a variance that is different from that of the others. Does the number of tests that result in rejection of the null hypothesis increase, as expected?
Is one of the tests better in the sense that it rejects the null hypothesis more often (i.e. has higher power)?

EXERCISE. Use runif to simulate samples from the uniform distribution instead, first with equal variances and then with unequal variances. What happens?

EXERCISE. In the same manner, use rexp to simulate samples from the exponential distribution. What happens?
EXERCISE. Try inserting an outlier into one of the samples and see what happens. Example:

    x <- c(rnorm(n[1]-1, 0, 1), rnorm(1, 10, 1), rnorm(n[2], 1, 1), rnorm(n[3], 1, 1))

EXERCISE. Change the sample sizes so that they differ more. What happens?
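For readers without access to bart.r, a simulation of this kind could be organised roughly as in the sketch below. It is not the course script: the helper levene_p, the sample sizes, the number of replications B and the level alpha are all assumptions made here. Levene's test is coded by hand as a one-way ANOVA on the absolute deviations from the group medians, so that only base R is needed.

```r
# Levene's test (median version): one-way ANOVA on |y_ij - median_i|.
# levene_p is a hypothetical helper written for this sketch.
levene_p <- function(y, g) {
  d <- abs(y - ave(y, g, FUN = median))   # absolute deviations from group medians
  anova(lm(d ~ g))[["Pr(>F)"]][1]         # p-value of the F-test on d
}

# Assumed simulation settings (not taken from bart.r)
set.seed(3)
n     <- c(10, 10, 10)   # sample sizes
sds   <- c(1, 1, 1)      # equal standard deviations, so the null hypothesis is true
B     <- 500             # number of simulated data sets
alpha <- 0.05            # significance level
g     <- factor(rep(seq_along(n), times = n))

# For each simulated data set, record whether each test rejects.
reject <- replicate(B, {
  y <- rnorm(sum(n), mean = 0, sd = rep(sds, times = n))
  c(bartlett = bartlett.test(y, g)$p.value < alpha,
    levene   = levene_p(y, g) < alpha)
})

rowMeans(reject)   # estimated rejection rate for each test
```

With equal variances and normal data both rejection rates should be close to alpha or below it. Changing sds, or replacing rnorm with runif or rexp, gives a starting point for the exercises above.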
Continuous Improvement Toolkit Normal Distribution The Continuous Improvement Map Managing Risk FMEA Understanding Performance** Check Sheets Data Collection PDPC RAID Log* Risk Analysis* Benchmarking***
More informationDealing with Categorical Data Types in a Designed Experiment
Dealing with Categorical Data Types in a Designed Experiment Part II: Sizing a Designed Experiment When Using a Binary Response Best Practice Authored by: Francisco Ortiz, PhD STAT T&E COE The goal of
More informationOne Factor Experiments
One Factor Experiments 20-1 Overview Computation of Effects Estimating Experimental Errors Allocation of Variation ANOVA Table and F-Test Visual Diagnostic Tests Confidence Intervals For Effects Unequal
More informationCpk: What is its Capability? By: Rick Haynes, Master Black Belt Smarter Solutions, Inc.
C: What is its Capability? By: Rick Haynes, Master Black Belt Smarter Solutions, Inc. C is one of many capability metrics that are available. When capability metrics are used, organizations typically provide
More informationExample 5.25: (page 228) Screenshots from JMP. These examples assume post-hoc analysis using a Protected LSD or Protected Welch strategy.
JMP Output from Chapter 5 Factorial Analysis through JMP Example 5.25: (page 228) Screenshots from JMP. These examples assume post-hoc analysis using a Protected LSD or Protected Welch strategy. Fitting
More informationBluman & Mayer, Elementary Statistics, A Step by Step Approach, Canadian Edition
Bluman & Mayer, Elementary Statistics, A Step by Step Approach, Canadian Edition Online Learning Centre Technology Step-by-Step - Minitab Minitab is a statistical software application originally created
More informationChapters 5-6: Statistical Inference Methods
Chapters 5-6: Statistical Inference Methods Chapter 5: Estimation (of population parameters) Ex. Based on GSS data, we re 95% confident that the population mean of the variable LONELY (no. of days in past
More informationYear 10 General Mathematics Unit 2
Year 11 General Maths Year 10 General Mathematics Unit 2 Bivariate Data Chapter 4 Chapter Four 1 st Edition 2 nd Edition 2013 4A 1, 2, 3, 4, 6, 7, 8, 9, 10, 11 1, 2, 3, 4, 6, 7, 8, 9, 10, 11 2F (FM) 1,
More informationInterval Estimation. The data set belongs to the MASS package, which has to be pre-loaded into the R workspace prior to use.
Interval Estimation It is a common requirement to efficiently estimate population parameters based on simple random sample data. In the R tutorials of this section, we demonstrate how to compute the estimates.
More informationData Analysis and Hypothesis Testing Using the Python ecosystem
ARISTOTLE UNIVERSITY OF THESSALONIKI Data Analysis and Hypothesis Testing Using the Python ecosystem t-test & ANOVAs Stavros Demetriadis Assc. Prof., School of Informatics, Aristotle University of Thessaloniki
More informationWorkshop 8: Model selection
Workshop 8: Model selection Selecting among candidate models requires a criterion for evaluating and comparing models, and a strategy for searching the possibilities. In this workshop we will explore some
More information