Lab 5 - Risk Analysis, Robustness, and Power


Biology 458 Biometry

I. Risk Analysis

The process of statistical hypothesis testing involves estimating the probability of making errors when, after examining quantitative data, we conclude that the tested null hypothesis is false or not false. These errors can be thought of as the "risks" of making particular kinds of mistakes when basing a decision on quantitative data. To determine or estimate these risks (the probabilities of Type I and Type II errors), we follow a formal hypothesis-testing procedure in which we ensure or assume that our data meet certain conditions (e.g., independence, normality, and homogeneity of variances for parametric tests; independence and continuity for non-parametric tests). We state a decision rule prior to applying our test, which amounts to specifying α, the significance level for the test; this is synonymous with setting our Type I error rate and defining our region of rejection (i.e., the values of the test statistic for which we will reject the null hypothesis). Because we control or specify our Type I error rate before performing a test, if we meet the assumptions of the test, then the actual frequency of Type I errors we would make, if we repeated our experiment many times under the same conditions, will equal the nominal rate we specify.

For example, the histogram below represents 20,000 two-sample t-tests computed from data derived by sampling 5 subjects from each of two underlying populations with exactly identical means (9.0) and variances (2.0). Note that this histogram is the sampling distribution of the two-sample t-statistic for samples of size n1 = n2 = 5, and that the distribution has a mean suspiciously close to 0 and a variance suspiciously close to the theoretical variance of a t distribution with 8 degrees of freedom (ν/(ν − 2) ≈ 1.33).
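A minimal R sketch of this kind of risk-analysis simulation is given below. It is not the code used to produce the figure; the seed, number of histogram bins, and object names are arbitrary, but the population values (means of 9.0, variances of 2.0, n1 = n2 = 5) follow the description above.

# draw many pairs of samples from identical normal populations, compute the
# two-sample t statistic for each pair, and check how often an upper one-tailed
# test at alpha = 0.05 rejects the (true) null hypothesis
set.seed(458)                      # arbitrary seed for reproducibility
nsim <- 20000
tvals <- replicate(nsim, {
  x1 <- rnorm(5, mean = 9, sd = sqrt(2))
  x2 <- rnorm(5, mean = 9, sd = sqrt(2))
  t.test(x1, x2, var.equal = TRUE)$statistic
})
mean(tvals)                        # should be close to 0
var(tvals)                         # close to df/(df - 2) = 8/6 for a t with 8 df
hist(tvals, breaks = 50, main = "Sampling distribution of t under H0")
mean(tvals > qt(0.95, df = 8))     # proportion of Type I errors; about 0.05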

If our t-test is working well and we perform an upper one-tailed test with α = 0.05, then 5% of our t-values should fall in the region of rejection even when the null hypothesis is exactly true. In the example above, 20 sets of 1000 t-tests were conducted, and one would expect 5% x 1000 = 50 tests per set to be significant at the 5% level. In fact, the average number of Type I errors (significant t-tests even though the null hypothesis was true) was 50.8 ± (se).

II. Robustness

In many cases, we may be in error when we assume that our data follow a particular distribution or when we claim that the variances in our treatment groups are equal. When this happens, the distribution of the statistic we use to test our hypothesis may differ from the distribution expected when the assumptions of the test are met. For example, if we use a t-test to compare a sample estimate of the mean to a theoretical value, the test statistic does not have an exact t-distribution if the population of values of our random variable, x, is not normally distributed. Similarly, if we perform a separate-groups t-test and assume that the treatment group variances are equal when in fact they are not, our estimates of the probability of making Type I and Type II errors will be distorted. In general, the consequence of not meeting the assumptions of normality and homogeneity of variances is that our estimates of the Type I and Type II error probabilities are distorted. For instance, we may conclude from a t-test that the chance of making a Type I error when we reject the null hypothesis is 5%; but if x is not a normally distributed random variable, the true probability of a Type I error might be more than 5%, because the test statistic does not follow exactly the t-distribution with the specified degrees of freedom.

The ability of a statistical hypothesis test to provide accurate estimates of the probability of Type I and Type II errors even when its underlying assumptions are violated is called robustness. Some hypothesis tests are more robust to deviations from certain assumptions than others, and the type and magnitude of the deviation of the data from the assumptions required by a test is often important in choosing the appropriate statistical test to apply. Tests of hypotheses are used in many situations where the underlying assumptions are violated; robustness is therefore a desirable property.

The following table summarizes the results of a large number of computer simulations conducted to determine the consequences of violating the assumptions of normality and equality of variances in the two-sample t-test. For each simulation, samples were drawn from two underlying populations with exactly equal means and either equal or unequal variances. Since the population means were known to be equal, any rejection of the null hypothesis is a Type I error. The values given in the table are the number of Type I errors (out of 1000 replicate simulation runs) when the null hypothesis was tested against an upper one-tailed alternative hypothesis. At α = 0.05, we would expect 0.05 x 1000 = 50 Type I errors in each instance if the test was performing according to expectations.

Since 20 replicate runs of 1000 simulations were performed for each table entry, the standard error of each entry is also presented. Values substantially lower or higher than 50 indicate that the test is actually operating at an α level other than the nominal 5% we intended; in other words, we make either too many Type I errors (the test is liberal) or too few (the test is conservative). The simulations were performed on populations that were either normally or uniformly distributed with means equal to 9.0. In the case of equal variances, both variances = 4.0; in the case of unequal variances, var1 = 4.0 and var2 = 16.0.

[Table: Assessment of the robustness of the t-test to violations of the assumptions of normality and equality of variances; entries are the number of Type I errors in 1000 trials ± se. Columns A and B: equal variances (var1 = var2 = 4.0), normal and uniform populations. Columns C and D: unequal variances (var1 = 4.0, var2 = 16.0) analyzed with the pooled variance estimate, normal and uniform populations. Columns E and F: unequal variances analyzed with the separate variance (Satterthwaite) estimate, normal and uniform populations. Rows are sample-size pairs (n1, n2), both equal (e.g., 5,5 and 20,20) and unequal. Numeric entries not reproduced here; the first cell (n1 = n2 = 5, column A) was 49.5.]

From this table we see that when variances are equal, the t-test performed well regardless of whether the population was normally distributed or the sample sizes were equal (columns A and B). However, when the variances are unequal, the test only performs well if the population is normally distributed and the sample sizes are equal (first three rows in column C). If we use the Satterthwaite correction with unequal variances, then the t-test also performs well whether or not the sample sizes are equal (column E). In virtually all instances, the t-test does not perform well when the underlying population is non-normal and the variances are unequal (column F).
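The flavor of these simulations can be reproduced with a short R sketch like the one below. It follows the conditions described above (equal means of 9.0, variances of 4.0 and 16.0, an upper one-tailed test at α = 0.05, 1000 replicates), but the particular unequal sample sizes and the random seed are my own illustrative choices; this is not the original simulation code.

# one cell of a robustness simulation: equal means, unequal variances, unequal n,
# comparing the pooled-variance test with the separate-variance (Welch) test
set.seed(1)
nsim <- 1000
n1 <- 5; n2 <- 15                              # unequal sample sizes (illustrative)
reject <- replicate(nsim, {
  x1 <- rnorm(n1, mean = 9, sd = 2)            # variance 4
  x2 <- rnorm(n2, mean = 9, sd = 4)            # variance 16
  pooled <- t.test(x1, x2, var.equal = TRUE,  alternative = "greater")$p.value
  welch  <- t.test(x1, x2, var.equal = FALSE, alternative = "greater")$p.value
  c(pooled = pooled <= 0.05, welch = welch <= 0.05)
})
rowSums(reject)   # Type I errors out of 1000; the pooled count typically drifts from 50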

III. Power

Power is a measure of the ability of a statistical test to detect an experimental effect that is actually present. Power is the probability of rejecting the null hypothesis (Ho) if a specified alternative hypothesis (Ha) is actually correct. Power equals one minus the probability of making a Type II error, β (failing to reject Ho when it is false): power = 1 − β. Thus, the smaller the Type II error, the greater the power and, therefore, the greater the sensitivity of the test. The level of power depends on several factors: 1) the magnitude of the difference between Ho and Ha, also called the "effect size"; 2) the amount of variability in the underlying population(s) (the variances); 3) the sample size, n; and 4) the level of significance chosen for the test, α, the probability of a Type I error. For more information on statistical power analysis, see the supplemental lecture notes on power and the additional notes on power analysis.

As long as we are committed to making decisions in the face of incomplete knowledge, as every scientist is, we cannot avoid making Type I and Type II errors. We can, however, try to minimize the chances of making them. We directly control the probability of making a Type I error by our selection of α, the significance level of our test. By setting a region of rejection, we accept that a certain proportion of the time (for example 5%, when α = 0.05) we will obtain values of our test statistic that lead us to reject the null hypothesis (fall in the region of rejection) even when the null hypothesis is true.

How can we reduce the probability of making a Type II error? One obvious way is to increase the size of the region of rejection; in other words, increase α. Of course, we do so at the cost of an increased probability of making Type I errors, and every researcher must strike a balance between the two types of error. Other ways to reduce the probability of a Type II error are to increase the sample size or to reduce the variability in the data. Increasing sample size is simple, but reducing variability in the observations is also an important means of improving the power of tests. The process of "experimental design" allocates research effort to ensure that the most powerful test for the effort expended is actually conducted.

The probability of making a Type II error and the power of a statistical test are more difficult to determine because they require one to specify a quantitative alternative hypothesis. However, if one can specify a reasonable alternative hypothesis, guides for calculating the power of a test, or the sample size necessary to achieve a desired level of power, are now widely available. The most comprehensive book on statistical power analysis is by Cohen (1977), and many web-based power calculators and statistical packages now include functions for determining power and sample size. A table of links to web-based power calculators for t-tests is given at the end of this lab, and the program PS, a Windows-based power calculator, can be downloaded from its website.
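As a quick illustration of the four factors listed above, the sketch below uses the power.t.test function introduced in the next section and perturbs one factor at a time around an arbitrary baseline (the same n = 15, δ = 3, s = 4.6 example used later); the particular numbers matter only for showing the direction of each effect.

# how power responds to effect size, variability, sample size, and alpha
base <- power.t.test(n = 15, delta = 3, sd = 4.6, sig.level = 0.05,
                     type = "one.sample", alternative = "one.sided")$power
bigger_effect  <- power.t.test(n = 15, delta = 4, sd = 4.6, sig.level = 0.05,
                               type = "one.sample", alternative = "one.sided")$power
more_variance  <- power.t.test(n = 15, delta = 3, sd = 6.0, sig.level = 0.05,
                               type = "one.sample", alternative = "one.sided")$power
bigger_n       <- power.t.test(n = 30, delta = 3, sd = 4.6, sig.level = 0.05,
                               type = "one.sample", alternative = "one.sided")$power
stricter_alpha <- power.t.test(n = 15, delta = 3, sd = 4.6, sig.level = 0.01,
                               type = "one.sample", alternative = "one.sided")$power
round(c(base = base, bigger_effect = bigger_effect, more_variance = more_variance,
        bigger_n = bigger_n, stricter_alpha = stricter_alpha), 3)

A larger effect size or sample size raises power, while more variability or a stricter α lowers it.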

Calculating Power in R

In R there are four different ways to make power calculations: 1) using built-in functions for simple power calculations, 2) using the contributed package "pwr", 3) using the non-central distribution functions, and 4) by simulation.

Built-in R Functions for Power and Sample Size Calculation

There are 3 functions in the base R installation for power and sample size calculation: power.t.test, power.anova.test, and power.prop.test. These functions calculate power for various t-tests, analysis of variance (ANOVA) designs, and tests of equality of proportions. For the purposes of this lab exercise, and for illustrating the general workings of these functions, I will focus on the power.t.test function.

The power.t.test function has several arguments:

power.t.test(n = , delta = , sd = , sig.level = , power = , type = "", alternative = "", strict = FALSE)

where n is the sample size (the within-group sample size for a 2-sample t-test), delta (δ) is the effect size expressed as a difference in means, sd is the standard deviation, sig.level is the α value for the test, power is the desired power, which is entered only when you are calculating sample size (in which case n = NULL or is left out), type has three possible levels (given below), alternative has two possible levels (given below), and strict determines whether the rejection region in the opposite tail is included in the power calculation for a two-sided test (the default is FALSE).

type = ("two.sample", "one.sample", "paired")
alternative = ("two.sided", "one.sided")
strict = (FALSE, TRUE)

Suppose we have a preliminary estimate of σ = 4.6 and want to determine the power of a one-sample t-test, with a sample size of n = 15, against the alternative hypothesis that µ exceeds the hypothesized value by 3 units.

# compute power of one sample t-test using built in function
power.t.test(n=15, delta=3, sd=4.6, sig.level=0.05, type="one.sample",
             alternative="one.sided")

##
##      One-sample t test power calculation
##
##               n = 15
##           delta = 3
##              sd = 4.6
##       sig.level = 0.05
##           power =
##     alternative = one.sided

Note that the output from R reiterates the conditions for which power is being calculated and gives the estimate of power.
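The same function solves for sample size rather than power when power is supplied and n is omitted. A minimal sketch, continuing the example above with an arbitrarily chosen 80% power target:

# compute the sample size needed for 80% power in the same one-sample test
power.t.test(delta=3, sd=4.6, sig.level=0.05, power=0.80,
             type="one.sample", alternative="one.sided")
# the returned n is usually fractional; round it up to the next whole subject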

The 'pwr' Package in R

The 'pwr' package in R implements the power calculations outlined by Jacob Cohen in his book "Statistical Power Analysis for the Behavioral Sciences" (1988). Cohen's book is about the only book-length treatment of power and sample size calculation. The 'pwr' package has functions to compute power and sample size for a wider array of statistical tests than the base functions in R. However, the drawback to using the 'pwr' package is that its functions require you to express the "effect size" in terms of the effect size formulas given in Cohen's book, which are not reproduced in the documentation for the package. Alternatively, if you are willing to use Cohen's definitions of a "small," "medium," or "large" effect size, which he based on his knowledge of the behavioral sciences, then an internal function will provide the necessary effect size values for each test. In my experience, Cohen's effect sizes are not appropriate for applications in ecology, and potentially more broadly in biology.

I will illustrate the functions for power and sample size calculation for the one-sample t-test, and for a 2-sample t-test with equal or unequal sample sizes, in the 'pwr' package. The function for a one-sample, paired, or 2-sample t-test with equal sample sizes is:

pwr.t.test(n = , d = , sig.level = , power = , type = , alternative = )

The possible levels of the arguments "type=" and "alternative=" are:

type = ("two.sample", "one.sample", "paired")
alternative = ("two.sided", "less", "greater")

# compute power for same one-sample t-test as above using pwr package
# attach pwr package
library(pwr)
# compute power
pwr.t.test(n = 15, d = (3/4.6), sig.level = 0.05, type = "one.sample",
           alternative = "greater")

##
##      One-sample t test power calculation
##
##               n = 15
##               d =
##       sig.level = 0.05
##           power =
##     alternative = greater

Note that we get the same result as when we used the built-in power.t.test function. The function for power and sample size calculation for the t-test in the 'pwr' package is very similar to the base function, but note that the options for the argument "alternative=" differ.

In the case of a 2-sample t-test with unequal sample sizes, the 'pwr' package offers the pwr.t2n.test function. This function can also be used for equal sample sizes. It has similar arguments, but requires that you specify both sample sizes and does not have the "type=" argument.

pwr.t2n.test(n1 = , n2 = , d = , sig.level = , power = , alternative = )

Suppose we have a control and a treatment group, and we know that the mean for the control group is 12.4 and that the control group standard deviation is 8.7. What would be the power of a one-tailed t-test performed at α = 0.05 versus the alternative hypothesis that the treatment group mean is 25% higher than the control group mean, with a control group sample size of 10 and a treatment group sample size of 15? Since we assume that the variances are equal, Cohen's effect size is the difference in means divided by the estimated standard deviation for the control group.

# specify control group mean and calculate treatment group mean expected
# under alternative hypothesis
m1=12.4
m2=1.25 * 12.4
# specify preliminary estimate of s
s=8.7
# calculate Cohen's effect size measure - delta
delta=abs(m1-m2)/s
delta

## [1]

# compute power of t-test
pwr.t2n.test(n1=10,n2=15,d=delta,sig.level=0.05,alternative="greater")

##
##      t test power calculation
##
##              n1 = 10
##              n2 = 15
##               d =
##       sig.level = 0.05
##           power =
##     alternative = greater
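As with power.t.test, the pwr functions will instead solve for sample size when power is supplied and the sample size argument is left out. A hedged sketch, reusing the effect size computed above and an arbitrary 80% power target:

# how many subjects per group would a two-sample test need for 80% power?
library(pwr)
d <- (1.25 * 12.4 - 12.4) / 8.7     # Cohen's d from the example above
pwr.t.test(d = d, sig.level = 0.05, power = 0.80,
           type = "two.sample", alternative = "greater")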

Using Non-Central Distributions for Power and Sample Size Analysis in R

Another approach to power and sample size calculation in R is based on the distribution functions available for the various test statistics, such as the t, F, and χ2 distributions. Recall that the distribution of the test statistic under the null hypothesis is said to be centrally distributed about its expected value when the null hypothesis is true. For the t-statistic the expected value is 0, for F it is 1, and for χ2 it is the degrees of freedom of the test. The distribution of the test statistic when a specific quantitative alternative hypothesis is articulated is the same distribution, but shifted so that its expected value equals what is called the "non-centrality parameter." The non-centrality parameter is a measure of the effect size. R has a set of functions for each of these distributions with an argument 'ncp=' that allows one to set the non-centrality parameter.

For a one-sample t-test or a paired t-test the non-centrality parameter (δ) is:

δ = (x̄_a − µ) / (s / √n)

which looks like a t-statistic, but x̄_a is the mean expected under the alternative hypothesis and µ is the mean expected under the null hypothesis. Using the R function pt(tcrit, df, ncp), one can then calculate the Type II error or power of a test for a specific sample size. To calculate the power of a test, one must first determine the critical value of t beyond which one would reject the null hypothesis (clearly, if the null hypothesis is rejected one cannot have made a Type II error). This can be accomplished with the R function qt(alpha, df), which gives the t value associated with the alpha-level quantile of the t-distribution. We can obtain the critical value of t for the one-tailed, one-sample t-test we performed above using power.t.test and pwr.t.test:

# obtain critical t as the 95th percentile of the t-distribution with specific df
n=15
tcrit=qt(0.95,df=n-1)
tcrit

## [1]

Now we can calculate the power to detect that the mean under the alternative hypothesis exceeds the null hypothesized mean by 3. We first calculate δ, assuming that our preliminary estimate of s = 4.6, then we calculate the Type II error probability, and then power.

# calculate non-centrality parameter as: (ma - mo)/(s/sqrt(n))
ncp=3/(4.6/sqrt(n))
# calculate Type II error
err2=pt(tcrit,df=14,ncp=ncp)
err2

## [1]

# calculate power
1-err2

## [1]

Note again that this estimate of power agrees with those we obtained using the power.t.test and pwr.t.test functions. For a two-tailed test one must subtract from the Type II error calculation the lower region of rejection, as shown below.

# calculate 2-tailed version of same test
tcrit=qt(0.975,df=n-1)
tl=-1*tcrit
# calculate Type II error
err22=pt(tcrit,df=14,ncp=ncp)-pt(tl,df=14,ncp=ncp)
err22

## [1]

# calculate power
power=1-err22
power

## [1]

Note that power for the two-tailed test is lower than for the equivalent one-tailed test, since more of the non-central t-distribution lies below the critical t-value (the critical upper t is associated with the 97.5th percentile rather than the 95th percentile of the t-distribution). Removal of the lower tail of the non-central t-distribution beyond the lower critical t-value (−t) has a very small effect on the estimated power.
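The same non-central t machinery can be turned around to find a sample size by scanning n until the computed power reaches a target. A minimal sketch under the same assumptions as above (δ = 3, s = 4.6, upper one-tailed test at α = 0.05), with an arbitrary 80% power target:

# smallest n whose power reaches the target
target <- 0.80
delta  <- 3          # difference to detect, as in the example above
s      <- 4.6        # preliminary estimate of the standard deviation
for (n in 2:100) {
  tcrit <- qt(0.95, df = n - 1)                  # upper one-tailed test at alpha = 0.05
  ncp   <- delta / (s / sqrt(n))
  pow   <- 1 - pt(tcrit, df = n - 1, ncp = ncp)  # power at this sample size
  if (pow >= target) break
}
c(n = n, power = pow)

The n found this way should match, after rounding up, the n returned by power.t.test for the same inputs.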

A similar process can be followed for a two-sample t-test, but with a modified non-centrality parameter (δ):

δ = (x̄1 − x̄2) / (Sp √(1/n1 + 1/n2))

which should look like a t-statistic for a two-sample test. Since one is almost always performing power calculations prior to conducting the research, it is customary to assume that the variances in the two groups will be equal: the purpose of the calculation is to plan an experiment that has adequate power against a reasonable alternative hypothesis.
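A hedged sketch of this two-sample calculation, reusing the control/treatment numbers from the pwr.t2n.test example above (means 12.4 and 15.5, common standard deviation 8.7, n1 = 10, n2 = 15); the code is my illustration, not part of the original lab:

# two-sample power via the non-central t distribution (upper one-tailed, alpha = 0.05)
n1 <- 10; n2 <- 15
delta <- 1.25 * 12.4 - 12.4          # difference in means under the alternative
s <- 8.7                             # assumed common (pooled) standard deviation
ncp <- delta / (s * sqrt(1/n1 + 1/n2))
tcrit <- qt(0.95, df = n1 + n2 - 2)
1 - pt(tcrit, df = n1 + n2 - 2, ncp = ncp)   # power; should match pwr.t2n.test above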

Power and Sample Size Analysis by Simulation in R

When the study design becomes more complicated, it is easy to outstrip the capabilities of R's built-in functions for making power or sample size calculations. In that case, one can calculate power or sample size by programming R to perform a computer simulation. Here I will use a simple study design to illustrate the simulation approach, but its basic principles apply to more complicated designs.

Part I - Estimate the Type I error rate

1. First simulate data under the null hypothesis.
2. Perform the test you plan to use on the simulated data at your chosen α level.
3. Record whether the test was significant.
4. Repeat this process (steps 1 to 3) many times.
5. Tabulate how many simulations showed a significant effect. This number divided by the number of simulations estimates your Type I error rate. It should be close to your nominal α level.

Part II - Estimate power

1. Now simulate data under the alternative hypothesis.
2. Perform the same test on the simulated data.
3. Record whether the test was significant or not at your chosen α level.
4. Repeat steps 1-3 many times.
5. Tabulate how many simulations showed a significant effect. This number divided by the number of simulations is your estimate of power - the proportion of the simulations in which you rejected a false null hypothesis.

Below I illustrate code to calculate the power of the one-sample t-test used above. Note that the estimated power is very close to the estimates generated using power.t.test, pwr.t.test, and the non-central t-distribution.

# From the one-sample, one-tailed t-test performed above
# set the number of simulations
# initialize counter variables
# set sample size n
numsims=10000
count=0
countp=0
n=15

# Simulate data under the null hypothesis as normally distributed with
# mean = 0 and the given value of the standard deviation
for (i in 1:numsims){
  dat=rnorm(n,0,4.6)
  # perform the t-test
  res=t.test(dat,mu=0,alternative="greater")
  # check whether the t-test is significant at the alpha = 0.05 level and
  # count how many times it is significant (count)
  if(res$p.value<=0.05){count=count+1}
}
# calculate Type I error rate as number significant/number of simulations
type1=count/numsims
type1

## [1]

# Simulate the data under the alternative hypothesis that the mean
# exceeds the null hypothesized value by 3 units
for (i in 1:numsims){
  datn=rnorm(n,3,4.6)
  res2=t.test(datn,mu=0,alternative="greater")
  if(res2$p.value<=0.05){countp=countp+1}
}
# power is the number of times the null hypothesis was rejected/number of simulations
power=countp/numsims
power

## [1]
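A possible next step (my own sketch, not part of the lab) is to wrap the simulation in a function so the same approach can be reused for any sample size, effect size, or standard deviation once a design is too complicated for the built-in formulas:

# simulation-based power for the one-sample, upper one-tailed t-test
sim_power <- function(n, delta, sd, alpha = 0.05, nsim = 5000) {
  pvals <- replicate(nsim, t.test(rnorm(n, delta, sd),
                                  mu = 0, alternative = "greater")$p.value)
  mean(pvals <= alpha)     # proportion of simulations that reject H0
}
sim_power(n = 15, delta = 3, sd = 4.6)   # should be close to the estimates above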

Power Curves

It is often useful to plot a curve that shows the estimated power as a function of sample size, effect size, or standard deviation. Commonly, power is plotted against sample size, with power on the y-axis and sample size on the x-axis. To generate a power curve in R, one can use a few lines of code:

# create a vector with the integers from 2 to 50
n=(2:50)
# determine the length of the vector (how many elements)
np=length(n)
# note that the sample size n is vector valued (has many values)
pwrt=power.t.test(n=n,delta=2,sd=2.82,type="one.sample",alternative="one.sided")
# extract the values of power from the object "pwrt" and plot the graph
plot(n,pwrt$power, xlab="Sample Size", ylab="Power", type="l")

A lot more can be done in R to make your plot look better.
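For example, one refinement (a sketch of my own, with arbitrary effect sizes) is to overlay curves for several candidate effect sizes and mark the conventional 80% power level:

# overlay power curves for several effect sizes on one plot
n <- 2:50
deltas <- c(1, 2, 3)
plot(NULL, xlim = range(n), ylim = c(0, 1), xlab = "Sample Size", ylab = "Power")
for (d in deltas) {
  pw <- power.t.test(n = n, delta = d, sd = 2.82,
                     type = "one.sample", alternative = "one.sided")$power
  lines(n, pw, lty = which(deltas == d))
}
legend("bottomright", legend = paste("delta =", deltas), lty = seq_along(deltas))
abline(h = 0.8, lty = 3)   # conventional 80% power reference line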

IV. Exercises

Calculate power or sample size for the following tests. Use R, the program PS, or another online power calculator (note that PS gives power and sample size values assuming a two-tailed test; to get one-tailed results for a test at the α = 0.05 level, use α = 0.1). Show your work (R syntax if you use R, or the values you entered into PS or an online calculator), state which calculator you used, and interpret the results.

A. Given that the variance in arsenic concentration in drinking water is 8 ppb, what is the power of a test based on 10 samples to determine whether arsenic levels exceed the public health standard of 5 ppb by 2 ppb? Assume that the test is performed at α = 0.05. What sample size would be necessary to have α = β = 0.05? Plot a curve of power versus sample size for this example.

B. An ecologist is contemplating a study of the effects of ice plant on seedling growth in a native plant. Two treatments are anticipated: treatment 1 will examine growth of the native plant at locations where ice plant has not previously grown, and treatment 2 will examine growth of the native plant at locations from which ice plant has been removed. Given a preliminary estimate that plants reach an average height of 35 cm with a variance of 20 cm, what sample size is necessary to detect a 25% reduction in height caused by ice plant with a power of 80%? (Assume that the test is to be performed at the α = 0.05 level.) Would it be better to use an independent-groups t-test or a paired t-test? (Hint: using the same preliminary estimates, calculate the sample size required to achieve the desired power for both a separate-groups and a paired t-test.) What consequence would there be for the independent-groups t-test of using unequal sample sizes? (Hint: use the Iowa calculator to answer this question.)

Perform the power and sample size calculations requested in parts A and B, and turn in a brief write-up of your results.

Further Instructions on Completing the Lab

Take note that the power calculator PS always assumes that you are performing a two-tailed test. Therefore, to perform a test with α = 0.05, you need to enter α = 0.1 in the appropriate box for α. Also, the parameter m in PS is the ratio of the sample sizes in the two treatments when you are performing an independent-groups t-test, and you get different results depending on which sample size you put in the numerator or denominator. Therefore, avoid using PS to address questions about unequal sample sizes; use Russ Lenth's web-based power calculator at the University of Iowa instead.

When doing the problems, read them carefully and extract the information on variability, effect size, Type I error rate, whether the problem calls for a one-tailed test, etc., so that you can enter values in the calculators to compute power or sample size.

When completing your write-up, state which calculator you used and what values you entered so that I can tell where you went wrong. Also, for part B of the exercise, to determine the effect of unequal sample sizes on power, keep the total sample size constant and vary the allocation between treatments. For example, if n1 = 4 and n2 = 4, compare the power to n1 = 6 and n2 = 2, so that the total sample size remains 8.

Links to web-based power and sample size calculators for t-tests: 1-sample tests; 2-sample tests; 1- and 2-sample tests.
