Chapter 8. Interval Estimation


Introduction
We know how to get a point estimate, so this chapter is really just about how to get the margin of error. We move from generating a single point estimate of a parameter to generating an interval estimate of the form: point estimate ± margin of error.

Topics in Chapter 8
1. Interval estimation
   Start with a rare case that is easy to understand: population standard deviation KNOWN.
   Then move to a more realistic case that is slightly more difficult: population standard deviation UNKNOWN.
2. Margin of error and sample sizes
3. Population proportion

Interval for μ with σ known
Recall what we know about the distribution of x̄:
   x̄ is normally distributed
   E(x̄) = μ
   Standard error of x̄ = $\sigma/\sqrt{n}$
We use this information to produce our interval estimate.
Idea: come up with a margin such that 95% of the sampling distribution of x̄ is taken into account. We will use other margins later, but we will start out with 95%.

Interval for μ with σ known
How much area is in the left tail? How much area is in the right tail? What are the z values that cut off those tail areas?
[Figure: sampling distribution of x̄, centered at μ, with the middle 95% lying between μ − margin and μ + margin]

Interval for μ with σ known
If our variable were STANDARD normal, we would be done: our margin of error would be ±1.96. Variables that occur in the real world are not standard normal, so we have to work back up to real-world units. Note that this is the opposite of Chapter 6, where we went from real-world values to z; now we are going from z to real-world values.
How to do it? Scale the z value by the standard error of x̄:
   Margin of error = (some value of z) × (standard error of x̄) = (some value of z) × $\sigma/\sqrt{n}$
To get the interval estimate, all we do is compute x̄ ± margin of error.

Interval for μ with σ known: focus on some value of z
Terminology
We just looked at 95% of the distribution of x̄ being taken into account. Only 5% was left over. We call that leftover part α, pronounced alpha.
The part in the center is therefore 1 − α. 1 − α is called the level of confidence.
The part in the tails is α. The part in each tail individually is α/2.
$z_{\alpha/2}$ is the z value such that the area to its right is α/2.

Interval for μ with σ known
What is α here? What is 1 − α here? How do we scale z so that we get something scaled for x̄?
[Figure: sampling distribution of x̄ with 95% in the center and .025 in each tail; the interval runs from $\mu - 1.96\,\sigma/\sqrt{n}$ to $\mu + 1.96\,\sigma/\sqrt{n}$]

Examples

   α      Level of confidence    Area in each tail    Table value
          1 − α                  α/2                  z(α/2)
   .02    .98                    .0100                2.33
   .05    .95                    .0250                1.96
   .10    .90                    .0500                1.645
   .20    .80                    .1000                1.28
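The table values can also be computed rather than looked up. The following is a minimal sketch using Python's standard library; the helper name `z_alpha_over_2` is made up for illustration and is not from the textbook. It finds the z value with area α/2 to its right by asking for the point with area 1 − α/2 to its left.

```python
from statistics import NormalDist


def z_alpha_over_2(confidence):
    """Return the z value that leaves alpha/2 in the upper tail.

    `confidence` is the level of confidence 1 - alpha, e.g. 0.95.
    (Hypothetical helper name, for illustration only.)
    """
    alpha = 1.0 - confidence
    # inv_cdf takes the area to the LEFT, so ask for 1 - alpha/2.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for conf in (0.98, 0.95, 0.90, 0.80):
    print(f"confidence {conf:.2f}: z = {z_alpha_over_2(conf):.3f}")
```

A printed z table rounds these values, so a table lookup may differ from the computed value in the third decimal place.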

Example
Estimate with 98% confidence the mean gallons of water used per shower for Dallas Cowboys after a game if the true standard deviation is known to be 10 gallons. The sample mean for 16 showers is 30.00 gal.
$\text{m.o.e.} = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = z_{.01}\,\frac{10}{\sqrt{16}} = 2.33 \times \frac{10}{4} = 5.825$ gal.

Example, continued...
98% confidence interval: point estimate ± m.o.e. = 30.00 ± 5.825, or 24.175 to 35.825 gallons.
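The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the function name `z_interval` is invented for illustration, and the z value 2.33 is the rounded table value implied by the slide's margin of error, passed in by hand.

```python
import math


def z_interval(xbar, sigma, n, z):
    """Interval estimate of the mean when sigma is known.

    Returns (lower, upper, margin_of_error).
    (Hypothetical helper, for illustration only.)
    """
    moe = z * sigma / math.sqrt(n)  # z scaled by the standard error of x-bar
    return xbar - moe, xbar + moe, moe


# Shower example: x-bar = 30.00, sigma = 10, n = 16, z(.01) = 2.33 from the table
low, high, moe = z_interval(xbar=30.00, sigma=10.0, n=16, z=2.33)
print(f"m.o.e. = {moe:.3f} gal, interval = {low:.3f} to {high:.3f} gal")
```

Using a z value with more decimal places (about 2.326) would give a slightly narrower interval than the one on the slide.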

Example, confidence interval interpretation
I am 98% confident that the population mean gallons of water used per shower for Dallas Cowboys after a game falls within the interval 24.175 to 35.825 gallons.
An interpretation statement must contain four parts:
1. the amount of confidence
2. the parameter being estimated
3. the population to which we generalize
4. the calculated interval

Interval for μ with σ known: meaning of the confidence interval
We form the interval to say something about the population parameter. So is the population parameter in the interval or not? Can we say anything about the probability that the population parameter is in the interval?
Answer: once we form an interval, we cannot make probability statements about it. The parameter is either in there (p = 1) or not (p = 0).
What we can say is that the procedure leads to (1 − α) of the intervals containing the population parameter.

Questions for Thought
If I were to draw 500 samples and calculate a 95% confidence interval about μ for each of the 500 samples, how many of those confidence intervals would you expect to contain μ?
How is the statistic $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ distributed?
What is the meaning of $z_{\alpha/2}$? How do I find it?
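The first question can be explored by simulation. The sketch below assumes a made-up normal population with μ = 30 and σ = 10 (these values, the seed, and the function name `coverage_count` are illustrative assumptions, not from the textbook). It draws 500 samples of size 16, builds a 95% interval from each, and counts how many intervals contain μ.

```python
import math
import random
from statistics import NormalDist, fmean


def coverage_count(mu, sigma, n, confidence, n_samples, seed=0):
    """Count how many of n_samples z-intervals contain the true mean mu.

    Assumes a normal population with KNOWN sigma (the sigma-known case).
    (Hypothetical helper, for illustration only.)
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    moe = z * sigma / math.sqrt(n)  # same margin for every sample
    hits = 0
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - moe <= mu <= xbar + moe:
            hits += 1
    return hits


hits = coverage_count(mu=30.0, sigma=10.0, n=16, confidence=0.95, n_samples=500)
print(f"{hits} of 500 intervals contain mu")
```

The count varies from run to run with the seed, but it should land near 95% of 500. Each individual interval either contains μ or does not; the 95% describes the procedure.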

Interval for μ with σ UNknown
Moving to the more realistic case: σ is unknown. We have to estimate σ with s, the sample standard deviation. This causes a change in our interval estimate.
Before we used: $\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$
Now we are going to use: $\bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$
Two differences to notice:
   σ is unknown, so we are using s instead.
   We are now using the t table, rather than the z table.
Estimating the standard deviation changed the shape of the distribution!

Comparison of z and t

                         z                                     t
   Bell shaped           YES                                   YES
   Symmetric             YES                                   YES
   Mean                  = 0                                   = 0
   Standard deviation    = 1                                   > 1
   Degrees of freedom    not relevant                          n − 1
   Table values          area to left of particular z-value    actual t-values; areas are listed at the top of the table

As n − 1 increases, the t-distribution becomes the z distribution. Look at degrees of freedom equal to infinity.

What do we do if the true population standard deviation σ is unknown?

General form for margin of error when σ is UNknown:
$\text{m.o.e.} = t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$
where $s_{\bar{x}} = s/\sqrt{n}$ is the estimated standard error of the mean and $t_{\alpha/2,\,n-1}$ is the appropriate t-value from the t-distribution.

Explanation of symbol: $t_{\alpha/2,\,n-1}$ cuts off the top tail at area = α/2.
[Figure: t-distribution centered at 0, with area α/2 to the right of $t_{\alpha/2,\,n-1}$]
Use the t-table to find the value.

t-table (area in the upper tail across the top)

   d.f.    0.1      0.05     0.025    0.01     0.005
   1       3.078    6.314    12.706   31.821   63.656
   2       1.886    2.920    4.303    6.965    9.925
   3       1.638    2.353    3.182    4.541    5.841
   4       1.533    2.132    2.776    3.747    4.604
   5       1.476    2.015    2.571    3.365    4.032
   6       1.440    1.943    2.447    3.143    3.707
   7       1.415    1.895    2.365    2.998    3.499
   8       1.397    1.860    2.306    2.896    3.355
   9       1.383    1.833    2.262    2.821    3.250
   10      1.372    1.812    2.228    2.764    3.169
   11      1.363    1.796    2.201    2.718    3.106
   12      1.356    1.782    2.179    2.681    3.055
   13      1.350    1.771    2.160    2.650    3.012
   14      1.345    1.761    2.145    2.624    2.977
   15      1.341    1.753    2.131    2.602    2.947
   16      1.337    1.746    2.120    2.583    2.921
   17      1.333    1.740    2.110    2.567    2.898
   18      1.330    1.734    2.101    2.552    2.878
   19      1.328    1.729    2.093    2.539    2.861
   20      1.325    1.725    2.086    2.528    2.845
   21      1.323    1.721    2.080    2.518    2.831
   22      1.321    1.717    2.074    2.508    2.819
   23      1.319    1.714    2.069    2.500    2.807
   24      1.318    1.711    2.064    2.492    2.797
   25      1.316    1.708    2.060    2.485    2.787
   26      1.315    1.706    2.056    2.479    2.779
   27      1.314    1.703    2.052    2.473    2.771
   28      1.313    1.701    2.048    2.467    2.763
   29      1.311    1.699    2.045    2.462    2.756
   30      1.310    1.697    2.042    2.457    2.750
   31      1.309    1.696    2.040    2.453    2.744
   32      1.309    1.694    2.037    2.449    2.738
   33      1.308    1.692    2.035    2.445    2.733
   34      1.307    1.691    2.032    2.441    2.728
   35      1.306    1.690    2.030    2.438    2.724
   36      1.306    1.688    2.028    2.434    2.719
   37      1.305    1.687    2.026    2.431    2.715
   38      1.304    1.686    2.024    2.429    2.712
   39      1.304    1.685    2.023    2.426    2.708
   40      1.303    1.684    2.021    2.423    2.704
   41      1.303    1.683    2.020    2.421    2.701
   42      1.302    1.682    2.018    2.418    2.698
   43      1.302    1.681    2.017    2.416    2.695
   44      1.301    1.680    2.015    2.414    2.692
   45      1.301    1.679    2.014    2.412    2.690
   ...
   97      1.290    1.661    1.985    2.365    2.627
   98      1.290    1.661    1.984    2.365    2.627
   99      1.290    1.660    1.984    2.365    2.626
   100     1.290    1.660    1.984    2.364    2.626
   ∞       1.282    1.645    1.960    2.326    2.576

Want a 95% CI with n = 20: α/2 = .025, d.f. = 19, t = 2.093.

Example 2
Estimate with 98% confidence the mean gallons of water used per shower for Dallas Cowboys after a game. The sample mean of 16 showers is 30.00 gallons and the sample standard deviation is 10.4 gallons.
$\text{m.o.e.} = t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} = t_{.01,\,15}\,\frac{10.4}{\sqrt{16}} = 2.602 \times \frac{10.4}{4} = 6.765$ gal.

Example 2, continued...
98% confidence interval: point estimate ± m.o.e. = 30.00 ± 6.765, or 23.235 to 36.765 gallons.
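Example 2 can be reproduced the same way. Python's standard library has no t-distribution, so this sketch takes the t value straight from the printed table (upper-tail area .01, 15 degrees of freedom) instead of computing it. The function name `t_interval` is invented for illustration.

```python
import math

# t value read from the t-table: upper-tail area .01, d.f. = 16 - 1 = 15
T_01_15 = 2.602


def t_interval(xbar, s, n, t_value):
    """Interval estimate of the mean when sigma is estimated by s.

    Returns (lower, upper, margin_of_error).
    (Hypothetical helper, for illustration only.)
    """
    moe = t_value * s / math.sqrt(n)  # t scaled by the ESTIMATED standard error
    return xbar - moe, xbar + moe, moe


low, high, moe = t_interval(xbar=30.00, s=10.4, n=16, t_value=T_01_15)
print(f"m.o.e. = {moe:.3f} gal, interval = {low:.3f} to {high:.3f} gal")
```

Compared with the σ-known example, the interval is wider for two reasons: the t value (2.602) is larger than the z value (2.33), and s (10.4) is slightly larger than the σ used there (10).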

Example 2, confidence interval interpretation
I am 98% confident that the population mean gallons of water used per shower for Dallas Cowboys after a game falls within the interval 23.235 to 36.765 gallons.
A statement must contain four parts:
1. the amount of confidence
2. the parameter being estimated
3. the population to which we generalize
4. the calculated interval

Confidence vs. Probability
BEFORE a sample is collected, there is a 95% probability that the sample mean still to be computed will fall within m.o.e. units of μ.
AFTER the sample is collected, the computed sample mean either fell within m.o.e. units of μ, or it did not. After the event, it does not make sense to talk about probability.
Analogy: suppose you own 95 tickets in a 100-ticket lottery. The drawing was held one hour ago, but you don't know the result. P(win) = 0 or 1, but you are very CONFIDENT that you have won the lottery.

What to use for a confidence interval about the mean, z or t? How to decide?
   For standard deviation known, use z.
   For standard deviation estimated from the sample, use t.

Sample size and margin of error for the population mean
So far we have worked on finding confidence intervals after a sample has been drawn.
Today: work on something BEFORE the sample is drawn. What sample size should I use to get a particular margin of error?
$\text{m.o.e.} = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$

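Sample size and margin of error, population mean
$\text{m.o.e.} = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$
Call the m.o.e. E:
$E = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$
Now solve for n. Rearranging gives $\sqrt{n} = z_{\alpha/2}\,\sigma / E$, and squaring both sides gives
$n = \frac{z_{\alpha/2}^{2}\,\sigma^{2}}{E^{2}}$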

Sample size and margin of error, population mean
We can think of σ as coming from historical data or our best estimate.
$n = \frac{z_{\alpha/2}^{2}\,\sigma^{2}}{E^{2}}$

Example
What sample size is needed to estimate the mean mpg of Toyota Camrys with a margin of error of .2 mpg at 90% confidence if the historical standard deviation is .88 mpg?
First, notice we are talking about a mean (not a proportion). This helps us to choose the correct formula.
Next, find the appropriate z value: for 90% confidence, $z_{.05} = 1.645$.
Finally, plug and chug:
$n = \frac{1.645^{2} \times 0.88^{2}}{0.2^{2}} = 52.39$
Round up: n = 53.
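The sample-size formula for a mean can be sketched in Python as follows. The function name `sample_size_for_mean` is an illustrative assumption. The code computes z with the standard library instead of the rounded table value 1.645, so the unrounded result differs from the slide's in the second decimal place. The ceiling is taken because a sample size must be a whole number and rounding down would miss the target margin.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence):
    """Sample size needed for a given margin of error E on a mean.

    Implements n = z^2 * sigma^2 / E^2 and rounds UP.
    Returns (unrounded n, required whole-number n).
    (Hypothetical helper, for illustration only.)
    """
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    n_exact = (z * sigma / margin) ** 2
    return n_exact, math.ceil(n_exact)


# Camry example: sigma = .88 mpg (historical), E = .2 mpg, 90% confidence
n_exact, n_needed = sample_size_for_mean(sigma=0.88, margin=0.2, confidence=0.90)
print(f"unrounded n = {n_exact:.2f}, sample size needed = {n_needed}")
```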

Confidence Intervals for the Population Proportion
Examples:
   Proportion of people who were laid off this year
   Proportion of college graduates in Ogden
   Proportion of skiers who are from out of state
The point estimate of the population proportion p is the sample proportion p̄.
Our interval estimate is going to be p̄ ± margin of error.
In order to form the margin of error we need to know the shape of the distribution of p̄. For large samples, p̄ is approximately normally distributed, so we use the z tables.

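Confidence Intervals for the Population Proportion: focus on the margin of error
Margin of error = $z_{\alpha/2}$ × (some measure of standard error) = $z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$
Confidence interval: $\bar{p} \pm z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$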

Example
The governor will spend more money convincing voters of a new program if he finds that fewer than 50% of voters currently support it. In a telephone survey of 200 randomly selected voters, 82 say they support the proposed program.
Construct a 95% confidence interval for the proportion of ALL voters who support the program. Should the governor spend more money convincing voters?

Example, continued
$\bar{p}$ = sample proportion = 82/200 = .41
$\bar{p} \pm z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = .41 \pm 1.96\sqrt{\frac{.41(.59)}{200}} = .41 \pm .068 = (.342,\ .478)$

Example, continued
What can we conclude? The CI is .342 to .478. The value .50 is NOT in the CI, so .50 is not a plausible value for the population proportion. The data indicate that less than .5 of the voters support the proposed program; therefore the governor should spend more on promotion.
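The governor example can be checked with a short sketch. The function name `proportion_interval` is made up for illustration, and the code relies on the large-sample normal approximation stated earlier. It computes z with the standard library, which agrees with the table value 1.96 to two decimal places.

```python
import math
from statistics import NormalDist


def proportion_interval(successes, n, confidence):
    """Large-sample interval estimate of a population proportion.

    Returns (p_bar, margin_of_error, lower, upper).
    (Hypothetical helper, for illustration only.)
    """
    p_bar = successes / n
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    moe = z * math.sqrt(p_bar * (1.0 - p_bar) / n)
    return p_bar, moe, p_bar - moe, p_bar + moe


# 82 of 200 surveyed voters support the program; 95% confidence
p_bar, moe, low, high = proportion_interval(successes=82, n=200, confidence=0.95)
print(f"p-bar = {p_bar:.2f}, m.o.e. = {moe:.3f}, interval = ({low:.3f}, {high:.3f})")

# Decision rule from the example: is .50 a plausible value?
print("0.50 inside interval:", low <= 0.50 <= high)
```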

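Sample size and margin of error for the population proportion
Start with the margin of error E and solve for the sample size n:
$E = z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$, so $n = \frac{z_{\alpha/2}^{2}\,\bar{p}(1-\bar{p})}{E^{2}}$
It does not make sense to talk about the sample proportion p̄ before taking a sample.
Solution: make an educated guess of the sample proportion and put it in the formula. The book calls the educated guess p*:
$n = \frac{z_{\alpha/2}^{2}\,p^{*}(1-p^{*})}{E^{2}}$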

Example
In a survey, the planning value for the population proportion is p* = .35. How large a sample should be taken to provide a 95% confidence interval with a margin of error of .05?
$n = \frac{z_{\alpha/2}^{2}\,p^{*}(1-p^{*})}{E^{2}} = \frac{1.96^{2} \times .35(.65)}{.05^{2}} = 349.59$
Round up: n = 350.
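The same calculation as a minimal Python sketch follows. The function name `sample_size_for_proportion` is an illustrative assumption, and `p_star` stands for the planning value p*. Because z is computed with the standard library rather than rounded to 1.96, the unrounded n differs slightly from 349.59, but the rounded-up answer is the same.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p_star, margin, confidence):
    """Sample size needed for a given margin of error E on a proportion.

    Implements n = z^2 * p*(1 - p*) / E^2 and rounds UP.
    Returns (unrounded n, required whole-number n).
    (Hypothetical helper, for illustration only.)
    """
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    n_exact = z ** 2 * p_star * (1.0 - p_star) / margin ** 2
    return n_exact, math.ceil(n_exact)


# Planning value p* = .35, E = .05, 95% confidence
n_exact, n_needed = sample_size_for_proportion(p_star=0.35, margin=0.05, confidence=0.95)
print(f"unrounded n = {n_exact:.2f}, sample size needed = {n_needed}")
```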