Chapter 8. Interval Estimation



Introduction
Move from generating a single point estimate of a parameter to generating an interval estimate. We already know how to get the point estimate, so this chapter is really just about how to get the margin of error.

Topics in Chapter 8
1. Interval estimation of the population mean. Start with a rare case that is easy to understand: population standard deviation KNOWN. Then move to a more realistic case that is slightly more difficult: population standard deviation UNKNOWN.
2. Margin of error and sample sizes
3. Population proportion

Interval for $\mu$ with $\sigma$ known
Recall what we know about the sampling distribution of $\bar{x}$:
$\bar{x}$ is normally distributed
$E(\bar{x}) = \mu$
Standard error of $\bar{x}$ = $\sigma / \sqrt{n}$
We use this information to produce our interval estimate.
Idea: come up with a margin such that 95% of the distribution of $\bar{x}$ is taken into account. We will use other margins later, but we will start out with 95%.

Interval for $\mu$ with $\sigma$ known
How much area is in the left tail? How much area is in the right tail? What are the z values such that the area in each tail is 0.025?
[Figure: sampling distribution of $\bar{x}$, centered at $\mu$, with the central 95% lying between $\mu$ - margin and $\mu$ + margin.]

Interval for $\mu$ with $\sigma$ known
If our variable were STANDARD normal, we would be done: our margin of error would be $\pm 1.96$. Variables that occur in the real world are not STANDARD normal, so we have to work back up to real-world variables. Note that this is the opposite of Chapter 6, where we went from real world to z. Now we are going from z to real world.
How to do it? Scale the z value by the standard error of $\bar{x}$:
Margin of error = (some value of z) * (standard error of $\bar{x}$) = (some value of z) * $\sigma / \sqrt{n}$
To get the interval estimate, all we do is $\bar{x} \pm$ margin of error.

Interval for $\mu$ with $\sigma$ known
Focus on some value of z. Terminology:
We just looked at 95% of the distribution of $\bar{x}$ being taken into account. Only 5% was left over. We call that leftover part $\alpha$, pronounced alpha.
The part in the center is therefore $1 - \alpha$; $1 - \alpha$ is called the level of confidence.
The part in the tails is $\alpha$. The part in each tail individually is $\alpha/2$.
$z_{\alpha/2}$ is the z value such that the area to its right is $\alpha/2$.

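Interval for $\mu$ with $\sigma$ known
What is $\alpha$ here? What is $1 - \alpha$ here? How do we scale z so that we get something scaled for $\bar{x}$?
[Figure: sampling distribution of $\bar{x}$ with the central 95% shaded and 0.025 in each tail; the interval runs from $\mu - 1.96\,\sigma/\sqrt{n}$ to $\mu + 1.96\,\sigma/\sqrt{n}$.]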

Examples
$\alpha$    Level of confidence ($1 - \alpha$)    Area in each tail ($\alpha/2$)    Table value ($z_{\alpha/2}$)
.02         .98                                   .0100                             2.33
.05         .95                                   .0250                             1.96
.10         .90                                   .0500                             1.645
.20         .80                                   .1000                             1.28
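The table values can also be computed rather than looked up. The sketch below is only an illustration, not part of the original slides: it uses Python's standard-library `statistics.NormalDist`, and the helper name `z_critical` is made up for this example. It finds the z value whose right-tail area is $\alpha/2$ for each level of confidence in the table.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Return z_{alpha/2}: the z value with right-tail area alpha/2.

    `z_critical` is a hypothetical helper name used only for this sketch.
    """
    alpha = 1.0 - confidence
    # inv_cdf takes the area to the LEFT, so ask for 1 - alpha/2.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


# Print alpha, level of confidence, area in each tail, and z_{alpha/2}.
for confidence in (0.98, 0.95, 0.90, 0.80):
    alpha = 1.0 - confidence
    z = z_critical(confidence)
    print(f"{alpha:.2f}  {confidence:.2f}  {alpha / 2:.4f}  {z:.3f}")
```

The computed values carry more decimals than a printed z table (for example 2.326 rather than 2.33), so results based on them can differ slightly from hand calculations.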

Example
Estimate with 98% confidence the mean gallons of water used per shower for Dallas Cowboys after a game if the true standard deviation is known to be 10 gallons. The sample mean for 16 showers is 30.00 gal.
$$\text{m.o.e.} = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = z_{.01}\,\frac{10}{\sqrt{16}} = 2.33 \cdot \frac{10}{4} = 5.825 \text{ gal.}$$

Example, continued
98% confidence interval: point estimate $\pm$ m.o.e. = $30.00 \pm 5.825$, or 24.175 to 35.825 gallons.
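As a rough check on the arithmetic in this example, here is a minimal Python sketch of the known-$\sigma$ interval. The function name `z_interval` is an assumption made for illustration. Because it uses the unrounded z value (about 2.326) instead of the table value 2.33, its margin of error comes out near 5.816 rather than 5.825.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known.

    Hypothetical helper for this sketch; returns (low, high, margin).
    """
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{alpha/2}
    margin = z * sigma / sqrt(n)                 # z times the standard error
    return xbar - margin, xbar + margin, margin


# Shower example: x-bar = 30.00, sigma = 10, n = 16, 98% confidence.
low, high, margin = z_interval(xbar=30.0, sigma=10.0, n=16, confidence=0.98)
print(f"m.o.e. = {margin:.3f} gal, interval = {low:.3f} to {high:.3f} gal")
```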

Example, confidence interval interpretation
I am 98% confident that the population mean gallons of water used per shower for Dallas Cowboys after a game falls within the interval 24.175 to 35.825 gallons.
A statement must contain four parts:
1. the amount of confidence
2. the parameter being estimated
3. the population to which we generalize
4. the calculated interval

Interval for $\mu$ with $\sigma$ known: meaning of the confidence interval
We form the interval to say something about the population parameter. So is the population parameter in the interval or not? Can we say anything about the probability that the population parameter is in the interval?
Answer: once we form an interval, we cannot make probability statements about it. The parameter is either in there (p = 1) or not (p = 0). What we can say is that the procedure leads to a proportion $(1 - \alpha)$ of the intervals containing the population parameter.

Questions for Thought
If I were to draw 500 samples and calculate a 95% confidence interval about $\mu$ for each of the 500 samples, how many of those confidence intervals would you expect to contain $\mu$?
How is the statistic $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ distributed?
What is the meaning of $z_{\alpha/2}$? How do I find it?
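The first question can be explored with a small simulation. The sketch below is only an illustration under made-up assumptions: the population is normal with $\mu = 50$ and $\sigma = 10$, each sample has n = 25, and the function name `count_covering_intervals` is hypothetical. It draws 500 samples, builds a 95% interval from each, and counts how many contain $\mu$.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def count_covering_intervals(num_samples=500, n=25, mu=50.0, sigma=10.0,
                             confidence=0.95, seed=0):
    """Count how many known-sigma intervals contain the true mean.

    All default values are illustrative assumptions, not from the slides.
    """
    rng = random.Random(seed)
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin = z * sigma / sqrt(n)  # same margin for every sample (sigma known)
    covered = 0
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = mean(sample)
        if xbar - margin <= mu <= xbar + margin:
            covered += 1
    return covered


print(count_covering_intervals(), "of 500 intervals contain mu")
```

The count varies from run to run with the seed, but it should land near 0.95 * 500 = 475, which is the sense in which the procedure, not any single interval, carries the 95%.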

Interval for $\mu$ with $\sigma$ UNknown
Moving to the more realistic case: $\sigma$ is unknown. We have to estimate $\sigma$ with s, the sample standard deviation. This causes a change in our interval estimate.
Before we used $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$. Now we are going to use $\bar{x} \pm t_{\alpha/2,\,n-1}\,s/\sqrt{n}$.
Two differences to notice:
1. $\sigma$ is unknown. We are using s instead.
2. We are now using the t table, rather than the z table.
Estimating the standard deviation changed the shape of the distribution!

Comparison of z and t
                          z                                       t
Bell shaped, symmetric    YES                                     YES
Mean                      = 0                                     = 0
Standard deviation        = 1                                     > 1
Degrees of freedom        not relevant                            n - 1
Table values              area to left of a particular z-value    actual t-values (areas are listed at the top of the table)
As n - 1 increases, the t-distribution becomes the z distribution. Look at degrees of freedom equal to infinity.

What do we do if the true population standard deviation s is unknown?

General form for the margin of error when $\sigma$ is UNknown:
$$\text{m.o.e.} = t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$$
where $s/\sqrt{n} = s_{\bar{x}}$ is the estimated standard error of the mean and $t_{\alpha/2,\,n-1}$ is the appropriate t-value from the t-distribution.

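Explanation of symbol: $t_{\alpha/2,\,n-1}$ cuts off the upper tail of the t-distribution at area $\alpha/2$. Use the t-table to find the value.
[Figure: t-distribution centered at 0, with right-tail area $\alpha/2$ beyond $t_{\alpha/2,\,n-1}$.]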

t-table (column headings are right-tail areas)
d.f.    0.1      0.05     0.025    0.01     0.005
1       3.078    6.314    12.706   31.821   63.656
2       1.886    2.920    4.303    6.965    9.925
3       1.638    2.353    3.182    4.541    5.841
4       1.533    2.132    2.776    3.747    4.604
5       1.476    2.015    2.571    3.365    4.032
6       1.440    1.943    2.447    3.143    3.707
7       1.415    1.895    2.365    2.998    3.499
8       1.397    1.860    2.306    2.896    3.355
9       1.383    1.833    2.262    2.821    3.250
10      1.372    1.812    2.228    2.764    3.169
11      1.363    1.796    2.201    2.718    3.106
12      1.356    1.782    2.179    2.681    3.055
13      1.350    1.771    2.160    2.650    3.012
14      1.345    1.761    2.145    2.624    2.977
15      1.341    1.753    2.131    2.602    2.947
16      1.337    1.746    2.120    2.583    2.921
17      1.333    1.740    2.110    2.567    2.898
18      1.330    1.734    2.101    2.552    2.878
19      1.328    1.729    2.093    2.539    2.861
20      1.325    1.725    2.086    2.528    2.845
21      1.323    1.721    2.080    2.518    2.831
22      1.321    1.717    2.074    2.508    2.819
23      1.319    1.714    2.069    2.500    2.807
24      1.318    1.711    2.064    2.492    2.797
25      1.316    1.708    2.060    2.485    2.787
26      1.315    1.706    2.056    2.479    2.779
27      1.314    1.703    2.052    2.473    2.771
28      1.313    1.701    2.048    2.467    2.763
29      1.311    1.699    2.045    2.462    2.756
30      1.310    1.697    2.042    2.457    2.750
31      1.309    1.696    2.040    2.453    2.744
32      1.309    1.694    2.037    2.449    2.738
33      1.308    1.692    2.035    2.445    2.733
34      1.307    1.691    2.032    2.441    2.728
35      1.306    1.690    2.030    2.438    2.724
36      1.306    1.688    2.028    2.434    2.719
37      1.305    1.687    2.026    2.431    2.715
38      1.304    1.686    2.024    2.429    2.712
39      1.304    1.685    2.023    2.426    2.708
40      1.303    1.684    2.021    2.423    2.704
41      1.303    1.683    2.020    2.421    2.701
42      1.302    1.682    2.018    2.418    2.698
43      1.302    1.681    2.017    2.416    2.695
44      1.301    1.680    2.015    2.414    2.692
45      1.301    1.679    2.014    2.412    2.690
97      1.290    1.661    1.985    2.365    2.627
98      1.290    1.661    1.984    2.365    2.627
99      1.290    1.660    1.984    2.365    2.626
100     1.290    1.660    1.984    2.364    2.626
∞       1.282    1.645    1.960    2.326    2.576
Want 95% CI, n = 20: $\alpha/2$ = 0.025, d.f. = 19, t = 2.093.

Want 98% CI, n = 33: $\alpha/2$ = 0.01, d.f. = 32, t = 2.449 (from the t-table above).

Want 90% CI, n = 600: $\alpha/2$ = 0.05, d.f. = 599. The t-table above stops at d.f. = 100, so use the ∞ row: t ≈ 1.645.
The table gives right-tail areas (e.g., for a right-tail area of 0.025 and d.f. = 15, the t value is 2.131).

Example 2
Estimate with 98% confidence the mean gallons of water used per shower for Dallas Cowboys after a game. The sample mean of 16 showers is 30.00 gallons and the sample standard deviation is 10.4 gallons.
$$\text{m.o.e.} = t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} = t_{.01,\,15}\,\frac{10.4}{\sqrt{16}} = 2.602 \cdot \frac{10.4}{4} = 6.765 \text{ gal.}$$

Example 2, continued
98% confidence interval: point estimate $\pm$ m.o.e. = $30.00 \pm 6.765$, or 23.235 to 36.765 gallons.
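The same calculation can be sketched in Python. The standard library has no t-distribution, so this illustration takes the t value as an input read from the t-table (2.602 for right-tail area 0.01 and 15 degrees of freedom). The function name `t_interval` is an assumption made for this example.

```python
from math import sqrt


def t_interval(xbar, s, n, t_value):
    """Confidence interval for the mean when sigma is estimated by s.

    Hypothetical helper for this sketch. `t_value` must be looked up in a
    t-table for right-tail area alpha/2 and n - 1 degrees of freedom.
    """
    margin = t_value * s / sqrt(n)  # t times the estimated standard error
    return xbar - margin, xbar + margin, margin


# Example 2: x-bar = 30.00, s = 10.4, n = 16, t_{.01, 15} = 2.602.
low, high, margin = t_interval(xbar=30.0, s=10.4, n=16, t_value=2.602)
print(f"m.o.e. = {margin:.3f} gal, interval = {low:.3f} to {high:.3f} gal")
```

Compared with the known-$\sigma$ example, the interval is wider both because s = 10.4 is larger than 10 and because the t value 2.602 is larger than the corresponding z value.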

Example 2, interpretation
I am 98% confident that the population mean gallons of water used per shower for Dallas Cowboys after a game falls within the interval 23.235 to 36.765 gallons.
A statement must contain four parts:
1. the amount of confidence
2. the parameter being estimated
3. the population to which we generalize
4. the calculated interval

Confidence vs. Probability
BEFORE a sample is collected, there is a 95% probability that the yet-to-be-computed sample mean $\bar{x}$ will fall within m.o.e. units of $\mu$.
AFTER the sample is collected, the computed sample mean either fell within m.o.e. units of $\mu$ or it did not. After the event, it does not make sense to talk about probability.
Analogy: Suppose you own 95 tickets in a 100-ticket lottery. The drawing was held one hour ago, but you don't know the result. P(win) = 0 or 1, but you are very CONFIDENT that you have won the lottery.

What to use for a confidence interval about the mean, z or t? How to decide?
For standard deviation known, use z.
For standard deviation estimated from the sample, use t.

Sample size and margin of error for the population mean
So far we have worked on finding confidence intervals after a sample has been drawn. Today: work on something BEFORE the sample is drawn. What sample size should I use to get a particular margin of error?
$$\text{m.o.e.} = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

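Sample size and margin of error for the population mean
$$\text{m.o.e.} = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
Call the m.o.e. E:
$$E = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
Now solve for n:
$$n = \frac{z_{\alpha/2}^{2}\,\sigma^{2}}{E^{2}}$$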

Sample size and margin of error for the population mean
We can think of $\sigma$ as coming from historical data or our best estimate.
$$n = \frac{z_{\alpha/2}^{2}\,\sigma^{2}}{E^{2}}$$

Example
What sample size is needed to estimate the mean mpg of Toyota Camrys with a margin of error of 0.2 mpg at 90% confidence if the historical standard deviation is 0.88 mpg?
First, notice we are talking about a mean (not a proportion). This helps us to choose the correct formula. Next, find the appropriate z value. Finally, plug and chug.
$$n = \frac{1.645^{2} \cdot 0.88^{2}}{0.2^{2}} = 52.39$$
Round up: n = 53.
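A minimal Python sketch of the sample-size formula for a mean follows. The function name `sample_size_for_mean` is hypothetical, and `sigma` is the planning value from historical data. It rounds up because a fractional observation is not possible and rounding down would give a margin of error slightly larger than requested.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= margin.

    Hypothetical helper for this sketch; sigma is a planning value.
    """
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    n_exact = (z * sigma / margin) ** 2  # n = z^2 * sigma^2 / E^2
    return ceil(n_exact)


# Camry example: sigma = 0.88 mpg, E = 0.2 mpg, 90% confidence.
print(sample_size_for_mean(sigma=0.88, margin=0.2, confidence=0.90))
```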

Confidence Intervals for the Population Proportion
Examples:
Proportion of people who were laid off this year
Proportion of college graduates in Ogden
Proportion of skiers who are from out of state
The point estimate of the population proportion is $\bar{p}$, the sample proportion. Our interval estimate is going to be $\bar{p} \pm$ margin of error.
In order to form the margin of error we need to know the shape of $\bar{p}$, that is, its distribution. For large samples $\bar{p}$ is normally distributed, so use z tables.

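Confidence Intervals for the Population Proportion
Focus on the margin of error: $z_{\alpha/2}$ * (some measure of standard error)
$$\text{m.o.e.} = z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$$
The confidence interval is:
$$\bar{p} \pm z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$$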

Example
The governor will spend more money convincing voters of a new program if he finds that fewer than 50% of voters currently support it. In a telephone survey of 200 randomly selected voters, 82 say they support the proposed program. Construct a 95% confidence interval for the proportion of ALL voters who support the program. Should the governor spend more money convincing voters?

Example, continued
$\bar{p}$ = sample proportion = 82/200 = .41
$$\bar{p} \pm z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = .41 \pm 1.96\sqrt{\frac{.41(.59)}{200}} = .41 \pm .068 = (.342,\ .478)$$

Example, continued
What can we conclude? The CI is .342 to .478. The value .50 is NOT in the CI, and the whole interval lies below it, so .50 is not a plausible value. We conclude that fewer than half of the voters support the proposed program; therefore the governor should spend more on promotion.
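The governor example can be reproduced with a short Python sketch. The function name `proportion_interval` is an assumption made for illustration, and the calculation relies on the large-sample normal approximation described above.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence):
    """Large-sample confidence interval for a population proportion.

    Hypothetical helper for this sketch; returns (low, high, margin).
    """
    p_bar = successes / n
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin = z * sqrt(p_bar * (1.0 - p_bar) / n)  # z times the standard error
    return p_bar - margin, p_bar + margin, margin


# Governor example: 82 supporters out of 200 voters, 95% confidence.
low, high, margin = proportion_interval(successes=82, n=200, confidence=0.95)
print(f"{low:.3f} to {high:.3f}")
print("0.50 inside the interval?", low <= 0.50 <= high)
```

The final check mirrors the decision rule in the example: if .50 lies above the whole interval, it is not a plausible value for the population proportion.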

Sample size and margin of error for the population proportion
Start with the error and solve for the sample size:
$$E = z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} \quad\Longrightarrow\quad n = \frac{z_{\alpha/2}^{2}\,\bar{p}(1-\bar{p})}{E^{2}}$$
It does not make sense to talk about the sample proportion $\bar{p}$ before taking a sample. Solution: make an educated guess of the sample proportion and put it in the formula. The book calls the educated guess $p^{*}$.

Example
In a survey, the planning value for the population proportion is $p^{*}$ = .35. How large a sample should be taken to provide a 95% confidence interval with a margin of error of .05?
$$n = \frac{1.96^{2} \cdot .35(.65)}{.05^{2}} = 349.59$$
Round up: n = 350.
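A minimal Python sketch of the sample-size formula for a proportion, using the planning value $p^{*}$ in place of $\bar{p}$, is shown below. The function name `sample_size_for_proportion` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(p_star, margin, confidence):
    """Smallest n giving the requested margin of error for a proportion.

    Hypothetical helper for this sketch; p_star is the planning value.
    """
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    n_exact = z ** 2 * p_star * (1.0 - p_star) / margin ** 2
    return ceil(n_exact)  # always round up


# Survey example: p* = .35, E = .05, 95% confidence.
print(sample_size_for_proportion(p_star=0.35, margin=0.05, confidence=0.95))
```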