Chapter 2 Modeling Distributions of Data

Similar documents
CHAPTER 2 Modeling Distributions of Data

Chapter 2: Modeling Distributions of Data

CHAPTER 2 Modeling Distributions of Data

CHAPTER 2 Modeling Distributions of Data

Name: Date: Period: Chapter 2. Section 1: Describing Location in a Distribution

Lecture 3 Questions that we should be able to answer by the end of this lecture:

Lecture 3 Questions that we should be able to answer by the end of this lecture:

Section 2.2 Normal Distributions

Key: 5 9 represents a team with 59 wins. (c) The Kansas City Royals and Cleveland Indians, who both won 65 games.

STA Module 4 The Normal Distribution

STA /25/12. Module 4 The Normal Distribution. Learning Objectives. Let s Look at Some Examples of Normal Curves

Section 2.2 Normal Distributions. Normal Distributions

Chapter 2: The Normal Distributions

Lecture 6: Chapter 6 Summary

Density Curve (p52) Density curve is a curve that - is always on or above the horizontal axis.

Chapter 2: The Normal Distribution

Chapter 5: The standard deviation as a ruler and the normal model p131

appstats6.notebook September 27, 2016

UNIT 1A EXPLORING UNIVARIATE DATA

STA Rev. F Learning Objectives. Learning Objectives (Cont.) Module 3 Descriptive Measures

MAT 102 Introduction to Statistics Chapter 6. Chapter 6 Continuous Probability Distributions and the Normal Distribution

Student Learning Objectives

Chapter 6. THE NORMAL DISTRIBUTION

Chapter 6. THE NORMAL DISTRIBUTION

AP Statistics. Study Guide

CHAPTER 2: Describing Location in a Distribution

Vocabulary. 5-number summary Rule. Area principle. Bar chart. Boxplot. Categorical data condition. Categorical variable.

Chapter 3 - Displaying and Summarizing Quantitative Data

Introduction to the Practice of Statistics Fifth Edition Moore, McCabe

Chapter 6: DESCRIPTIVE STATISTICS

IT 403 Practice Problems (1-2) Answers

Normal Distribution. 6.4 Applications of Normal Distribution

Chapter 1. Looking at Data-Distribution

No. of blue jelly beans No. of bags

Normal Data ID1050 Quantitative & Qualitative Reasoning

Chapter 6 Normal Probability Distributions

CHAPTER 1. Introduction. Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data.

Learning Objectives. Continuous Random Variables & The Normal Probability Distribution. Continuous Random Variable

Unit 7 Statistics. AFM Mrs. Valentine. 7.1 Samples and Surveys

Prepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order.

Chapter 5snow year.notebook March 15, 2018

STA Module 2B Organizing Data and Comparing Distributions (Part II)

STA Learning Objectives. Learning Objectives (cont.) Module 2B Organizing Data and Comparing Distributions (Part II)

6-1 THE STANDARD NORMAL DISTRIBUTION

10.4 Measures of Central Tendency and Variation

10.4 Measures of Central Tendency and Variation

Unit 5: Estimating with Confidence

Section 1.2. Displaying Quantitative Data with Graphs. Mrs. Daniel AP Stats 8/22/2013. Dotplots. How to Make a Dotplot. Mrs. Daniel AP Statistics

Stat 528 (Autumn 2008) Density Curves and the Normal Distribution. Measures of center and spread. Features of the normal distribution

Statistical Methods. Instructor: Lingsong Zhang. Any questions, ask me during the office hour, or me, I will answer promptly.

The Normal Distribution

Data Analysis & Probability

CHAPTER 2: SAMPLING AND DATA

DAY 52 BOX-AND-WHISKER

Chapter 2. Descriptive Statistics: Organizing, Displaying and Summarizing Data

MAT 110 WORKSHOP. Updated Fall 2018

15 Wyner Statistics Fall 2013

a. divided by the. 1) Always round!! a) Even if class width comes out to a, go up one.

Section 10.4 Normal Distributions

STP 226 ELEMENTARY STATISTICS NOTES PART 2 - DESCRIPTIVE STATISTICS CHAPTER 3 DESCRIPTIVE MEASURES

The Normal Distribution

Name Date Types of Graphs and Creating Graphs Notes

Averages and Variation

The first few questions on this worksheet will deal with measures of central tendency. These data types tell us where the center of the data set lies.

Chapter 2 Describing, Exploring, and Comparing Data

How individual data points are positioned within a data set.

MATH& 146 Lesson 10. Section 1.6 Graphing Numerical Data

Density Curves Sections

So..to be able to make comparisons possible, we need to compare them with their respective distributions.

Chapter 5. Understanding and Comparing Distributions. Copyright 2012, 2008, 2005 Pearson Education, Inc.

Homework Packet Week #3

AP Statistics Prerequisite Packet

Basic Statistical Terms and Definitions

Measures of Dispersion

Chapter 3: Data Description - Part 3. Homework: Exercises 1-21 odd, odd, odd, 107, 109, 118, 119, 120, odd

2.1: Frequency Distributions and Their Graphs

Chapter 2: Statistical Models for Distributions

MATH NATION SECTION 9 H.M.H. RESOURCES

Measures of Position

CHAPTER 2 DESCRIPTIVE STATISTICS

Understanding and Comparing Distributions. Chapter 4

4.3 The Normal Distribution

Frequency Distributions

AP Statistics Summer Assignment:

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 2.1- #

Statistics Lecture 6. Looking at data one variable

3 Graphical Displays of Data

The Normal Distribution

BIOL Gradation of a histogram (a) into the normal curve (b)

CHAPTER 3: Data Description

Chapter 6 The Standard Deviation as Ruler and the Normal Model

Measures of Central Tendency

Probability and Statistics. Copyright Cengage Learning. All rights reserved.

MATH 1070 Introductory Statistics Lecture notes Descriptive Statistics and Graphical Representation

Learning Log Title: CHAPTER 7: PROPORTIONS AND PERCENTS. Date: Lesson: Chapter 7: Proportions and Percents

Table of Contents (As covered from textbook)

Chapter2 Description of samples and populations. 2.1 Introduction.

Measures of Central Tendency. A measure of central tendency is a value used to represent the typical or average value in a data set.

+ Statistical Methods in

2.1 Objectives. Math Chapter 2. Chapter 2. Variable. Categorical Variable EXPLORING DATA WITH GRAPHS AND NUMERICAL SUMMARIES

Transcription:

Chapter 2 Modeling Distributions of Data Section 2.1 Describing Location in a Distribution Describing Location in a Distribution Learning Objectives After this section, you should be able to: FIND and INTERPRET the percentile of an individual value within a distribution of data. ESTIMATE percentiles and individual values using a cumulative relative frequency graph. FIND and INTERPRET the standardized score (z-score) of an individual value within a distribution of data. DESCRIBE the effect of adding, subtracting, multiplying by, or dividing by a constant on the shape, center, and spread of a distribution of data. The Practice of Statistics, 5 th Edition 2 Measuring Position: Percentiles One way to describe the location of a value in a distribution is to tell what percent of observations are less than it. The p th percentile of a distribution is the value with p percent of the observations less than it. Cumulative Relative Frequency Graphs A cumulative relative frequency graph displays the cumulative relative frequency of each class of a frequency distribution. (refer to hand out for graph)

Measuring Position: z-scores A z-score tells us how many standard deviations from the mean an observation falls, and in what direction. If x is an observation from a distribution that has known mean and standard deviation, the standardized score of x is: z = x - mean standard deviation A standardized score is often called a z-score. Transforming Data Transforming converts the original observations from the original units of measurements to another scale. Transformations can affect the shape, center, and spread of a distribution. Effect of Adding (or Subtracting) a Constant Adding the same number a to (subtracting a from) each observation: adds a to (subtracts a from) measures of center and location (mean, median, quartiles, percentiles), but Does not change the shape of the distribution or measures of spread (range, IQR, standard deviation). Effect of Multiplying (or Dividing) by a Constant Multiplying (or dividing) each observation by the same number b: multiplies (divides) measures of center and location (mean, median, quartiles, percentiles) by b multiplies (divides) measures of spread (range, IQR, standard deviation) by b, but does not change the shape of the distribution

Section 2.2 Density Curves and Normal Distributions Density Curves and Normal Distributions Learning Objectives After this section, you should be able to: ESTIMATE the relative locations of the median and mean on a density curve. ESTIMATE areas (proportions of values) in a Normal distribution. FIND the proportion of z-values in a specified interval, or a z-score from a percentile in the standard Normal distribution. FIND the proportion of values in a specified interval, or the value that corresponds to a given percentile in any Normal distribution. DETERMINE whether a distribution of data is approximately Normal from graphical and numerical evidence. The Practice of Statistics, 5 th Edition 2 Exploring Quantitative Data In Chapter 1, we developed a kit of graphical and numerical tools for describing distributions. Now, we ll add one more step to the strategy. Exploring Quantitative Data 1. Always plot your data: make a graph, usually a dotplot, stemplot, or histogram. 2. Look for the overall pattern (shape, center, and spread) and for striking departures such as outliers. 3. Calculate a numerical summary to briefly describe center and spread. 4. Sometimes the overall pattern of a large number of observations is so regular that we can describe it by a smooth curve.

Density Curves A density curve is a curve that is always on or above the horizontal axis, and has area exactly 1 underneath it. A density curve describes the overall pattern of a distribution. The area under the curve and above any interval of values on the horizontal axis is the proportion of all observations that fall in that interval. Example Batting averages The histogram below shows the distribution of batting average (proportion of hits) for the 432 Major League Baseball players with at least 100 plate appearances in a recent season. The smooth curve shows the overall shape of the distribution. Describing Density Curves Our measures of center and spread apply to density curves as well as to actual sets of observations. Distinguishing the Median and Mean of a Density Curve The median of a density curve is the equal-areas point, the point that divides the area under the curve in half. The mean of a density curve is the balance point, at which the curve would balance if made of solid material. The median and the mean are the same for a symmetric density curve. They both lie at the center of the curve. The mean of a skewed curve is pulled away from the median in the direction of the long tail.

Describing Density Curves A density curve is an idealized description of a distribution of data. We distinguish between the mean and standard deviation of the density curve and the mean and standard deviation computed from the actual observations. The usual notation for the mean of a density curve is µ (the Greek letter mu). We write the standard deviation of a density curve as σ (the Greek letter sigma). Normal Distributions One particularly important class of density curves are the Normal curves, which describe Normal distributions. All Normal curves have the same shape: symmetric, single-peaked, and bell-shaped Any specific Normal curve is completely described by giving its mean µ and its standard deviation σ.

Normal Distributions A Normal distribution is described by a Normal density curve. Any particular Normal distribution is completely specified by two numbers: its mean µ and standard deviation σ. The mean of a Normal distribution is the center of the symmetric Normal curve. The standard deviation is the distance from the center to the change-of-curvature points on either side. We abbreviate the Normal distribution with mean µ and standard deviation σ as N(µ,σ). Why are the Normal distributions important in statistics? Normal distributions are good descriptions for some distributions of real data. Normal distributions are good approximations of the results of many kinds of chance outcomes. Many statistical inference procedures are based on Normal distributions. The 68-95-99.7 Rule (Emperical Rule) Although there are many Normal curves, they all have properties in common. The 68-95-99.7 Rule In the Normal distribution with mean µ and standard deviation σ: Approximately 68% of the observations fall within σ of µ. Approximately 95% of the observations fall within 2σ of µ. Approximately 99.7% of the observations fall within 3σ of µ.

The Standard Normal Distribution All Normal distributions are the same if we measure in units of size σ from the mean µ as center. The standard Normal distribution is the Normal distribution with mean 0 and standard deviation 1. If a variable x has any Normal distribution N(µ,σ) with mean µ and standard deviation σ, then the standardized variable z = x - m s has the standard Normal distribution, N(0,1). The Standard Normal Table The standard Normal Table (Table A) is a table of areas under the standard Normal curve. The table entry for each value z is the area under the curve to the left of z. Suppose we want to find the proportion of observations from the standard Normal distribution that are less than 0.81. We can use Table A: P(z < 0.81) =.7910 Z.00.01.02 0.7.7580.7611.7642 0.8.7881.7910.7939 0.9.8159.8186.8212

Normal Distribution Calculations We can answer a question about areas in any Normal distribution by standardizing and using Table A or by using technology. How To Find Areas In Any Normal Distribution Step 1: State the distribution and the values of interest. Draw a Normal curve with the area of interest shaded and the mean, standard deviation, and boundary value(s) clearly identified. Step 2: Perform calculations show your work! Do one of the following: (i) Compute a z-score for each boundary value and use Table A or technology to find the desired area under the standard Normal curve; or (ii) use the normalcdf command and label each of the inputs. Step 3: Answer the question. Working Backwards: Normal Distribution Calculations Sometimes, we may want to find the observed value that corresponds to a given percentile. There are again three steps. How To Find Values From Areas In Any Normal Distribution Step 1: State the distribution and the values of interest. Draw a Normal curve with the area of interest shaded and the mean, standard deviation, and unknown boundary value clearly identified. Step 2: Perform calculations show your work! Do one of the following: (i) Use Table A or technology to find the value of z with the indicated area under the standard Normal curve, then unstandardize to transform back to the original distribution; or (ii) Use the invnorm command and label each of the inputs. Step 3: Answer the question.

Assessing Normality The Normal distributions provide good models for some distributions of real data. Many statistical inference procedures are based on the assumption that the population is approximately Normally distributed. A Normal probability plot provides a good assessment of whether a data set follows a Normal distribution. Interpreting Normal Probability Plots If the points on a Normal probability plot lie close to a straight line, the plot indicates that the data are Normal. Systematic deviations from a straight line indicate a non-normal distribution. Outliers appear as points that are far away from the overall pattern of the plot. Other methods to show normality: - Calculate the percent of data that are 1, 2, and 3 standard deviations away from the mean. Does it follow the Emperical Rule? - Graph the data. Does the graph look approximately normal? (Not used very often, because the scale you use may affect the way the graph looks)