Name: Date: Period: Chapter 2. Section 1: Describing Location in a Distribution


Suppose you earned an 86 on a statistics quiz. Should you be satisfied with this score? What if it is the highest score in the class? What if it is below the class average? The teacher might even curve the grades. In this section we focus on describing the location of an individual within a distribution.

Let's consider the class scores below:

79 81 80 77 73 83 74 93 78 80 75 67 73 77 83 86 90 79 85 83 89 84 82 77 72

Here is a stemplot of the data. Notice that the distribution is roughly symmetric with no apparent outliers. Where does your score fall in comparison to everyone else's?

  6 | 7
  7 | 2334
  7 | 5777899
  8 | 00123334
  8 | 569
  9 | 03

Measuring Position: Percentiles

One way to describe your position is to tell what percent of students in the class earned scores below yours. That is, we can calculate your percentile.

Definition: Percentile
The pth percentile of a distribution is the value with p percent of the observations less than it.

Here, our score is the fourth from the top of the class. Since 21 of the 25 observations are below our score, it is at the 84th percentile of the test score distribution (21/25 = 0.84).

Using these scores, let's calculate the percentile of each of the following:
a) The score of 72.
b) The score of 93.
c) The two students who scored 80.

*Note: Some may define the pth percentile as the value with p percent of the observations less than or equal to it.
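The percentile definition above can be checked with a short Python sketch. The function name `percentile_of` is an illustrative choice, not something from the worksheet, and it follows the "strictly less than" definition rather than the "less than or equal to" variant mentioned in the note.

```python
# Class scores from the worksheet.
scores = [79, 81, 80, 77, 73, 83, 74, 93, 78, 80, 75, 67, 73,
          77, 83, 86, 90, 79, 85, 83, 89, 84, 82, 77, 72]


def percentile_of(value, data):
    """Percent of observations strictly less than `value`.

    Hypothetical helper that follows the worksheet's definition of a percentile.
    """
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)


# 21 of the 25 scores are below 86, so this prints 84.0.
print(percentile_of(86, scores))

# The same helper can be used to check parts (a) through (c).
for score in (72, 93, 80):
    print(score, percentile_of(score, scores))
```

Switching the comparison from `<` to `<=` would give the alternative definition, which generally produces a slightly higher percentile for the same score.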

Cumulative Relative Frequency Graphs

There are some interesting graphs that can be made using percentiles. One of these graphs starts with a frequency table for a quantitative variable. Here is a frequency table that summarizes the ages of the first 44 U.S. presidents when they were inaugurated:

Age      Frequency   Relative frequency   Cumulative frequency   Cumulative relative frequency
40-44        2
45-49        7
50-54       13
55-59       12
60-64        7
65-69        3

The extra columns will be used to record the relative frequency, cumulative frequency, and cumulative relative frequency.

To determine the relative frequency, divide the count in each class by the total (44) and multiply by 100 to get a percentage.

To determine the cumulative frequency, add the counts in the frequency column for the current class and all classes with smaller values of the variable.

To determine the cumulative relative frequency, divide each entry in the cumulative frequency column by the total and multiply by 100 to get a percentage.

We can make a cumulative relative frequency graph of the data using the table.
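The three column rules above can be sketched in Python as a way to check a hand-completed table. The variable names are illustrative, and only the class labels and frequencies come from the worksheet.

```python
from itertools import accumulate

# Age classes and frequencies from the worksheet's table.
classes = ["40-44", "45-49", "50-54", "55-59", "60-64", "65-69"]
freqs = [2, 7, 13, 12, 7, 3]

total = sum(freqs)                         # total number of presidents in the table
rel = [100 * f / total for f in freqs]     # relative frequency, in percent
cum = list(accumulate(freqs))              # running total of the counts
cum_rel = [100 * c / total for c in cum]   # cumulative relative frequency, in percent

print(f"{'Age':<7}{'Freq':>6}{'Rel %':>8}{'Cum':>6}{'Cum rel %':>11}")
for age, f, r, c, cr in zip(classes, freqs, rel, cum, cum_rel):
    print(f"{age:<7}{f:>6}{r:>8.1f}{c:>6}{cr:>11.1f}")
```

The last cumulative frequency should equal the total count and the last cumulative relative frequency should be 100%, which is a quick check that the table was filled in correctly.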

What can we learn from this graph?

Barack Obama was inaugurated at the age of 47. Is this unusually young?

Estimate and interpret the 65th percentile of the distribution.

Measuring Position: z-scores

Looking back at your test score, we could see that it is above what seems to be the average. Let's use the test score data to determine the 1-variable statistics.

Mean:
Median:
Standard deviation:

We can describe the location of your score by telling how many standard deviations above or below the mean it is. Since the mean is 80 and the standard deviation is about 6, the score of 86 is about one standard deviation above the mean. Converting observations in this manner is called standardizing.

Definition: Standardized value (z-score)
If x is an observation from a distribution that has known mean and standard deviation, the standardized value of x is

$$ z = \frac{x - \text{mean}}{\text{standard deviation}} $$

A standardized value is often called a z-score.

Let's revisit the scores we calculated percentiles for and determine their z-scores.
a) The grade of 93.
b) The grade of 72.

We can also use z-scores for comparisons. Suppose you took a chemistry test and got an 82. At first you might be disappointed, but your teacher described the scores as fairly symmetric with a mean of 76 and a standard deviation of 4. How does your chemistry score compare to your statistics score?
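As a rough check on the standardizing steps above, here is a minimal Python sketch. It assumes the sample standard deviation (dividing by n - 1), which is what a calculator's 1-variable statistics typically report as Sx. The helper name `z_score` is illustrative.

```python
from statistics import mean, median, stdev

# Class scores from the worksheet.
scores = [79, 81, 80, 77, 73, 83, 74, 93, 78, 80, 75, 67, 73,
          77, 83, 86, 90, 79, 85, 83, 89, 84, 82, 77, 72]


def z_score(x, center, spread):
    """Standardized value: how many standard deviations x lies from the mean."""
    return (x - center) / spread


m = mean(scores)
s = stdev(scores)  # sample standard deviation (assumption: n - 1 divisor)
print("mean:", m, "median:", median(scores), "std dev:", round(s, 2))

# Statistics quiz: 86 against the class mean and standard deviation.
z_stats = z_score(86, m, s)

# Chemistry test: 82 against the stated mean of 76 and standard deviation of 4.
z_chem = z_score(82, 76, 4)

print("statistics z:", round(z_stats, 2))
print("chemistry z:", z_chem)
```

Because both z-scores are on the same standardized scale, the larger one marks the score that stands further above its own class average.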

Transforming Data

To find the standardized score (z-score) for an individual observation, we transformed the data value by subtracting the mean and dividing by the standard deviation. Transforming converts the observations from the original units of measurement to a standardized scale. What effect does transforming (adding or subtracting; multiplying or dividing) have on the shape, center, and spread of the entire distribution? Let's investigate.

Soon after the metric system was introduced in Australia, a group of students was asked to guess the width of their classroom to the nearest meter. Here are the guesses in order from lowest to highest:

8 9 10 10 10 10 10 10 11 11 11 11 12 12 13 13 13 14 14 14 15 15 15 15 15 15 15 15 16 16 16 17 17 17 17 18 18 20 22 25 27 35 38 40

Let's create a dotplot and examine the 1-variable statistics to describe the SOCS.

Shape:
Center:
Spread:
Outliers:

Effect of adding or subtracting a constant

The actual width of the room was 13 meters. How close were the students' guesses? We can examine the distribution of the students' guessing errors by defining a new variable:

error = guess - 13

That is, we will subtract 13 from each observation. What do you think will happen to our distribution? How will it affect the SOCS? Let's use the calculator to display the effect.
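For readers without a graphing calculator at hand, the subtraction step above can be sketched in Python. This is only an illustration: the `summary` helper and the choice of statistics are assumptions, not part of the worksheet.

```python
from statistics import mean, median, stdev

# Classroom-width guesses (meters) from the worksheet.
guesses = [8, 9, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 12, 12, 13, 13, 13,
           14, 14, 14, 15, 15, 15, 15, 15, 15, 15, 15, 16, 16, 16, 17, 17, 17,
           17, 18, 18, 20, 22, 25, 27, 35, 38, 40]

# error = guess - 13, applied to every observation.
errors = [g - 13 for g in guesses]


def summary(data):
    """Hypothetical helper: a few measures of center and spread."""
    return {
        "mean": round(mean(data), 2),
        "median": median(data),
        "std dev": round(stdev(data), 2),
        "range": max(data) - min(data),
    }


print("guesses:", summary(guesses))
print("errors: ", summary(errors))
```

Comparing the two printed summaries side by side shows which statistics moved by 13 and which did not move at all.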

Effect of Adding (or Subtracting) a Constant
Adding the same number a (positive, zero, or negative) to each observation
- adds a to measures of center and location (mean, median, quartiles, percentiles), but
- does not change the shape of the distribution or measures of spread (range, IQR, standard deviation).

Effect of multiplying or dividing by a constant

Since the metric system had only just been introduced, it may not be useful to tell the students they were wrong by a few meters. To put the errors in terms they may understand, we can convert the data into feet. There are roughly 3.28 feet in a meter, so for the student who had an error of -5 meters, this translates to

$$ -5 \text{ meters} \times \frac{3.28 \text{ feet}}{1 \text{ meter}} = -16.4 \text{ feet} $$

So let's change the units of measurement from meters to feet. We need to multiply the error values by 3.28. What effect do you think this will have on the graph?

Effect of Multiplying (or Dividing) by a Constant
Multiplying (or dividing) each observation by the same positive number b
- multiplies (divides) measures of center and location (mean, median, quartiles, percentiles) by b,
- multiplies (divides) measures of spread (range, IQR, standard deviation) by b, but
- does not change the shape of the distribution.

Connecting transformations and z-scores

How does transforming relate to z-scores? Finding z-scores combines the two transformations: we subtract the mean from every observation and then divide by the standard deviation. Let's use the calculator to plot the z-scores. How do you think the distribution will change?

Density Curves

Since the very beginning, we have followed a few steps when approaching our data:
1) Plot your data: make a graph, usually a dotplot, stemplot, or histogram.
2) Look for the overall pattern (SOCS).
3) Calculate a numerical summary to describe the center and spread (mean/standard deviation or median/IQR).

We will add the following:
4) Sometimes the overall pattern of a large number of observations is so regular that we can describe it by a smooth curve.
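Returning to the unit conversion and the z-score connection described earlier in this section, the following Python sketch checks both numerically on the classroom guesses. The variable names are illustrative, and the sample standard deviation (n - 1 divisor) is an assumption.

```python
from statistics import mean, stdev

# Classroom-width guesses (meters) from the worksheet.
guesses = [8, 9, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 12, 12, 13, 13, 13,
           14, 14, 14, 15, 15, 15, 15, 15, 15, 15, 15, 16, 16, 16, 17, 17, 17,
           17, 18, 18, 20, 22, 25, 27, 35, 38, 40]

FEET_PER_METER = 3.28  # rough conversion factor used in the worksheet

errors_m = [g - 13 for g in guesses]               # errors in meters
errors_ft = [FEET_PER_METER * e for e in errors_m]  # errors in feet

# Multiplying by 3.28 scales both the center and the spread by 3.28.
print("mean (m, ft):   ", round(mean(errors_m), 2), round(mean(errors_ft), 2))
print("std dev (m, ft):", round(stdev(errors_m), 2), round(stdev(errors_ft), 2))

# Standardizing: subtract the mean, then divide by the standard deviation.
m, s = mean(guesses), stdev(guesses)
z_scores = [(g - m) / s for g in guesses]

# The z-scores have mean 0 and standard deviation 1 (up to rounding error).
print("z mean:   ", round(mean(z_scores), 6))
print("z std dev:", round(stdev(z_scores), 6))
```

The shape of the dotplot is the same in all three versions (meters, feet, z-scores); only the numbers on the axis change.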

The following is a histogram of the scores of all 947 seventh-grade students in Gary, Indiana, on the vocabulary part of the Iowa Test of Basic Skills (ITBS). A smooth curve is drawn on top as a good description of the overall pattern of the data. The region of scores of 6.0 or less is shaded so that it can be compared with the area shown in the graph on the right.

The total area of the histogram bars is 100% (a proportion of 1), since all the observations are represented. In moving from histogram bars to a smooth curve, we make a specific choice: adjust the scale of the graph so that the total area under the curve is exactly 1. Now the total area represents all the observations, just like the histogram. We can interpret areas under the curve as proportions of the observations.

Definition: Density Curve
A density curve is a curve that
- is always on or above the horizontal axis, and
- has area exactly 1 underneath it.
A density curve describes the overall pattern of a distribution. The area under the curve and above any interval of values on the horizontal axis is the proportion of all observations that fall in that interval.

Density curves come in many shapes. A density curve can give a good approximation of the overall pattern. Outliers, which are departures from the pattern, are not described by the curve.

*Note: No set of data is exactly described by a density curve. The curve is an approximation that is easy to use and accurate enough for practical use.

Describing Density Curves

Our measures of center and spread apply to density curves as well as to actual sets of observations. Areas under a density curve represent proportions of the total number of observations. The median of a data set is the point with half the observations on either side. So the median of a density curve is the equal-areas point: the point with half the area under the curve to its left and the remaining half to its right.
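To make the area ideas above concrete, here is a small Python sketch using a made-up density curve (a straight line falling from height 2 at x = 0 to height 0 at x = 1). The curve, the function names, and the midpoint-rule approximation are all assumptions for illustration; they are not the ITBS curve from the worksheet.

```python
def density(x):
    """Hypothetical density curve: height 2(1 - x) on [0, 1], zero elsewhere."""
    return 2 * (1 - x) if 0 <= x <= 1 else 0.0


def area(f, a, b, n=10_000):
    """Approximate the area under f between a and b with the midpoint rule."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


def equal_areas_point(f, lo, hi, tol=1e-6):
    """Bisection search for the median: the point with half the area to its left."""
    left_end = lo
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if area(f, left_end, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


total_area = area(density, 0, 1)        # should be 1 for a density curve
proportion = area(density, 0.2, 0.5)    # proportion of observations in [0.2, 0.5]
median_point = equal_areas_point(density, 0, 1)

print("total area:", round(total_area, 4))
print("proportion between 0.2 and 0.5:", round(proportion, 4))
print("median (equal-areas point):", round(median_point, 4))
```

The same `area` helper works for any curve that stays on or above the axis, so it can also be used to confirm that a proposed curve really has total area 1.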

Because density curves are idealized patterns, a symmetric density curve is exactly symmetric. The median and mean of a symmetric curve are exactly the same. We can see below how a skewed distribution affects the location of the mean.

The mean of a set of observations is their arithmetic average. The mean of a density curve is the point at which the curve would balance if it were made of solid material.

In the previous section we described the mean and standard deviation of a set of data with the symbols $\bar{x}$ and $s_x$, respectively. With a density curve we will denote the mean with the Greek letter mu (µ) and the standard deviation with the Greek letter sigma (σ).
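The balance-point idea can be explored numerically. The sketch below compares a made-up symmetric curve with a made-up right-skewed curve; both curves, the helper names, and the midpoint-rule approximation are assumptions for illustration only.

```python
def integrate(f, a, b, n=20_000):
    """Approximate the integral of f from a to b with the midpoint rule."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


def curve_mean(f, a, b):
    """Balance point mu: the integral of x * f(x) over the curve's range."""
    return integrate(lambda x: x * f(x), a, b)


def curve_sd(f, a, b):
    """Standard deviation sigma: square root of the integral of (x - mu)^2 * f(x)."""
    mu = curve_mean(f, a, b)
    return integrate(lambda x: (x - mu) ** 2 * f(x), a, b) ** 0.5


def curve_median(f, a, b, tol=1e-6):
    """Equal-areas point, found by bisection."""
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if integrate(f, a, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def symmetric(x):
    """Hypothetical symmetric triangle on [0, 2], peaking at x = 1."""
    if 0 <= x <= 1:
        return x
    return 2 - x if x <= 2 else 0.0


def right_skewed(x):
    """Hypothetical right-skewed curve: height 2(1 - x) on [0, 1]."""
    return 2 * (1 - x) if 0 <= x <= 1 else 0.0


for name, f, a, b in [("symmetric", symmetric, 0, 2),
                      ("right-skewed", right_skewed, 0, 1)]:
    print(name,
          "mean:", round(curve_mean(f, a, b), 3),
          "median:", round(curve_median(f, a, b), 3),
          "sigma:", round(curve_sd(f, a, b), 3))
```

For the symmetric curve the mean and median agree, while for the right-skewed curve the mean is pulled away from the median toward the long tail.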