Section 6.3: Measures of Position

Similar documents
Measures of Position

Prepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order.

DAY 52 BOX-AND-WHISKER

Averages and Variation

STA Rev. F Learning Objectives. Learning Objectives (Cont.) Module 3 Descriptive Measures

STP 226 ELEMENTARY STATISTICS NOTES PART 2 - DESCRIPTIVE STATISTICS CHAPTER 3 DESCRIPTIVE MEASURES

STA Module 2B Organizing Data and Comparing Distributions (Part II)

STA Learning Objectives. Learning Objectives (cont.) Module 2B Organizing Data and Comparing Distributions (Part II)

Chapter 3. Descriptive Measures. Slide 3-2. Copyright 2012, 2008, 2005 Pearson Education, Inc.

Measures of Dispersion

appstats6.notebook September 27, 2016

2.1 Objectives. Math Chapter 2. Chapter 2. Variable. Categorical Variable EXPLORING DATA WITH GRAPHS AND NUMERICAL SUMMARIES

MATH& 146 Lesson 10. Section 1.6 Graphing Numerical Data

AND NUMERICAL SUMMARIES. Chapter 2

Measures of Central Tendency. A measure of central tendency is a value used to represent the typical or average value in a data set.

Descriptive Statistics: Box Plot

No. of blue jelly beans No. of bags

Density Curve (p52) Density curve is a curve that - is always on or above the horizontal axis.

+ Statistical Methods in

Chapter 3: Data Description - Part 3. Homework: Exercises 1-21 odd, odd, odd, 107, 109, 118, 119, 120, odd

MATH& 146 Lesson 8. Section 1.6 Averages and Variation

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency

How individual data points are positioned within a data set.

Box Plots. OpenStax College

Learning Log Title: CHAPTER 7: PROPORTIONS AND PERCENTS. Date: Lesson: Chapter 7: Proportions and Percents

Measures of Position. 1. Determine which student did better

Measures of Central Tendency

Univariate Statistics Summary

Numerical Summaries of Data Section 14.3

Day 4 Percentiles and Box and Whisker.notebook. April 20, 2018

CHAPTER 2 DESCRIPTIVE STATISTICS

MATH NATION SECTION 9 H.M.H. RESOURCES

CHAPTER 2: DESCRIPTIVE STATISTICS Lecture Notes for Introductory Statistics 1. Daphne Skipper, Augusta University (2016)

UNIT 1A EXPLORING UNIVARIATE DATA

Chapter 2. Descriptive Statistics: Organizing, Displaying and Summarizing Data

Understanding and Comparing Distributions. Chapter 4

1.3 Graphical Summaries of Data

CHAPTER 3: Data Description

Exploratory Data Analysis

STA Module 4 The Normal Distribution

STA /25/12. Module 4 The Normal Distribution. Learning Objectives. Let s Look at Some Examples of Normal Curves

Lecture Notes 3: Data summarization

Chapter 3 - Displaying and Summarizing Quantitative Data

6th Grade Vocabulary Mathematics Unit 2

AP Statistics Prerequisite Packet

10.4 Measures of Central Tendency and Variation

10.4 Measures of Central Tendency and Variation

The main issue is that the mean and standard deviations are not accurate and should not be used in the analysis. Then what statistics should we use?

MAT 155. Z score. August 31, S3.4o3 Measures of Relative Standing and Boxplots

STANDARDS OF LEARNING CONTENT REVIEW NOTES ALGEBRA I. 4 th Nine Weeks,

Learning Log Title: CHAPTER 8: STATISTICS AND MULTIPLICATION EQUATIONS. Date: Lesson: Chapter 8: Statistics and Multiplication Equations

CHAPTER 1. Introduction. Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data.

MAT 142 College Mathematics. Module ST. Statistics. Terri Miller revised July 14, 2015

23.2 Normal Distributions

Lecture 3 Questions that we should be able to answer by the end of this lecture:

Processing, representing and interpreting data

Section 1.2. Displaying Quantitative Data with Graphs. Mrs. Daniel AP Stats 8/22/2013. Dotplots. How to Make a Dotplot. Mrs. Daniel AP Statistics

Lecture 3 Questions that we should be able to answer by the end of this lecture:

Homework Packet Week #3

WHOLE NUMBER AND DECIMAL OPERATIONS

M7D1.a: Formulate questions and collect data from a census of at least 30 objects and from samples of varying sizes.

2.1: Frequency Distributions and Their Graphs

Section 9: One Variable Statistics

Math 167 Pre-Statistics. Chapter 4 Summarizing Data Numerically Section 3 Boxplots

IT 403 Practice Problems (1-2) Answers

15 Wyner Statistics Fall 2013

Ex.1 constructing tables. a) find the joint relative frequency of males who have a bachelors degree.

Chapter 3 Analyzing Normal Quantitative Data

Descriptive Statistics

Data can be in the form of numbers, words, measurements, observations or even just descriptions of things.

Unit I Supplement OpenIntro Statistics 3rd ed., Ch. 1

STANDARDS OF LEARNING CONTENT REVIEW NOTES. ALGEBRA I Part II. 3 rd Nine Weeks,

Using a percent or a letter grade allows us a very easy way to analyze our performance. Not a big deal, just something we do regularly.

Let s take a closer look at the standard deviation.

Understanding Statistical Questions

Mean,Median, Mode Teacher Twins 2015

Boxplots. Lecture 17 Section Robb T. Koether. Hampden-Sydney College. Wed, Feb 10, 2010

Name: Date: Period: Chapter 2. Section 1: Describing Location in a Distribution

Chapter 2 Describing, Exploring, and Comparing Data

CHAPTER 2: SAMPLING AND DATA

Chapter 6: DESCRIPTIVE STATISTICS

MATH 112 Section 7.2: Measuring Distribution, Center, and Spread

Bar Graphs and Dot Plots

Distributions of Continuous Data

Maths Revision Worksheet: Algebra I Week 1 Revision 5 Problems per night

To calculate the arithmetic mean, sum all the values and divide by n (equivalently, multiple 1/n): 1 n. = 29 years.

TMTH 3360 NOTES ON COMMON GRAPHS AND CHARTS

Chapter 5: The standard deviation as a ruler and the normal model p131

National 5 Mathematics Applications Notes with Examples

1.2. Pictorial and Tabular Methods in Descriptive Statistics

Further Maths Notes. Common Mistakes. Read the bold words in the exam! Always check data entry. Write equations in terms of variables

Things you ll know (or know better to watch out for!) when you leave in December: 1. What you can and cannot infer from graphs.

The first few questions on this worksheet will deal with measures of central tendency. These data types tell us where the center of the data set lies.

IQR = number. summary: largest. = 2. Upper half: Q3 =

Chapter 5snow year.notebook March 15, 2018

Chapter2 Description of samples and populations. 2.1 Introduction.

Introduction to the Practice of Statistics Fifth Edition Moore, McCabe

Measures of Dispersion

Statistical Methods. Instructor: Lingsong Zhang. Any questions, ask me during the office hour, or me, I will answer promptly.

Chapter 5. Understanding and Comparing Distributions. Copyright 2012, 2008, 2005 Pearson Education, Inc.

Transcription:

Section 6.3: Measures of Position Measures of position are numbers showing the location of data values relative to the other values within a data set. They can be used to compare values from different data sets or to compare values within the same data set. Why do we need them? Consider the following example. Example 1. Graduate Record Exam My friend Suzanne was applying to graduate schools and was required to take the Graduate Record Exam (GRE). The GRE is comprised of two sections: verbal and analytical. Suzanne knew that her verbal abilities were strong, but was worried about the analytical (math) section. Consequently, I helped her prepare for several weeks on the analytical section. On the test day, Suzanne received her raw scores immediately after completing the exam. Confused and worried, she called to tell me her scores: Verbal section Analytical section 380 420 To be accepted in the programs she was interested in, Suzanne needed a high score on verbal section. I assured her that she shouldn't panic because these were only her raw scores. She needs to wait until the end of the testing period to receive the summary of students scores in order to have a better idea of how she really did on the exam. After waiting several long weeks a letter from the testing company came with the following summary: Section Mean Standard Deviation Verbal 356 12 Analytical 400 20 Based on the mean, it appears as though the verbal section (mean 356) was a little more dicult than the analytical section (mean 400). Perhaps this explains her reverse performance. Also notice Suzanne's two scores are measured on diferent scales (standard deviations). Objective: Determine and interpret z-scores. The best way to evaluate how my friend did on these sections is to compare her scores to all of her peers by nding out how many standard deviations her scores are from the mean scores. In other words, we need to standardize the values by converting them to z-scores. 198

Denition 2. The z-score (or standardized value) indicates the number of standard deviations a given value x is above or below the mean. z = x x s A positive z-score means the value is situated above the mean. A zero z-score means the data value is situated exactly at the mean. A negative z-score means the data value is situated below the mean. Example 3. Find Suzanne's z-score on the verbal section. Recall, her raw score was 380, the mean was 356, and the standard deviation was 12. z verbal = x x s z verbal = z verbal = 24 12 z verbal = 2 380 356 12 z-score formula Insert the score (x) and the mean (x) for the verbal section On the denominator insert the standard deviation (s) Perform the subtraction on the numerator Divide 24 by 12 This is her z-score for the verbal section. Interpretation: Her verbal score was 2 standard deviations above the mean. Example 4. Find Suzanne's z-score on the analytical section. Recall, her raw score was 420, the mean was 400, and the standard deviation was 20. z analytical = x x s z analytical = z analytical = 20 20 z analytical = 1 420 400 20 z-score formula Insert the score (x), the mean (x) for the verbal section On the denominator insert the standard deviation (s) Perform the subtraction on the numerator Divide 20 by 20 This is her z-score for the analytical section. Interpretation: Her analytical score was 1 standard deviations above the mean. Conclusion: Suzanne is certainly above average on both sections. We can quickly see that with the positive z-scores. 199

Example 5. Which section do you think she performed better? The z-scores show that Suzanne's score on the verbal section is 2 standard deviations above the mean while her score on the analytical section is 1 standard deviation above the mean. Therefore, Suzanne performed better on the verbal section because it has a larger z-score. Objective: Determine and interpret quartiles. Quartiles are also measures of position. For example, if your child's pediatrician tells you that your child's height is at the third quartile (Q 3 ) for his/her age this means 75 percent of children of the same age as your child are of the same height or shorter than your child. Denition 6. As the name implies, quartiles of a data set divide the data set into four equal parts, each containing 25% of the data. The rst quartile (Q 1 ) separates the bottom 25% of sorted values from the top 75%. The second quartile (Q 2 ) which is also the median separates the bottom 50% of sorted values from the top 50%. The third quartile (Q 3 ) separates the bottom 75% of sorted values from the top 25%. Steps to nd quartiles: 1. Arrange the data in order of lowest to highest 2. Find the median which is also the second quartile (Q 2 ) 3. Find the middle of the rst half of the data which is the rst quartile (Q 1 ) 4: Find the middle of the second half of the data which is the third quartile (Q 3 ) Example 7. Commute distance to school The following data is the commute distance (in miles) of a group of students. Find the quartiles for this data. 200

First, we order the values: 1 2 4 5 6 4 1 12 10 30 1 6 10 6 5 9 7 5 8 17 25 35 12 10 1 1 1 2 4 4 5 5 5 6 6 6 7 8 9 10 10 10 12 12 17 25 30 35 Second, we nd the median (Q 2 ): Position of the median = 24 + 1 2 = 12.5 We will need to determine what number is exactly between the 12 th and 13 th data value. The midpoint formula will help up determine this. Q 2 = Median = 6 + 7 2 = 6.5 Next, we nd the middle of the rst half of the data which is the rst quartile (Q 1 ) and the middle of the second half of the data which is Q 3 (see picture below). Thus, the quartiles for the data in this example are 4, 6.5, and 12, respectively. Objective: Examine a data set for outliers using quartiles and interquartile range First, we need to calculate a few statistics: the interquartile range, the lower fence, and the upper fence. Let us introduce each of these statistics and how we can calculate them. Denition 8. The interquartile range (denoted IQR) is a measure of spread, based on dividing a data set into quartiles. IQR = Q 3 Q 1 The interquartile range is interpreted to be the spread of the middle 50% of the data. Note that IQR is not sensitive to outliers since it represents the dierence between Q 3 and Q 1. Denition 9. Fences are used to help us determine what values are extreme in our data. Lower fence = Q 1 (1.5)(IQR) 201

Upper fence = Q + 3 (1.5)(IQR) Any data value outside of the fences is considered an outlier (symbolized on a plot with a dot or asterisk *). Example 10. Find the outliers, if any, for the commute distance to school example. First, we compute the IQR: IQR = Q 3 Q 1 Interquartile range formula Insert the values for third and rst quartile IQR = 12-4 Recall that for the commute distance data Q 1 = 4 and Q 3 = 12. Subtract 4 from 12 IQR = 8 The interquartile range is 8. It shows that the middle 50% of the students' commute distances do not dier by more than 8. Next, we calculate the fences: Lower fence = Q 1 (1.5)(IQR) Lower fence formula Insert the values for the rst quartile and IQR Lower fence = 4 (1.5)(8) Q 1 = 4 and IQR=8 Multiply 1.5 by 8 Lower fence =4 12 Perform the subtraction Lower fence = 8 Upper fence = Q 3 + (1.5)(IQR) Upper fence formula Insert the values for the third quartile and IQR Upper fence = 12 + (1.5)(8) Q 3 = 12 and IQR=8 Multiply 1.5 by 8 Upper fence =12 + 12 Perform the addition Upper fence =24 Examine if there are any data value outside of these fences. 1 1 1 2 4 4 5 5 5 6 6 6 7 8 9 10 10 10 12 12 17 25 30 35 There are no values in the set smaller than the lower fence value of - 8. There are three values in the set larger than the upper fence value of 24. These values are 25, 30, and 35 and we consider them outliers. 202

Objective: Compute the ve-number summary Denition 11. A ve number summary consists of: 1) The minimum (smallest observation) 2) The rst quartile (Q1) 3) The median (Q2) 4) The third quartile(q3) 5) The maximum (largest observation) Example 12. Write the 5 - number summary for the commute distance to school example. Minimum = 1 Q 1 = 4 Q 2 = 6.5 Q 3 = 12 Maximum = 35 Objective: Construct and interpret box-plots Denition 13. The box-plot is a convenient graphical display of the ve-number summary of a data set. Boxplots can reveal the spread of the data, the distribution of the data, and the presence of outliers. They are excellent for comparing data sets. Steps to draw a box-plot 1. Compute the ve-number summary and the lower and upper fences. 2. Draw a horizontal line and label it with an appropriate scale. 3. Draw vertical lines at Q 1, Median, and Q 3. Enclose these vertical lines in a box. 4. Draw a line from Q 1 to the smallest data value that is within the lower fence. Draw a line from Q 3 to the largest value that is within the upper fence. 5. Any values outside the fences are outliers and are marked with a dot. Below is the box-plot for the commute distance to school example. 203

The dots on the plot symbolize the outliers of this data set. Box plots are useful for comparing data from two distributions. Example 14. The box plot of service times for two fast-food restaurants is shown below. Compare the spread of the two data sets, their distribution shapes, and identify outliers. Notice that store 2 has a shorter median service time of 2.1 minutes compared to the median service time at store 1 of 2.3 minutes. However, store 2 is less consistent since it has a wider spread of data. Also at store 2, 75% of customers were served within 5.7 minutes, while at store 1, the same percent of customers were served within 2.9 minutes. Both stores have a right skewed distribution of service times and there are no outliers present in any of these two data sets. 204

6.3 Practice 1. Supose the average score on the commercial drivers license test in Maryland is 79 with a standard deviation of 5. Find the corresponding z - scores for each raw score. a. 89 b. 79 c. 77 d. 92 e. 81 2. The average cost in dollars of a wedding in United States (not including the cost for a honeymoon) is 26,650 with a standard deviation of 2,650. Find the corresponding z - scores for each raw score. a. 29,300 b. 10,750 c. 32,745 3. After two years of school, a student who attends a university has a loan of 19,000 where the average debt is 16,000 and the standard deviation 1,500. After two years of school, another student who attends a college has a loan of 6,250 where the average debt is 6,000 and the standard deviation 500. Which student has a higher debt in relationship to his or her peers? 4. The tallest living man has a height of 243 cm. The tallest living woman is 234 cm tall. Heights of men have a mean of 173 cm and a standard deviation of 7 cm. Heights of women have a mean of 162 cm and a standard deviation of 5 cm. Relative to the population of the same gender, nd who is taller? 5. A group of diners were asked how much they would pay (in dollars) for a meal. Their responses were (in dollars): 5, 7, 6, 8, 10, 9, 25, 12. a. Identify the ve number summery for this data. b. Find the interquartile range. c. Find the fences and examine the data for outliers. d. Draw the boxplot. 6. The sodium content of real cheese and the sodium content of cheese substitute are shown below in a side-by-side boxplot. 205

a. Identify the ve number summary for the substitute cheese distribution. b. Which data has a higher median? c. Which data has a larger variation? d. Approximately what percent of real cheese have a sodium content greater than the 1 st quartile of the cheese substitute? 7. The following side-by-side boxplot describes the commute time in minutes for a sample of 150 college students attending CCBC. Refer to the plot to answer the questions below. a. If any, list the outliers for the male data set and for the female data set. b. The bottom 25% of the commute times in either distribution (male or female) are below what value? c. Which quartile in the distribution of males' commute time has the smallest spread? d. What is the shape of the distribution of the males' commute times? 206

6.3 Answers 1. a. z = 2 b. z =0 c. z = -0.4 d. z=2.6 e. z = 0.4 2. a. z = 1 b. z = -6 c. z = 2.30 3. z = 2 for the student who attends the university z = 0.5 for the student who attends the college The student attending the university has a higher debt in relationship to his or her peers. This is because his or her z-score is larger. 4. z man =10 z woman =14.4 The woman is taller relative to the population of the same gender. This is because her z-score is larger. 5. a. min = 5 dollars, Q 1 =6.5 dollars, Q 2 =8.5 dollars, Q 3 =11 dollars, max = 25 dollars b. IQR = 4.5 dollars c. Upper Fence = 17.75 dollars Lower Fence = - 0.25 dollars There is one value outside the upper fence (25). No values are outside the lower fence. 207

d. 6. a. min = 135, Q 1 =215; Q 2 = 265, Q 3 =300, max =340 b. Substitute Cheese c. Real Cheese d. 50% 7. a. Females: 40, 40, 45, and 50 minutes; Males: 60 minutes b. 10 minutes c. 2 nd quartile d. Skewed right 208