Measures of Central Tendency


A measure of central tendency is a value used to represent the typical or average value in a data set.

The Mean

The mean is the sum of all data values divided by the number of values in the data set. The mean of a sample data set is denoted by $\bar{X}$ and the mean of a population data set by the Greek letter $\mu$.

$$\bar{X} = \frac{\sum x}{n}$$

Example 1: Find the mean of the following data set.

Quiz Scores: 1, 5, 7, 7, 6, 8, 10, 9, 5, 10, 8

$$\bar{X} = \frac{\sum x}{n} = \frac{76}{11} = 6.9$$

When calculating the mean from a frequency distribution, this becomes

$$\bar{X} = \frac{\sum x f}{n} = \frac{\sum x f}{\sum f}$$

Mean for Grouped Data

The mean for grouped data is calculated by multiplying the frequency $f$ of each class by the class midpoint $X_m$, summing the products, and dividing by the total frequency $n$:

$$\bar{X} = \frac{\sum f X_m}{n}$$
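The three forms of the mean above can be checked with a short Python sketch. The quiz scores are taken as 1, 5, 7, 7, 6, 8, 10, 9, 5, 10, 8 (an assumption consistent with the stated sum of 76 over 11 values); the grouped table `demo_classes` is hypothetical and only illustrates the midpoint method.

```python
from collections import Counter


def mean(values):
    """Sum of all values divided by the number of values."""
    return sum(values) / len(values)


def mean_from_frequencies(freq):
    """Mean from a mapping value -> frequency: sum(x * f) / sum(f)."""
    n = sum(freq.values())
    return sum(x * f for x, f in freq.items()) / n


def grouped_mean(classes):
    """Mean from (lower_boundary, upper_boundary, frequency) triples,
    using each class midpoint as the representative value."""
    n = sum(f for _, _, f in classes)
    return sum(f * (lower + upper) / 2 for lower, upper, f in classes) / n


# Assumed reconstruction of the quiz scores (sum 76, n = 11).
quiz_scores = [1, 5, 7, 7, 6, 8, 10, 9, 5, 10, 8]
quiz_mean = mean(quiz_scores)
quiz_mean_freq = mean_from_frequencies(Counter(quiz_scores))

# Hypothetical grouped data, for illustration only.
demo_classes = [(0.5, 10.5, 2), (10.5, 20.5, 5), (20.5, 30.5, 3)]
demo_grouped_mean = grouped_mean(demo_classes)

print(round(quiz_mean, 1), round(quiz_mean_freq, 1), demo_grouped_mean)
```

The raw-data and frequency-table versions agree because a frequency table only regroups the same sum.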

Example 2: Miles Run

Below is a frequency distribution of miles run per week. Find the mean.

Class Boundaries    Frequency
5.5-10.5            1
10.5-15.5           2
15.5-20.5           3
20.5-25.5           5
25.5-30.5           4
30.5-35.5           3
35.5-40.5           2
                    Σf = 20

Solution

Class Boundaries    Frequency f    Midpoint X_m    f·X_m
5.5-10.5            1              8               8
10.5-15.5           2              13              26
15.5-20.5           3              18              54
20.5-25.5           5              23              115
25.5-30.5           4              28              112
30.5-35.5           3              33              99
35.5-40.5           2              38              76
                    Σf = 20                        Σf·X_m = 490

$$\bar{X} = \frac{\sum f X_m}{n} = \frac{490}{20} = 24.5 \text{ miles}$$

Weighted Mean

Sometimes you must find the mean of a data set in which not all values are equally represented. In such cases we compute the weighted mean: we multiply each value by its corresponding weight and divide the sum of the products by the sum of the weights.

$$\bar{x} = \frac{w_1 x_1 + w_2 x_2 + w_3 x_3 + \dots + w_n x_n}{w_1 + w_2 + w_3 + \dots + w_n} = \frac{\sum w x}{\sum w}$$

where $w_1, w_2, \dots, w_n$ are the weights and $x_1, x_2, \dots, x_n$ are the values.

Examples:

Grade point average. We assign the letter grades the number values A=4, B=3, C=2, D=1, F=0, and then each grade value is counted into the GPA according to the number of credits earned with that grade.

Course grade. Suppose the final grade in a course is calculated from homework, 3 exams that count 20% each, and a final exam, with homework and the final exam together making up the remaining 40%. We can weight the score for each component of the final grade with its percentage to calculate the final grade.

Properties of the Mean

1. The algebraic sum of the deviations of a set of numbers from their arithmetic mean is zero.

2. If $\bar{x}$ is the mean of a set $x_1, \dots, x_n$ of $n$ numbers and $\bar{y}$ is the mean of another set $y_1, \dots, y_m$ of $m$ numbers, then $\bar{x}_c$, the mean of the combined set, is given by:

$$\bar{x}_c = \frac{n\bar{x} + m\bar{y}}{n + m}$$

The Median

The median is the value of the middle term when all values are arranged in ascending or descending order. It is the value which separates the largest 50% of data values from the lowest 50%. In a histogram, half of the area is on either side of the median.
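The weighted-mean formula and the two properties of the mean stated above can be sketched in Python. The credit counts and the two small data sets are hypothetical values chosen for illustration; only the A=4, B=3, C=2 grade scale comes from the notes.

```python
def weighted_mean(values, weights):
    """Sum of w * x divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


def combined_mean(mean_x, n, mean_y, m):
    """Mean of two sets merged together: (n*x_bar + m*y_bar) / (n + m)."""
    return (n * mean_x + m * mean_y) / (n + m)


# Hypothetical transcript: an A, a B and a C earned in 3-, 4- and 3-credit courses.
grade_points = [4, 3, 2]
credits = [3, 4, 3]
gpa = weighted_mean(grade_points, credits)

# Property 1: deviations from the mean sum to zero (hypothetical data).
xs = [2, 4, 9]
x_bar = sum(xs) / len(xs)
deviation_sum = sum(x - x_bar for x in xs)

# Property 2: the combined mean agrees with the mean of the pooled data.
ys = [10, 20, 30, 40]
y_bar = sum(ys) / len(ys)
pooled = combined_mean(x_bar, len(xs), y_bar, len(ys))
direct = sum(xs + ys) / len(xs + ys)

print(gpa, deviation_sum, pooled, direct)
```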

If the number of values, $n$, is odd, the middle value is the median. If $n$ is even, the mean of the two middle values is the median.

Example 3: The following data set represents the quiz scores of a group of students.

Quiz Scores: 1, 5, 7, 7, 6, 8, 10, 9, 5, 10, 8

Find the median value for the set of quiz scores. Find the median if the low score of 1 is dropped.

Example 4: Find the median of the following set of data.

Marks:      4 5 6 7 8 9 10
Frequency:  3 0 8 3

Median with Grouped Data

Since the median divides the frequency histogram into two equal areas, this fact gives us a method for determining the median. The median can also be estimated using the following formula:

$$\text{median} = l + \left(\frac{\frac{n}{2} - f_{m-1}}{f_m}\right) c$$

where
$l$ = lower class boundary of the median group
$n$ = total frequency
$f_{m-1}$ = cumulative frequency of the group before the median group
$c$ = class width of the median group
$f_m$ = frequency of the median group

Example 5: The temperature of a component was monitored at regular intervals on 80 occasions. The frequency distribution was as follows:

Temperature x (°C)    Frequency f
30.0-30.2             6
30.3-30.5             12
30.6-30.8             15
30.9-31.1             20
31.2-31.4             13
31.5-31.7             9
31.8-32.0             5

Find the median using:
a) the histogram method
b) the formula

The Mode

The mode of a data set is the value of the variable that occurs most often. A data set can also have more than one mode or no mode at all.

Example 6: The set .3 .4 .8 .3 4.5 3. has .3 as the mode. It is unimodal.

Example 7: The set 3 4 7 8 3 has no mode.

Example 8: The set 3 5 5 5 7 7 8 8 8 has two modes, 5 and 8. It is bimodal.

For grouped data, the mode is computed as follows. First the modal group is identified. Let

$l$ = lower class boundary of the modal group
$c$ = class width
$f_m$ = frequency of the modal group
$f_{m-1}$ = frequency of the group preceding the modal group
$f_{m+1}$ = frequency of the group after the modal group

$$\text{mode} = l + \left(\frac{f_m - f_{m-1}}{(f_m - f_{m-1}) + (f_m - f_{m+1})}\right) c$$

The mode can also be found using a histogram. Once the modal class has been identified, the value of the mode itself lies within that range and can be found by a simple construction.
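The median rules for raw data and the grouped-data median formula given earlier can be sketched in Python. The quiz scores are taken as 1, 5, 7, 7, 6, 8, 10, 9, 5, 10, 8 and the temperature frequencies as 6, 12, 15, 20, 13, 9, 5 (assumptions chosen to be consistent with the stated totals of 11 scores and 80 readings). Class boundaries are taken halfway between adjacent class limits, for example 30.85 and 31.15 for the class 30.9-31.1.

```python
from statistics import median


def grouped_median(classes):
    """Estimate the median from (lower_boundary, upper_boundary, frequency)
    triples using  l + ((n/2 - cumulative_before) / f_m) * c."""
    n = sum(f for _, _, f in classes)
    cumulative = 0
    for lower, upper, f in classes:
        if f > 0 and cumulative + f >= n / 2:
            return lower + (n / 2 - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("empty frequency distribution")


# Assumed reconstruction of the quiz scores.
quiz_scores = [1, 5, 7, 7, 6, 8, 10, 9, 5, 10, 8]
median_all = median(quiz_scores)                       # odd n: middle value
median_without_low = median(sorted(quiz_scores)[1:])   # even n: mean of middle two

# Assumed reconstruction of the temperature table (frequencies sum to 80).
temperature_classes = [
    (29.95, 30.25, 6), (30.25, 30.55, 12), (30.55, 30.85, 15),
    (30.85, 31.15, 20), (31.15, 31.45, 13), (31.45, 31.75, 9),
    (31.75, 32.05, 5),
]
temperature_median = grouped_median(temperature_classes)

print(median_all, median_without_low, round(temperature_median, 3))
```

The grouped estimate assumes the observations are spread evenly within the median class, which is the same assumption the histogram method makes.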

Example 9: The masses of 50 castings gave the following frequency distribution.

Mass x (kg)    Frequency f
10-12          3
13-15          7
16-18          16
19-21          10
22-24          8
25-27          5
28-30          1

If we draw the histogram, using central values as the midpoints of the bases of the rectangles, we obtain

[Figure: histogram of the casting masses with diagonal lines AD and BC]

The modal class is the third class, with boundaries 15.5 and 18.5 kg. The two diagonal lines AD and BC are drawn as shown. The x value of their point of intersection is taken as the mode of the set of observations. For this case the mode = 17.3.

Exercise: Find the mode of the frequency distribution above using the formula.

Note: The mode is the only measure of central tendency that can be used in finding the most typical case when the data is categorical. The mode is not a very good measure of center as it is not based on all observations.
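As a cross-check on the histogram construction, the grouped-mode formula can be sketched in Python. The casting frequencies are taken as 3, 7, 16, 10, 8, 5, 1 (an assumption consistent with the stated total of 50 castings), with class boundaries halfway between adjacent class limits. `statistics.multimode` is used for the raw-data case, with the bimodal set from Example 8.

```python
from statistics import multimode


def grouped_mode(classes):
    """Estimate the mode from (lower_boundary, upper_boundary, frequency)
    triples using  l + (d1 / (d1 + d2)) * c, where d1 and d2 are the
    differences between the modal frequency and its two neighbours."""
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))          # first class with the highest frequency
    lower, upper, f_m = classes[i]
    f_prev = freqs[i - 1] if i > 0 else 0
    f_next = freqs[i + 1] if i < len(freqs) - 1 else 0
    d1, d2 = f_m - f_prev, f_m - f_next
    if d1 + d2 == 0:                     # flat distribution: fall back to midpoint
        return (lower + upper) / 2
    return lower + d1 / (d1 + d2) * (upper - lower)


# Assumed reconstruction of the castings table (frequencies sum to 50).
casting_classes = [
    (9.5, 12.5, 3), (12.5, 15.5, 7), (15.5, 18.5, 16), (18.5, 21.5, 10),
    (21.5, 24.5, 8), (24.5, 27.5, 5), (27.5, 30.5, 1),
]
casting_mode = grouped_mode(casting_classes)

# Raw data can have several modes; multimode returns all of them.
modes_example_8 = multimode([3, 5, 5, 5, 7, 7, 8, 8, 8])

print(round(casting_mode, 1), modes_example_8)
```

Under these assumed frequencies the formula gives the same value as the diagonal construction, since both locate the point that divides the modal class in the ratio of the two frequency differences.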

Properties of Mean, Median, and Mode

The mean is the most commonly used measure of central tendency. One drawback of the mean is that it is heavily influenced by a few very high or very low data values (extremes or outliers). In these cases it is more common to use the median, e.g. for household income in Kenya.

The mode has the advantage that it can be used to measure data sets even if they contain only qualitative data. A disadvantage is that a data set may not have a mode.

Of the three measures of center, only the mean is based on all observations.

Shapes of Data Distributions

1. Symmetric

The data distribution is approximately the same shape on either side of a central dividing line. The mean and median (and mode if unimodal) are equal in a symmetric distribution.

[Figure: histogram of a symmetric distribution]

Examples: Men's heights, SAT Math scores

2. Left-Skewed

A few data values are much lower than the majority of values in the set. (The tail extends to the left.) Generally the mean is less than the median (and mode) in a left-skewed distribution.

[Figure: histogram of a left-skewed distribution]

Example: Exam scores with a few students doing poorly

3. Right-Skewed

A few data values are much higher than the majority of values in the set. (The tail extends to the right.) Generally the mean is greater than the median (and mode) in a right-skewed distribution.

[Figure: histogram of a right-skewed distribution]

Examples: Personal income in Kenya, men's weights

Question: Homes in a certain area have a mean price of Kshs 0 million but a median price of Kshs .5 million. How can you explain this best?
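The pull that a long right tail exerts on the mean, but not on the median, can be seen in a small Python sketch. The income figures are hypothetical and serve only as an illustration.

```python
from statistics import mean, median

# Hypothetical incomes (arbitrary units): seven typical values and one very high one.
incomes = [20, 22, 25, 27, 30, 32, 35, 400]

income_mean = mean(incomes)      # dragged upward by the single extreme value
income_median = median(incomes)  # depends only on the middle of the ordered data

# Removing the extreme value moves the mean a lot and the median very little.
trimmed = incomes[:-1]
trimmed_mean = mean(trimmed)
trimmed_median = median(trimmed)

print(income_mean, income_median, trimmed_mean, trimmed_median)
```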

Measures of Position

Fractiles divide a data set into consecutive intervals so that each interval has (at least approximately) the same number of data values. The most common fractiles are:

Quartiles divide a data set into fourths. For example, the lower quartile, $Q_1$, is found a quarter of the way through the observations when they are arranged in ascending order, while the upper quartile, $Q_3$, is found three quarters of the way through.

Example: The set

6 20 : 30 4 : 48 50 : 56 6

has $\frac{20 + 30}{2} = 25$ as $Q_1$ and $\frac{50 + 56}{2} = 53$ as $Q_3$.

For grouped data, the values of $Q_1$ and $Q_3$ are computed using the following formulae:

$$Q_1 = l_{Q_1} + \left(\frac{\frac{n}{4} - f_{Q_1 - 1}}{f_{Q_1}}\right) c$$

$$Q_3 = l_{Q_3} + \left(\frac{\frac{3n}{4} - f_{Q_3 - 1}}{f_{Q_3}}\right) c$$

The symbols in these two equations have the same meanings as in the median formula, applied to the group containing the quartile instead of the median group.

Example: Find the values of $Q_1$ and $Q_3$ of the following hypothetical data.

Class:      10-20 20-30 30-40 40-50 50-60 60-70 70-80
Frequency:  0 5 36 4 9 5
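The quartile calculations above can be sketched in Python. For the ungrouped case the sketch takes the median of each half of the ordered data; the values 16, 42 and 62 in `example_set` are assumed fill-ins that do not affect the quartiles 25 and 53. For the grouped case, `demo_classes` uses hypothetical frequencies, and the same function with p = 0.5 reproduces the grouped-median formula.

```python
from statistics import median


def quartiles_by_halves(values):
    """Q1 and Q3 as the medians of the lower and upper halves of the
    ordered data (the middle value is left out when n is odd)."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)
    half = len(ordered) // 2
    return median(ordered[:half]), median(ordered[-half:])


def grouped_quantile(classes, p):
    """Estimate the p-quantile from (lower_boundary, upper_boundary, frequency)
    triples using  l + ((p*n - cumulative_before) / f) * c."""
    n = sum(f for _, _, f in classes)
    target = p * n
    cumulative = 0
    for lower, upper, f in classes:
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("empty frequency distribution")


# Assumed fill-ins for the illegible values; 20, 30, 48, 50 and 56 are as printed.
example_set = [16, 20, 30, 42, 48, 50, 56, 62]
q1, q3 = quartiles_by_halves(example_set)

# Hypothetical frequencies for classes 10-20, 20-30, ..., 70-80.
demo_classes = [(10, 20, 2), (20, 30, 5), (30, 40, 8), (40, 50, 10),
                (50, 60, 8), (60, 70, 5), (70, 80, 2)]
demo_q1 = grouped_quantile(demo_classes, 0.25)
demo_q3 = grouped_quantile(demo_classes, 0.75)

print(q1, q3, demo_q1, demo_q3)
```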

Percentiles divide an ordered data set into 100 equal parts. For example, the 36th percentile is the value which separates the lowest 36% of data values from the highest 64% of data values and is denoted by $P_{36}$.

A percentile rank for a datum represents the percentage of data values below the datum.

$$\text{Percentile rank of } X = \frac{(\text{number of values below } X) + 0.5}{\text{total number of values}} \times 100\%$$

Deciles divide a data set into 10 equal parts. For example, the 7th decile is the value which separates the lowest 7/10 of ordered data values from the highest 3/10 of data values and is denoted $D_7$.

Note: There are 99 percentiles $P_1$-$P_{99}$, 3 quartiles $Q_1$-$Q_3$, and 9 deciles $D_1$-$D_9$.

$$P_{50} = Q_2 = D_5 = \text{Median}$$
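The percentile-rank formula above translates directly into Python. The scores are hypothetical and are used only to show the arithmetic.

```python
def percentile_rank(values, x):
    """Percentile rank of x: (number of values below x + 0.5) / n * 100."""
    if not values:
        raise ValueError("need at least one value")
    below = sum(1 for v in values if v < x)
    return (below + 0.5) * 100 / len(values)


# Hypothetical scores for illustration.
scores = [2, 3, 5, 6, 8, 10, 12, 15, 18, 20]
rank_of_12 = percentile_rank(scores, 12)   # six values lie below 12

print(rank_of_12)
```

The added 0.5 counts the datum itself as lying half below and half above its own position, so the lowest value in a set never receives a rank of exactly 0%.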

Measures of Dispersion (Spread)

The mean, median and mode give important information regarding the general mass of the data; however, they do not tell us anything about how spread out the observations are from the central values.

The set 26, 27, 28, 29, 30 has a mean of 28
and 5, 19, 20, 36, 60 also has a mean of 28.

These two sets have the same mean, but clearly the first is more tightly arranged around the mean than the second. We therefore need a measure to indicate the spread of the values about the mean.

Common Measures of Spread

1. Range

The range is the difference between the largest and smallest data values in a data set.

range = (highest value - lowest value)

For a grouped frequency distribution, the range is the difference between the lower limit of the lowest class and the upper limit of the highest class. The range deals only with the extreme values, which may be outliers; it does not take care of the intermediate values and is therefore considered the poorest measure of dispersion.

2. Quartile Deviation

Let $Q_1$ and $Q_3$ be the lower and upper quartiles. The difference $Q_3 - Q_1$ is called the Interquartile Range. Half the interquartile range, denoted by $Q$, is the quartile deviation, i.e.

$$Q = \frac{1}{2}(Q_3 - Q_1)$$

The quartile deviation deals only with the middle 50 percent of the data and ignores the rest. It is therefore not a very good measure of spread, though better than the range.
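The range and quartile deviation defined above can be sketched in Python. The two sets are taken as 26, 27, 28, 29, 30 and 5, 19, 20, 36, 60 (an assumed reading under which both have the same mean), and the quartiles 25 and 53 are the ones quoted in the quartile example under Measures of Position.

```python
def data_range(values):
    """Difference between the largest and smallest values."""
    return max(values) - min(values)


def quartile_deviation(q1, q3):
    """Half the interquartile range."""
    return (q3 - q1) / 2


# Assumed reading of the two sets with equal means.
tight = [26, 27, 28, 29, 30]
spread = [5, 19, 20, 36, 60]

tight_mean = sum(tight) / len(tight)
spread_mean = sum(spread) / len(spread)
tight_range = data_range(tight)
spread_range = data_range(spread)

# Quartiles taken from the ungrouped quartile example (Q1 = 25, Q3 = 53).
q_dev = quartile_deviation(25, 53)

print(tight_mean, spread_mean, tight_range, spread_range, q_dev)
```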

3. Standard Deviation

The standard deviation is the most commonly used measure of dispersion. It takes into account the deviation of every data value from the mean. The standard deviation is the root mean square (r.m.s.) of the deviations from the mean and is calculated as follows:

1. Calculate the mean of the data set.
2. Subtract the mean from each data value in the set. These values are called the deviations of the data values.
3. Square each of the deviations calculated in Step 2.
4. Take the mean of the squares calculated in Step 3.
5. Take the square root of the result of Step 4.

Example: Find the standard deviation of the data set of quiz scores.

Quiz Scores: 1, 5, 7, 7, 6, 8, 10, 9, 5, 10, 8

Definition: Standard Deviation

Let $x_1, x_2, x_3, \dots, x_n$ be $n$ observations with arithmetic mean $\bar{X}$. Then the standard deviation, $S$ (or $\sigma$), is

$$S = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2}$$

If $x_1, x_2, x_3, \dots, x_n$ occur with respective frequencies $f_1, f_2, f_3, \dots, f_n$, then

$$S = \sqrt{\frac{1}{N}\sum_{i=1}^{n} (X_i - \bar{X})^2 f_i} \quad \text{where } N = \sum_{i=1}^{n} f_i$$
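The five steps above, and the frequency-weighted version of the definition, can be sketched in Python. The quiz scores are taken as 1, 5, 7, 7, 6, 8, 10, 9, 5, 10, 8 (an assumed reconstruction). Because Step 4 takes the mean of the squared deviations, this is the population form, which is what `statistics.pstdev` computes, so it is used here as a cross-check.

```python
from collections import Counter
from math import sqrt
from statistics import pstdev


def standard_deviation(values):
    """Root mean square of the deviations from the mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n                        # Step 1
    deviations = [x - mean for x in values]       # Step 2
    squares = [d * d for d in deviations]         # Step 3
    mean_square = sum(squares) / n                # Step 4
    return sqrt(mean_square)                      # Step 5


def standard_deviation_freq(freq):
    """Same quantity from a mapping value -> frequency, with N = sum of f."""
    total = sum(freq.values())
    mean = sum(x * f for x, f in freq.items()) / total
    return sqrt(sum(f * (x - mean) ** 2 for x, f in freq.items()) / total)


# Assumed reconstruction of the quiz scores.
quiz_scores = [1, 5, 7, 7, 6, 8, 10, 9, 5, 10, 8]
sd_steps = standard_deviation(quiz_scores)
sd_freq = standard_deviation_freq(Counter(quiz_scores))
sd_library = pstdev(quiz_scores)

print(round(sd_steps, 3), round(sd_freq, 3), round(sd_library, 3))
```

Squaring the result gives the variance; a sample standard deviation that divides by n - 1 instead (`statistics.stdev`) would be slightly larger.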

For a grouped frequency distribution, the $x_i$ represent class midpoints (class-marks) in the above formula.

Using the above formula can be quite tedious, especially when large sets of data are involved. An equivalent but simpler formula is:

$$S = \sqrt{\frac{\sum_{i=1}^{n} X_i^2 f_i}{N} - \bar{X}^2}$$

Example: Determine the standard deviation of the classified data below:

Class:      11-15 16-20 21-25 26-30 31-35 36-40
Frequency:  7 5 4 6

Note: If there are several sets of data of the same size but with different standard deviations, then the set with the smallest standard deviation is said to have its observations most closely clustered around their arithmetic mean. This set of data has the lowest variability and is therefore the most consistent. Such a set of data is usually recommended for further analysis.

4. Variance

The variance is the square of the standard deviation, represented by $S^2$.

Exercises:

1. Find the standard deviation of the data set whose frequency distribution is given by:

Class    Frequency (f)
90-99    4
80-89    6
70-79    4
60-69    3
50-59
40-49

2. The lengths of 70 bars were measured and the following frequency distribution obtained:

Length x (mm):  1.2-1.4 1.5-1.7 1.8-2.0 2.1-2.3 2.4-2.6 2.7-2.9 3.0-3.2
Frequency f:    3 5 0 6 8 6

Find the mean and standard deviation of the data.

3. A set of 0 observations was found to have mean $\bar{X} = 40$ and $S = 5$. Subsequent verification revealed that two observations, 30 and 45, were wrong while the correct observations were 54 and 4. Determine the correct values of the mean and standard deviation if
a) the wrong values were discarded and not replaced
b) the wrong values were replaced with the correct ones.

4. The mean height of students in a class is 5 cm. The mean height of the boys is 58 cm. The mean height of the girls is 48 cm. Determine the percentage of boys in the class.

5. Find the variance of the following data:

Length x (cm):  18-26 27-35 36-44 45-53 54-62 63-71 72-80
Frequency f:    3 5 9 5 4
