Measures of Central Tendency: Mean, Median, Mode and Midrange

A measure of central tendency is a value that represents a typical or central entry of a data set. The four most commonly used measures of central tendency are:
> mean
> median
> mode
> midrange

The mean (arithmetic average) of a data set is the sum of the data entries divided by the number of entries.
> Population Mean: $\mu = \frac{\sum x}{N}$
> Sample Mean: $\bar{x} = \frac{\sum x}{n}$

Symbol       Description
$\sum$       Summation of the values (uppercase Greek letter sigma)
$x$          Variable that represents quantitative data entries
$N$          Number of data entries in a population
$n$          Number of data entries in a sample
$\mu$        Mean for a population (lowercase Greek letter mu)
$\bar{x}$    Mean for a sample (read "x bar")
$\sum x$     Summation of the values of x

Rounding Rule: round answers to one decimal place more than the number of decimal places in the original data.

Example 1: Find the mean of the array.
4, 3, 8, 9, 1, 7, 12

Example 2: Find the mean of the following numbers:
23, 25, 26, 29, 39, 42, 50

Example 3: Find the mean of the array.
2.0, 4.9, 6.5, 2.1, 5.1, 3.2, 16.6

The median of a data set is the middle data entry when the data set is sorted in ascending or descending order. If the data set has an even number of entries, the median is the mean of the two middle data entries. The abbreviation for the median is MD.
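The mean and median definitions above can be sketched in a few lines of Python. This is only an illustration: the helper names `mean` and `median` are chosen here, and the four-entry list used for the even case is made up rather than taken from the worksheet.

```python
def mean(data):
    """Sum of the data entries divided by the number of entries."""
    return sum(data) / len(data)


def median(data):
    """Middle entry of the sorted data, or the mean of the two
    middle entries when the number of entries is even."""
    ordered = sorted(data)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


whole_numbers = [23, 25, 26, 29, 39, 42, 50]
one_decimal = [2.0, 4.9, 6.5, 2.1, 5.1, 3.2, 16.6]

# Rounding rule: one more decimal place than the original data.
print(round(mean(whole_numbers), 1))   # 33.4
print(round(mean(one_decimal), 2))     # 5.77

print(median(whole_numbers))           # 29 (odd count: the middle entry)
print(median([1, 3, 5, 7]))            # 4.0 (even count, made-up data)
```

Sorting inside `median` means the caller does not have to order the data first, which mirrors the "sort, then pick the middle" wording of the definition.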
Example 1: Find the median.
2, 3, 4, 7, 8

Example 2: Find the median.
6, 7, 8, 9, 9, 10

The mode of a data set is the data entry that occurs with the greatest frequency. If no entry is repeated, the data set has no mode. If two entries occur with the same greatest frequency (the same number of times), each entry is a mode and the data set is called bimodal.
> Hint: putting the numbers in L1 and then sorting to see the groupings might make finding the mode easier.

Example 1: Find the mode.
1, 2, 1, 2, 2, 2, 1, 3, 3

Example 2: Find the mode.
0, 1, 2, 3, 4

Example 3: Find the mode.
4, 4, 6, 7, 8, 9, 6, 9

The midrange is the number exactly midway between the lowest value and the highest value of the data set. To find the midrange, add the highest and lowest data values and divide by 2 (averaging them). The abbreviation for midrange is MR.

Example 1: Find the midrange of the data set.
3, 3, 5, 6, 8

You try: Find the following for each set of numbers.
1. 1.5, 3.3, 8.4, 1.0, 2.6, 6.8, 2.1, 4.2, 5.3, 7.0, 8.4
   MD =          Mode =          MR =
2. 13, 20, 23, 28, 15, 20, 16, 15, 17, 23

Trimmed Mean: A disadvantage of the mean is that it can be affected by extremely high or low values. One way to make the mean more resistant to exceptional values, while keeping it sensitive to specific data values, is to compute a trimmed mean. Usually a 5% trimmed mean is used.

How to compute a 5% trimmed mean:
1. Order the data from smallest to largest.
2. Delete the bottom 5% of the data and the top 5% of the data. (Note: if 5% of the number of entries is not a whole number, round it to the nearest integer.)
3. Compute the mean of the remaining 90%.

Example: First compute the mean for the entire sample, then compute a 5% trimmed mean.
14 20 20 20 20 23 25 30 30 30
35 35 35 40 40 42 50 50 80 80

Mean for entire sample =
5% trimmed mean =

*Now find the median for the entire sample, then compute a 5% trimmed median. Is the trimmed mean or the original mean closer to the median?

A weighted mean (average) is the mean of a data set whose entries have varying weights. A weighted mean is given by
$\bar{x} = \frac{\sum (x \cdot w)}{\sum w}$
where w is the weight of each entry x.

Example 1: The chart below shows your scores on tests, projects, quizzes, and homework, and lists the weight of each. Compute the weighted average of your scores.

            Score, x    Weight, w    xw
Test        83          40%
Projects    95          15%
Quiz        90          30%
Homework    60          15%
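The trimmed-mean steps and the weighted mean can both be sketched in Python. The function names `trimmed_mean` and `weighted_mean` are illustrative, the weighted mean is taken here as the sum of x times w divided by the sum of the weights, and "round to the nearest integer" is assumed to mean rounding halves up.

```python
import math


def trimmed_mean(data, percent=5):
    """Delete `percent`% of the entries from each end of the sorted
    data, then return the mean of what remains."""
    ordered = sorted(data)
    # Entries to delete from each end, rounded half up (an assumption).
    k = math.floor(len(ordered) * percent / 100 + 0.5)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def weighted_mean(values, weights):
    """Sum of x*w divided by the sum of the weights."""
    total = sum(x * w for x, w in zip(values, weights))
    return total / sum(weights)


sample = [14, 20, 20, 20, 20, 23, 25, 30, 30, 30,
          35, 35, 35, 40, 40, 42, 50, 50, 80, 80]
print(sum(sample) / len(sample))   # mean of the entire sample
print(trimmed_mean(sample))        # 5% of 20 entries is 1 from each end

scores = [83, 95, 90, 60]          # test, projects, quiz, homework
weights = [40, 15, 30, 15]         # percentages
print(weighted_mean(scores, weights))
```

Dividing by the sum of the weights means the weights do not have to add up to 1 or to 100, so percentages can be entered as whole numbers.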
Example 2: Suppose your midterm test score is 83 and your final exam score is 95. Using weights of 40% for the midterm and 60% for the final exam, compute the weighted average of your scores. If the minimum average for an A is 90, will you earn an A?

Measures of Variation - Arrays

Example: Six batteries were chosen randomly from each of two different brands. The number of continuous burning hours for each battery is given. Find the mean for each brand.
Brand A: 10 60 50 30 40 20     Mean =
Brand B: 35 45 30 35 40 25     Mean =
What can you conclude about these two brands?

Now, let's compare them graphically:
[Graphs of Brand A and Brand B, each drawn on an axis running from 10 to 60 in steps of 5]
Can you still draw the same conclusion?

Even though the means are the same for both brands, the spread or variation is quite different. The graphs show that Brand B performs more consistently; it is less variable. For the spread or variability of a data set, three measures are commonly used: range, variance, and standard deviation.

The range of a data set is the difference between the maximum and minimum data entries in the set.
Range = maximum - minimum

The range for Brand A shows that 50 hours separate the largest value from the smallest value. The range for Brand B shows that 20 hours separate the largest value from the smallest value.

As a measure of variation, the range has the advantage of being easy to compute. Its disadvantage, however, is that it uses only 2 entries from the data set. Therefore, one extremely high or low data value can affect the range. Two measures of variation that use all the entries in a data set are variance and standard deviation. However, before you learn about these measures of variation, you need to know what is meant by the deviation of an entry in a data set.
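The battery comparison above can be checked with a short Python sketch. The helper name `data_range` and the extra 100-hour battery in the last line are illustrative additions, not part of the worksheet.

```python
brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]


def data_range(data):
    """Range = maximum - minimum."""
    return max(data) - min(data)


for name, hours in (("Brand A", brand_a), ("Brand B", brand_b)):
    mean_hours = sum(hours) / len(hours)
    print(name, "mean:", mean_hours, "range:", data_range(hours))
# Brand A mean: 35.0 range: 50
# Brand B mean: 35.0 range: 20

# One extreme value is enough to change the range (hypothetical entry).
print(data_range(brand_b + [100]))   # 75
```

The two brands share a mean of 35 hours, so the mean alone cannot tell them apart; the range is the first measure here that does.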
The deviation of an entry x in a sample data set is the difference between the entry and the mean of the data set.
Deviation of x = $x - \bar{x}$

Using the given data: a) find the mean. b) subtract the mean from each salary.

Salary (thousands of dollars), x     Deviation (thousands of dollars), $x - \bar{x}$
40
23
41
50
49
32
41
29
52
58

In this example, notice that the sum of the deviations is 0. Because this is true for any data set, it doesn't make sense to find the average of the deviations. To overcome this problem, you can square each deviation. In a sample data set, the sum of the squared deviations divided by n - 1 (one less than the number of entries) is called the sample variance. The sample standard deviation is the square root of the sample variance.

Variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Standard Deviation: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

Example: Find the variance and standard deviation.
6, 3, 8, 5, 3

x      $x - \bar{x}$      $(x - \bar{x})^2$

Same example - use the calculator:
1. Put the numbers in L1.
2. Run "stat, calc, one-variable stats, L1" and read the numbers. (Remember: you have to square the standard deviation to get the variance.)
Or:
1. 2nd stat
2. Math
3. 7:stdDev(L1)
4. Enter
Square this number to get the variance. Square the entire (unrounded) standard deviation, NOT the rounded version of your answer.
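The hand calculation for the example 6, 3, 8, 5, 3 can be traced step by step in Python. This is a sketch that assumes the sample formulas (dividing by n - 1); the variable names are chosen here for illustration.

```python
import math

data = [6, 3, 8, 5, 3]
n = len(data)
x_bar = sum(data) / n                     # mean: 25 / 5 = 5.0

# One deviation per entry, then its square (the columns of the table).
deviations = [x - x_bar for x in data]    # [1.0, -2.0, 3.0, 0.0, -2.0]
squares = [d ** 2 for d in deviations]    # [1.0, 4.0, 9.0, 0.0, 4.0]

variance = sum(squares) / (n - 1)         # 18 / 4 = 4.5
std_dev = math.sqrt(variance)             # square root of the variance

print(sum(deviations))       # 0.0, which is why the deviations are squared
print(variance)              # 4.5
print(round(std_dev, 1))     # 2.1
```

Each list corresponds to one column of a hand-built table, so the intermediate values can be compared against written work entry by entry.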
Variance using GDC:
1. 2nd stat
2. Math
3. 8:Variance(L1)

Example: Find the variance and standard deviation.
15, 8, 12, 5, 19, 14, 8, 16, 13

x      $x - \bar{x}$      $(x - \bar{x})^2$

Example: Find the variance and standard deviation.
90, 96, 68, 96, 72, 86, 92, 84, 98, 43, 94, 72, 63, 78, 35, 82, 84, 63, 90, 40, 94, 90, 86, 100
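Calculator results can be cross-checked with Python's standard `statistics` module. The sketch below assumes the calculator commands report the sample versions (dividing by n - 1), which is what `statistics.stdev` and `statistics.variance` compute. It also shows why the unrounded standard deviation should be squared.

```python
import statistics

data = [15, 8, 12, 5, 19, 14, 8, 16, 13]

s = statistics.stdev(data)             # sample standard deviation
s_squared = statistics.variance(data)  # sample variance

# Whole-number data, so report one decimal place.
print(round(s, 1), round(s_squared, 1))

# Squaring the rounded standard deviation drifts away from the variance;
# squaring the unrounded value recovers it.
print(round(s, 1) ** 2)
print(s ** 2)
```

For population data, the same module offers `statistics.pstdev` and `statistics.pvariance`, which divide by the number of entries instead.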