LESSON 3: CENTRAL TENDENCY


OUTLINE

- Arithmetic mean, median and mode
  - Ungrouped data
  - Grouped data
- Percentiles, fractiles, and quartiles
  - Ungrouped data
  - Grouped data

MEAN

The mean is defined as follows:

\[ \text{Mean} = \frac{\text{Sum of the measurements}}{\text{Number of measurements}} \]

In the following, the sample mean and the population mean are discussed separately. Note the difference in notation: the sample mean is denoted by \(\bar{X}\) and the population mean is denoted by \(\mu\). The number of values in a sample is denoted by \(n\) and the number of values in the population is denoted by \(N\).

MEAN

Mean of a data set:
- If the data set is a sample: sample mean
- If the data set is a population: population mean

SAMPLE MEAN

The sample mean is the sum of all the sample values divided by the number of sample values:

\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \]

where
- \(\bar{X}\) stands for the sample mean,
- \(n\) is the total number of values in the sample,
- \(X_i\) is the value of the \(i\)-th observation,
- \(\sum\) represents a summation.

SAMPLE MEAN

Statistic: a measurable characteristic of a sample.

A sample of five executives received the following bonuses last year: $12,000, $14,000, $18,000, $17,000, and $19,000. Find the average bonus for these five executives.

Since these values represent a sample of size 5, the sample mean is (12,000 + 14,000 + 18,000 + 17,000 + 19,000)/5 = 16,000, that is, $16,000.

SAMPLE MEAN

Sample mean approximated from grouped data:

\[ \bar{X} = \frac{\sum_{k} f_k X_k}{n} \]

where
- \(\bar{X}\) is the sample mean,
- \(f_k\) is the frequency of the \(k\)-th class interval,
- \(X_k\) is the midpoint of the \(k\)-th class interval,
- \(n\) is the total number of observations, \(n = \sum_{k} f_k\),
- \(\sum\) represents a summation.

SAMPLE MEAN

Compute the average days to maturity of 40 investments from the following frequency distribution:

Days to Maturity    Number of Investments
30-40               3
40-50               1
50-60               8
60-70               10
70-80               7
80-90               7
90-100              4

POPULATION MEAN

The population mean is the sum of all the population values divided by the number of population values:

\[ \mu = \frac{\sum_{i=1}^{N} X_i}{N} \]

where
- \(\mu\) stands for the population mean,
- \(N\) is the total number of values in the population,
- \(X_i\) is the value of the \(i\)-th observation,
- \(\sum\) represents a summation.
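The two sample-mean formulas (ungrouped and grouped) can be checked with a short script. This is a minimal sketch, not part of the original slides; the function names `sample_mean` and `grouped_mean` are illustrative choices, and the grouped version assumes every observation in a class sits at the class midpoint, which is why its result is only an approximation.

```python
# Ungrouped sample: the five executive bonuses from the example above.
bonuses = [12_000, 14_000, 18_000, 17_000, 19_000]


def sample_mean(values):
    """Sum of the sample values divided by the number of sample values."""
    return sum(values) / len(values)


# Grouped sample: (lower limit, upper limit, frequency) for each class
# of the days-to-maturity frequency distribution.
maturity_classes = [
    (30, 40, 3), (40, 50, 1), (50, 60, 8), (60, 70, 10),
    (70, 80, 7), (80, 90, 7), (90, 100, 4),
]


def grouped_mean(classes):
    """Approximate the mean as sum(f * midpoint) / sum(f)."""
    n = sum(f for _, _, f in classes)
    weighted_sum = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    return weighted_sum / n


print(sample_mean(bonuses))            # mean bonus
print(grouped_mean(maturity_classes))  # approximate mean days to maturity
```

With the data above, the script gives a mean bonus of 16,000 and an approximate mean of 68.5 days to maturity.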

POPULATION MEAN

Parameter: a measurable characteristic of a population.

The Keller family owns four cars. The following is the mileage attained by each car: 55,000, 25,000, 40,000, and 80,000. Find the average miles covered by each car.

The mean is (55,000 + 25,000 + 40,000 + 80,000)/4 = 50,000.

PROPERTIES OF MEAN

1. Data possessing an interval scale or a ratio scale usually have a mean.
2. All the values are included in computing the mean.
3. A set of data has a unique mean.
4. The mean is affected by unusually large or small data values.
5. The arithmetic mean is the only measure of central tendency where the sum of the deviations of each value from the mean is zero.
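As a quick check of the population mean and of property 5, the sketch below reuses the Keller family mileage figures. The variable names are illustrative and not from the slides.

```python
# Mileage of the four Keller family cars (the whole population).
mileage = [55_000, 25_000, 40_000, 80_000]

# Population mean: sum of all values divided by N.
mu = sum(mileage) / len(mileage)

# Property 5: the deviations from the mean sum to zero.
deviations = [x - mu for x in mileage]

print(mu)               # population mean
print(deviations)       # each value minus the mean
print(sum(deviations))  # total of the deviations
```

The mean comes out at 50,000 miles, and the four deviations (5,000, -25,000, -10,000, 30,000) cancel exactly.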

PROPERTIES OF MEAN

Consider the set of values 3, 8, and 4. The mean is 5. Illustrating the fifth property, (3 - 5) + (8 - 5) + (4 - 5) = -2 + 3 - 1 = 0. In other words,

\[ \sum_{i=1}^{n} (X_i - \bar{X}) = 0 \]

MEDIAN

Median: the midpoint of the values after they have been ordered from the smallest to the largest, or from the largest to the smallest. There are as many values above the median as below it in the data array. For an even number of values, the median is the arithmetic average of the two middle numbers. The median is denoted by m.

MEDIAN

After the data are ordered:
- If \(n\) is odd, the median is the \(\left(\frac{n+1}{2}\right)\)-th number.
- If \(n\) is even, the median is the arithmetic average of the \(\left(\frac{n}{2}\right)\)-th and \(\left(\frac{n}{2}+1\right)\)-th numbers.

MEDIAN

The median is the most appropriate measure of central location when the data under consideration are ranked data rather than quantitative data. For example, if 13 universities are ranked according to reputation, university 7 is the one of median reputation.

MEDIAN

The median is used when a few extreme values influence the mean too much. For example, one rich family may affect the mean income, so median income is often reported in place of mean income.

The median is also used when not all values are available. For example, in life testing the experiment may end before all values have been generated, so the mean cannot be calculated and the median is used instead.

MEDIAN

Compute the median for the following data. The ages of a sample of five college students are 21, 25, 19, 20, and 22.

Arranging the data in ascending order gives 19, 20, 21, 22, 25. Thus the median is 21.
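The odd/even rule for the median can be sketched as a small function. This is an illustrative implementation, not code from the slides; it uses 0-based list indices, so the 1-based position (n+1)/2 becomes index n // 2.

```python
def median(values):
    """Middle value of the ordered data; average of the two middle
    values when the number of observations is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)-th value in 1-based counting.
        return ordered[mid]
    # Even n: average of the (n / 2)-th and (n / 2 + 1)-th values.
    return (ordered[mid - 1] + ordered[mid]) / 2


ages = [21, 25, 19, 20, 22]
print(sorted(ages))
print(median(ages))
```

For the five ages, the ordered list is 19, 20, 21, 22, 25 and the function returns the third value, 21.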

MEDIAN

Compute the median for the following data. The heights of four basketball players, in inches, are 76, 73, 80, and 75.

Arranging the data in ascending order gives 73, 75, 76, 80. Thus the median is (75 + 76)/2 = 75.5.

MODE

The mode is the value of the observation that appears most frequently.

The mode is most useful when an important aspect of describing the data involves determining the number of times each value occurs.

If the data are qualitative (e.g., the number of graduates in mechanical, automotive, industrial, etc.), then the mode is useful (e.g., the modal class is mechanical).

For grouped data, the mode is the midpoint of the class interval with the highest frequency.

MODE

EXAMPLE: The exam scores for ten students are 81, 93, 84, 75, 68, 87, 81, 75, 81, 87. The score 81 occurs three times, more often than any other score, so the modal score = 81.

MODE

Find the mode for the following grouped data on days to maturity of 40 investments:

Days to Maturity    Number of Investments
30-40               3
40-50               1
50-60               8
60-70               10
70-80               7
80-90               7
90-100              4
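Both mode calculations can be sketched in a few lines. The helper names `modes` and `grouped_mode` are illustrative; `modes` returns a list so that bimodal data are reported in full, and `grouped_mode` assumes a single class has the highest frequency (on a tie, Python's `max` would return the first such class).

```python
from collections import Counter

scores = [81, 93, 84, 75, 68, 87, 81, 75, 81, 87]


def modes(values):
    """All values that occur with the highest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


# (lower limit, upper limit, frequency) for the days-to-maturity classes.
maturity_classes = [
    (30, 40, 3), (40, 50, 1), (50, 60, 8), (60, 70, 10),
    (70, 80, 7), (80, 90, 7), (90, 100, 4),
]


def grouped_mode(classes):
    """Midpoint of the class interval with the highest frequency."""
    lo, hi, _ = max(classes, key=lambda c: c[2])
    return (lo + hi) / 2


print(modes(scores))
print(grouped_mode(maturity_classes))
```

For the exam scores the only mode is 81. For the grouped data the modal class is 60-70, with 10 investments, so the mode is its midpoint, 65.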

BIMODAL HISTOGRAM

[Figure: bimodal histogram. Vertical axis: Frequency (0 to 8). Horizontal axis: Number of Units Sold (14 to 26).]

MEAN, MEDIAN, MODE

- Mean: affected by unusually large or small data values; may be used if the data are quantitative (ratio or interval scale).
- Median: most appropriate if the data are ranked (ordinal scale).
- Mode: most appropriate if the data are qualitative (nominal scale).

Appropriate measures by scale of the data:
- Ratio or interval scale: mean, median, mode
- Ordinal scale: median, mode
- Nominal scale: mode

FINDING MEDIAN AND MODE FROM AN ORDERED STEM-AND-LEAF PLOT

Find the median and mode from the following ordered stem-and-leaf plot on days to maturity of 40 investments:

Stem | Leaves
3    | 1 8 9
4    | 7
5    | 0 1 1 3 5 5 6 7
6    | 0 2 3 4 4 5 6 7 8 9
7    | 0 0 0 1 5 8 9
8    | 0 1 3 5 6 7 9
9    | 5 8 9 9

RELATIVE VALUES OF MEAN, MEDIAN, MODE

- Mode < Median < Mean if the distribution is positively skewed.
- Mode = Median = Mean if the distribution is symmetric.
- Mean < Median < Mode if the distribution is negatively skewed.
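The stem-and-leaf exercise above can be checked by expanding the plot back into raw values. This is a sketch under the assumption that each stem is the tens digit and each leaf the units digit; the dictionary layout is an illustrative choice.

```python
from collections import Counter
from statistics import median

# Ordered stem-and-leaf plot: stem (tens digit) -> leaves (units digits).
stem_leaf = {
    3: [1, 8, 9],
    4: [7],
    5: [0, 1, 1, 3, 5, 5, 6, 7],
    6: [0, 2, 3, 4, 4, 5, 6, 7, 8, 9],
    7: [0, 0, 0, 1, 5, 8, 9],
    8: [0, 1, 3, 5, 6, 7, 9],
    9: [5, 8, 9, 9],
}

# Rebuild the 40 observations; they come out already in ascending order.
days = [10 * stem + leaf for stem, leaves in stem_leaf.items() for leaf in leaves]

# n = 40 is even, so the median averages the 20th and 21st values.
med = median(days)

# Most frequent value and how often it occurs.
mode_value, mode_count = Counter(days).most_common(1)[0]

print(len(days), med, mode_value, mode_count)
```

The 20th and 21st ordered values are 67 and 68, so the median is 67.5 days; the value 70 occurs three times and is the mode.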

RELATIVE STANDING: PERCENTILES, FRACTILES, QUARTILES

Percentiles divide the distribution into 100 groups. A percentile is a point below which a stated percentage of observations lie. The p-th percentile is a point below which p% of the values lie. For example, if the 78th percentile of GMAT scores is 600, then 78% of scores are below 600.

Percentiles are not unique. For example, if 78% of scores are below 600 and 82% of scores are below 610, then the 78th percentile may be any point above 600 and below 610.

RELATIVE STANDING: PERCENTILES, FRACTILES, QUARTILES

An alternative to the percentile is the fractile. A fractile is a point below which a stated fraction of observations lie. The d fractile, \(Q_d\), is a point below which 100d% of the values lie. For example, if the 0.78 fractile of GMAT scores is 600, then 78% of scores are below 600. The alternative to the 78th percentile is the 0.78 fractile. In this case, \(Q_{0.78} = 600\).

RELATIVE STANDING: PERCENTILES, FRACTILES, QUARTILES

Quartiles divide data into four groups of equal frequency.
- First quartile = 25th percentile = 0.25 fractile = \(Q_{0.25}\)
- Second quartile = 50th percentile = 0.50 fractile = \(Q_{0.50}\) = median
- Third quartile = 75th percentile = 0.75 fractile = \(Q_{0.75}\)

RELATIVE STANDING: PROCEDURE TO FIND A GIVEN PERCENTILE

Procedure to find \(Q_d\). Assume \(\frac{1}{n+1} \le d \le \frac{n}{n+1}\).

Step 1: Sort the data in ascending order (low to high): \(X_1, X_2, X_3, \ldots, X_n\).

Step 2: Find \(k\), the largest integer such that \(k \le (n+1)d\).

Step 3: Compute the d fractile, i.e., the 100d-th percentile, as follows:

\[ Q_d = X_k + \left[(n+1)d - k\right]\left(X_{k+1} - X_k\right) \]

Note: Step 3 finds the required percentile by interpolating between \(X_k\) and \(X_{k+1}\).

RELATIVE STANDING: PROCEDURE TO FIND A GIVEN PERCENTILE

Example: Consider the data set 2, 3, 5, 6, 8, 10, 12, 15, 18, 20. Find the 20th percentile.

Note: the data set is already ordered, so Step 1 is not necessary.

Step 2: Find the largest integer \(k \le (n+1)d = 11 \times 0.20 = 2.2\). So, \(k = 2\).

Step 3: Compute

\[ Q_{0.20} = X_k + \left[(n+1)d - k\right]\left(X_{k+1} - X_k\right) = 3 + (2.2 - 2)(5 - 3) = 3.4 \]

RELATIVE STANDING: PROCEDURE TO FIND A GIVEN PERCENTILE

Example: Consider the data set 2, 3, 5, 6, 8, 10, 12, 15, 18, 20. Find the 75th percentile.

Note: the data set is already ordered, so Step 1 is not necessary.

Step 2: Find the largest integer \(k \le (n+1)d = 11 \times 0.75 = 8.25\). So, \(k = 8\).

Step 3: Compute

\[ Q_{0.75} = X_k + \left[(n+1)d - k\right]\left(X_{k+1} - X_k\right) = 15 + (8.25 - 8)(18 - 15) = 15.75 \]
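The three-step procedure can be sketched as a function. The name `fractile` and the index symbol `k` are illustrative choices; the code follows the (n + 1)d interpolation rule described above, which differs from the default percentile rule in some software packages. Floating-point rounding at positions that are exact integers is not handled specially in this sketch.

```python
import math


def fractile(values, d):
    """d fractile (100d-th percentile) by the (n + 1) * d position rule."""
    x = sorted(values)                       # Step 1: sort ascending
    n = len(x)
    if not 1 / (n + 1) <= d <= n / (n + 1):
        raise ValueError("d must lie between 1/(n+1) and n/(n+1)")
    position = (n + 1) * d
    k = math.floor(position)                 # Step 2: largest integer <= (n+1)d
    if k == n:
        return float(x[-1])                  # position is exactly n, nothing to interpolate
    # Step 3: interpolate between X_k and X_(k+1); x[k - 1] is X_k (1-based).
    return x[k - 1] + (position - k) * (x[k] - x[k - 1])


data = [2, 3, 5, 6, 8, 10, 12, 15, 18, 20]
print(fractile(data, 0.20))   # 20th percentile, about 3.4
print(fractile(data, 0.75))   # 75th percentile, 15.75
```

For the 20th percentile the position is 2.2, so the result lies two tenths of the way from the 2nd value (3) to the 3rd value (5). For the 75th percentile the position is 8.25, a quarter of the way from 15 to 18.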

RELATIVE STANDING: PROCEDURE TO FIND A GIVEN PERCENTILE

Find the 80th percentile from the following ordered stem-and-leaf plot on days to maturity of 40 investments:

Stem | Leaves
3    | 1 8 9
4    | 7
5    | 0 1 1 3 5 5 6 7
6    | 0 2 3 4 4 5 6 7 8 9
7    | 0 0 0 1 5 8 9
8    | 0 1 3 5 6 7 9
9    | 5 8 9 9

RELATIVE STANDING: GROUPED DATA PROCEDURE TO FIND A GIVEN PERCENTILE

For grouped data, read the percentiles directly from the graph of the cumulative relative frequency distribution.

Find the 80th percentile from the graph of the cumulative relative frequency distribution shown on the next slide, constructed from the data on days to maturity of 40 investments.
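Reading a percentile off a cumulative relative frequency graph amounts to interpolating between plotted points. The sketch below rebuilds those points from the days-to-maturity frequency distribution and interpolates linearly; the straight-line segments between class boundaries and the function names are assumptions of this sketch, not something the slides specify.

```python
# (upper class boundary, class frequency) for the days-to-maturity data.
classes = [(40, 3), (50, 1), (60, 8), (70, 10), (80, 7), (90, 7), (100, 4)]
lowest_boundary = 30


def ogive_points(lowest, classes):
    """(boundary, cumulative relative frequency) pairs, starting at 0."""
    total = sum(f for _, f in classes)
    points, running = [(lowest, 0.0)], 0
    for upper, f in classes:
        running += f
        points.append((upper, running / total))
    return points


def percentile_from_ogive(points, d):
    """Linearly interpolate the value whose cumulative frequency is d."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= d <= c1 and c1 > c0:
            return x0 + (d - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("d must lie between 0 and 1")


points = ogive_points(lowest_boundary, classes)
print(points)
print(percentile_from_ogive(points, 0.80))  # roughly 84.3 days
```

The cumulative relative frequency is 0.725 at 80 days and 0.900 at 90 days, so the 80th percentile falls in the 80-90 class, at roughly 84.3 days.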

OGIVE: CUMULATIVE RELATIVE FREQUENCY GRAPH

[Figure: ogive. Vertical axis: Cumulative Frequency (0.000 to 1.000). Horizontal axis: Number of Days to Maturity (40 to 100).]

Plotted values:

Days to Maturity    Cumulative Relative Frequency
40                  0.075
50                  0.100
60                  0.300
70                  0.550
80                  0.725
90                  0.900
100                 1.000

READING AND EXERCISES

Lesson 3
Reading: Section 2-2, pp. 38-47
Exercises: 2-18, 2-20 (and 2-4a), 2-26