Teaching univariate measures of location-using loss functions


Original Article

Jon-Paul Paolino
Department of Mathematics and Computer Sciences, Mercy College, Dobbs Ferry, NY 10522, USA

Summary

This article presents a new method for introductory teaching of the sample mean, median and mode(s) from a univariate dataset. These basic statistical concepts are taught at various levels of education, from elementary school curricula to courses at the tertiary level. These descriptive measures of location can be taught as optimized solutions to certain loss functions. Although proving these results requires some understanding of derivatives as used in a first year calculus course, the attained insight is valuable for higher level statistical thinking. Using the statistical computing software R, we visually illustrate the minimization of these loss functions using some example datasets.

Keywords: teaching statistics; mean; median; mode; loss function.

BACKGROUND AND MOTIVATION

The sample mean, median and mode(s) are descriptive measures of location that are introduced to students in elementary school and are also taught at the tertiary level across the world. Indeed, many tertiary level students need at least one introductory course in statistics to complete an undergraduate degree. At the elementary school level, the prerequisite for successfully calculating the sample mean, median and mode(s) is a basic working knowledge of counting and arithmetic. In the introductory tertiary level statistics course, this skill set is still assumed, as these skills are considered necessary for tertiary school admittance. Some institutions, however, offer introductory statistics classes that take differential and integral calculus as a prerequisite to enrollment. These courses teach topics such as probability as area under a curve and simple linear regression using least squares minimization.
The new teaching method proposed in this article is most suitable for students in tertiary learning environments, as it assumes a higher level of mathematical maturity. This article also demonstrates the application of this method using several univariate datasets with varying distributional characteristics. Using R (R Core Team 2017), we show graphically how each of these three measures of location optimizes a particular loss function.

REVIEW OF THE MEASURES OF LOCATION FOR A SAMPLE

Sample mean

The sample mean is a descriptive measure that is calculated by adding all of the observed values and then dividing by the total number of observations. It is also commonly known as the average. The mean is also called the location of balance because the sum of the deviations above the mean equals the sum of the deviations below the mean. It can be calculated using the formula below, where $x_i$ represents the ith observed data point from a sample of size n.

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}.$$

Sample median

The sample median is a descriptive measure that separates the bottom 50% of the data from the top 50% of the data. The median is also referred to as the 50th percentile or the second quartile. Typically, it is taught that the data must be arranged in increasing order, where $x_{(i)}$ indicates the ith ordered data value. The formula below can be used to determine the location of the median:

$$\tilde{x} = \begin{cases} x_{\left(\frac{n+1}{2}\right)} \\[4pt] \frac{1}{2}\left( x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)} \right) \end{cases}$$

The sample size n determines the location of the median in the dataset. If n is odd, then the top formula must be used. If n is even, then the bottom formula must be used. Calculating the median is certainly not as onerous as calculating the mean.

Sample mode(s)

A sample mode is a descriptive measure, defined as one or more values that occur most frequently in the dataset. A dataset can have multiple modes, since there may be ties for the most frequently occurring data values. A mode can be used for qualitative or quantitative variables, unlike the mean and median, which can only be used for quantitative variables. Unless at least one data value occurs more than once, there is no mode. It must be remembered that if quantitative data are grouped, as in a histogram or stem-and-leaf presentation, the appearance of the graph depends on the grouping choice, and hence the most commonly occurring group is not uniquely defined. In this article, we use the symbol $\overset{\frown}{x}$ (x with an arc on top) to indicate a sample mode. The calculation of sample mode(s) is rudimentary.

MEASURES OF LOCATION EXPRESSED AS LOSS FUNCTIONS

Mean as the balancing point of a distribution

As mentioned previously, the sample mean is the balancing point of the distribution. The function below can be used to illustrate this point of balance (i.e. where the deviation above equals the deviation below):

$$f(c) = \sum_{i=1}^{n} (x_i - c).$$

Subsequently, f(c) is set equal to zero and solved for c. Using basic summation properties, $\sum_{i=1}^{n} x_i - nc = 0$ gives $c = \bar{x}$, so the sample mean is the point that makes the deviation scores sum to zero.

Mean as the minimizer of the squared loss deviations

The sample mean can be shown to be the point that minimizes the squared loss function. This point of minimization can be found either by setting the first derivative from calculus equal to zero or by using the vertex formula for parabolas. The squared loss function is shown below:

$$f(c) = \sum_{i=1}^{n} (x_i - c)^2.$$

Differentiating gives $f'(c) = -2\sum_{i=1}^{n} (x_i - c)$, which is zero when $c = \bar{x}$; since $f''(c) = 2n > 0$, this point is a minimum.

Median as the minimizer of the absolute loss deviations

Next, the sample median minimizes the absolute loss function. This property is more difficult to show because it involves taking the derivative of an absolute value function. Consider the absolute loss function below,

$$f(c) = \sum_{i=1}^{n} |x_i - c|.$$

Taking the first derivative with respect to c gives $-\sum_{i=1}^{n} \operatorname{sgn}(x_i - c)$ wherever c differs from every data value. The next step is to find the value of c that makes the summation equal to zero. Upon inspection of $x_i - c$, it becomes clear that c should equal the sample median. When $c = \tilde{x}$, $\operatorname{sgn}(x_i - c)$ equals $+1$ the same number of times as it equals $-1$, because the median separates the bottom 50% of the data from the top 50% of the data. It is also noteworthy that when the sample size is even, there may be multiple points around the median that minimize the sum of absolute deviations. This occurs because the derivative is zero everywhere between the two middle ordered values, so the absolute loss function does not attain a unique minimum.

Table 1. Sequence datasets with descriptive statistics used for demonstration

Dataset (#)   Sample                                   Sample mean   Sample median   Sample mode(s)
1             {2,3,3,4,4,4,5,5,5,5,6,6,6,7,7,8}        5             5               5
2             {3,3,4,4,4,4,4,5,5,5,6,6,6,6,6,7,7}      5             5               4 and 6
3             {1,1,1,1,2,2,2,3,3,4,4,4,5,5,5,5}        3             3               1 and 5
4             {0,1,2,4,5,8,8,8,9,9,9,9,9,10,10,11}     7             8.5             9
5             {0,1,1,2,2,2,2,2,3,3,3,6,7,9,10,11}      4             2.5             2
6             {0,0,0,0,0,0,0,1,1,1,1,1,2,2,2,3,3}      1             1               0
7             {0,0,1,1,1,2,2,2,2,2,3,3,3,3,3,3,3}      2             2               3
8             {0,1,2,3,4,5,6,7,8,9,10}                 5             5               No Mode
9             {0,1,2,3,4,5,6,7,8,9}                    4.5           4.5             No Mode
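The balancing and minimization properties of the mean and median described above can be checked numerically. The following is a minimal sketch and is not part of the original article: it uses Dataset 5 as listed in Table 1 and base R's `optimize()`, and the helper names `balance`, `squared_loss` and `absolute_loss` are illustrative assumptions.

```r
# Dataset 5 as transcribed in Table 1 (positively skewed, n = 16).
x <- c(0, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 6, 7, 9, 10, 11)

# Illustrative helper names (not from the article).
balance       <- function(c0) sum(x - c0)        # sum of deviations about c0
squared_loss  <- function(c0) sum((x - c0)^2)    # squared loss at c0
absolute_loss <- function(c0) sum(abs(x - c0))   # absolute loss at c0

# The deviations about the sample mean sum to zero.
balance(mean(x))

# Numerical minimizer of the squared loss, to be compared with mean(x).
opt_sq <- optimize(squared_loss, interval = range(x))
c(numerical = opt_sq$minimum, sample_mean = mean(x))

# At the median, the +1 and -1 signs cancel.
sum(sign(x - median(x)))

# With n even, the absolute loss is flat between the two middle ordered
# values (here 2 and 3), so every c in that interval is a minimizer.
sapply(c(1.9, 2, 2.5, 3, 3.1), absolute_loss)
```

The last line evaluates the absolute loss just outside, at the ends of, and inside the interval between the two middle values, which makes the non-unique minimum for even n visible without any calculus.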

Mode as the minimizer of the zero-one loss indicator function

Next, we explain how a sample mode minimizes the zero-one loss indicator function. However, there are some drawbacks when explaining the loss function that it optimizes. Consider the zero-one loss indicator function below,

$$f(c) = \sum_{i=1}^{n} I_{x_i \neq c}, \quad \text{where} \quad I_{x_i \neq c} = \begin{cases} 1 & \text{if } x_i \neq c \\ 0 & \text{if } x_i = c. \end{cases}$$

The main shortcoming is that the indicator function, as shown above, is not continuous, and therefore, it is not differentiable. Consequently, the minimum value cannot be found by simply differentiating the function and setting the derivative equal to zero. In order for f(c) to be as small as possible, $I_{x_i \neq c}$ must be zero as many times as possible, and this can be achieved by letting $c = \overset{\frown}{x}$. Choosing the data value that occurs most frequently will yield the greatest number of observed zeros and therefore minimize f(c).

ILLUSTRATIVE EXAMPLES

Using some example datasets, we visually explain the loss function minimization principles using R software. These datasets are presented in Table 1. The first three datasets (Dataset 1, Dataset 2 and Dataset 3) are symmetric distributions. However, these datasets differ in that Dataset 1 is unimodal, Dataset 2 is bimodal, and Dataset 3 is bimodal with the modes at the extreme ends of the distribution. The next two datasets (Dataset 4 and Dataset 5) are distributions that are negatively skewed and positively skewed, respectively. Dataset 6 resembles an exponential distribution, and Dataset 7 is its mirror image, sometimes referred to as a J-shaped distribution. Finally, Dataset 8 and Dataset 9 are both uniform distributions, with an odd and even number of data points, respectively.

We used R to generate nine figures, one for each dataset, and each figure contains three separate panels. The left panel shows the squared loss function, the centre panel shows the absolute loss function, and the right panel shows the zero-one loss function. Each graph shows the location where the loss function attains its minimal value, all of which correspond with the values shown in Table 1.

Fig. 1. Graphical displays of loss functions for Dataset 1. Left panel: Squared Loss. Center panel: Absolute Loss.
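Because the zero-one loss cannot be minimized by differentiation, it can instead be evaluated directly at each distinct data value. The sketch below is an illustration rather than the author's code: it uses Dataset 2 and Dataset 8 as listed in Table 1, and `zero_one_loss` and `sample_modes` are hypothetical helper names. It follows the article's convention that a dataset in which no value repeats has no mode.

```r
# Zero-one loss: the number of observations that differ from the candidate c0.
zero_one_loss <- function(c0, x) sum(x != c0)

# Hypothetical helper: return the value(s) that minimize the zero-one loss.
sample_modes <- function(x) {
  candidates <- sort(unique(x))
  losses <- sapply(candidates, zero_one_loss, x = x)
  # If every value occurs exactly once, the loss is n - 1 everywhere: no mode.
  if (min(losses) == length(x) - 1) return(numeric(0))
  candidates[losses == min(losses)]
}

# Dataset 2 from Table 1 (symmetric, bimodal).
dataset2 <- c(3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 6, 6, 6, 6, 6, 7, 7)
losses2 <- sapply(sort(unique(dataset2)), zero_one_loss, x = dataset2)
losses2                   # loss at c = 3, 4, 5, 6, 7
sample_modes(dataset2)    # both minimizers are returned

# Dataset 8 from Table 1 (uniform): no value repeats, so no mode is returned.
dataset8 <- 0:10
sample_modes(dataset8)
```

Only observed data values need to be checked as candidates, since any c that is not in the dataset gives the maximal loss n.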

Considering Dataset 1, the mean, median and mode of the dataset all equal 5, which corresponds to the minima of the loss functions shown in Figure 1. Considering Dataset 5, the mean, median and mode of the dataset are 4, 2.5 and 2, respectively (Figures 2–4), which correspond to the minima of the loss functions shown in Figure 5.

Fig. 2. Graphical displays of loss functions for Dataset 2. Left panel: Squared Loss. Center panel: Absolute Loss.

Fig. 3. Graphical displays of loss functions for Dataset 3. Left panel: Squared Loss. Center panel: Absolute Loss.

Fig. 4. Graphical displays of loss functions for Dataset 4. Left panel: Squared Loss. Center panel: Absolute Loss.

Fig. 5. Graphical displays of loss functions for Dataset 5. Left panel: Squared Loss. Center panel: Absolute Loss.

Fig. 6. Graphical displays of loss functions for Dataset 6. Left panel: Squared Loss. Center panel: Absolute Loss.

Fig. 7. Graphical displays of loss functions for Dataset 7. Left panel: Squared Loss. Center panel: Absolute Loss.

Fig. 8. Graphical displays of loss functions for Dataset 8. Left panel: Squared Loss. Center panel: Absolute Loss.

Fig. 9. Graphical displays of loss functions for Dataset 9. Left panel: Squared Loss. Center panel: Absolute Loss.

DISCUSSION

In this article, we discuss a different approach to teaching the mean, median and mode of a sample of data using loss function minimization. We explain how the sample mean and the sample median can be derived using differential calculus techniques, which are taught in a first year calculus course. The graphs can also be used to demonstrate why the sample mode(s) should not be considered a measure of centre. In addition, the sample mode(s) can be poorly defined if other graphs such as histograms are used to present the data. We use R to explain these properties visually. Technology can often be a beneficial pedagogical resource in a statistics class, and its efficacy is shown in this article (Figures 6–9). The classical approaches to teaching measures of location, similar to those seen in an elementary statistics class, show how to compute the mean, median and mode(s) by hand, but these approaches typically do not include a demonstration such as the examples given in this article.

Using the loss function optimization methods discussed in this article offers additional insight when teaching the properties of the mean, median and mode. Finally, we hope that understanding these principles will help develop higher order statistical thinking skills that may become useful in future studies.

Reference

R Development Core Team (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Available at: org/

Appendix A: R Code Used for Producing Figures

The following R code can be used to produce the figures displayed in this article. We provide the code for Figure 1.

# Data values used in this code: 3, 4, 4, 5, 5, 5, 6, 6, 7
# Save the current graphics settings and draw three panels side by side.
option <- par(mfrow = c(1, 3))

# Left panel: squared loss as a function of c (plotted as x by curve()).
mean.1 = curve((3-x)^2 + (4-x)^2 + (4-x)^2 + (5-x)^2 + (5-x)^2 + (5-x)^2 +
               (6-x)^2 + (6-x)^2 + (7-x)^2,
               0, 10, n = 101, add = FALSE, type = "l",
               xlab = "c", ylab = "Squared Loss",
               cex.lab = 1.5, cex.axis = 1.5, cex.sub = 1.5)

# Centre panel: absolute loss.
median.1 = curve(abs(3-x) + abs(4-x) + abs(4-x) + abs(5-x) + abs(5-x) + abs(5-x) +
                 abs(6-x) + abs(6-x) + abs(7-x),
                 0, 10, n = 101, add = FALSE, type = "l",
                 xlab = "c", ylab = "Absolute Loss",
                 cex.lab = 1.5, cex.axis = 1.5, cex.sub = 1.5)

# Right panel: zero-one loss, equal to 9 except at the observed data values.
x <- seq(0, 10, 0.01)
mode.1 <- (x >= 0 & x < 3) * 9 + (x == 3) * 8 + (x > 3 & x < 4) * 9 +
          (x == 4) * 7 + (x > 4 & x < 5) * 9 + (x == 5) * 6 +
          (x > 5 & x < 6) * 9 + (x == 6) * 7 + (x > 6 & x < 7) * 9 +
          (x == 7) * 8 + (x > 7 & x <= 10) * 9
plot(x, mode.1, type = "l", cex = 1.5, xlab = "c", ylab = "0-1 Loss",
     cex.lab = 1.5, cex.axis = 1.5, cex.sub = 1.5)

# Restore the previous graphics settings.
par(option)
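The appendix code writes out one term per observation, so it has to be retyped for each dataset. As a possible alternative, the sketch below evaluates the same three loss functions for any numeric vector. It is not the author's code: `loss_curves` and `plot_losses` are hypothetical helper names, and the grid `(0:1000) / 100` is an assumption chosen so that integer data values fall exactly on grid points, which the zero-one loss requires.

```r
# Hypothetical helper: evaluate the three loss functions on a grid of candidates.
loss_curves <- function(x, grid = (0:1000) / 100) {
  data.frame(
    cand     = grid,
    squared  = sapply(grid, function(c0) sum((x - c0)^2)),
    absolute = sapply(grid, function(c0) sum(abs(x - c0))),
    zero_one = sapply(grid, function(c0) sum(x != c0))
  )
}

# Hypothetical helper: draw the three panels side by side, as in the figures.
plot_losses <- function(x, grid = (0:1000) / 100) {
  curves <- loss_curves(x, grid)
  old <- par(mfrow = c(1, 3))
  on.exit(par(old))  # restore the graphics settings on exit
  plot(curves$cand, curves$squared,  type = "l", xlab = "c", ylab = "Squared Loss")
  plot(curves$cand, curves$absolute, type = "l", xlab = "c", ylab = "Absolute Loss")
  plot(curves$cand, curves$zero_one, type = "l", xlab = "c", ylab = "0-1 Loss")
  invisible(curves)
}

# Dataset 1 as listed in Table 1.
dataset1 <- c(2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8)
curves1 <- loss_curves(dataset1)

# Grid point at which each loss is smallest.
curves1$cand[which.min(curves1$squared)]
curves1$cand[which.min(curves1$absolute)]
curves1$cand[which.min(curves1$zero_one)]

# Draw the panels only in an interactive session.
if (interactive()) plot_losses(dataset1)
```

Note that `which.min()` reports only the first minimizing grid point, so for bimodal datasets or for an even sample size with distinct middle values the full set of minimizers would need `curves$cand[curves$loss == min(curves$loss)]` applied to the relevant column.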


More information

Algebra II Quadratic Functions

Algebra II Quadratic Functions 1 Algebra II Quadratic Functions 2014-10-14 www.njctl.org 2 Ta b le o f C o n te n t Key Terms click on the topic to go to that section Explain Characteristics of Quadratic Functions Combining Transformations

More information

TMTH 3360 NOTES ON COMMON GRAPHS AND CHARTS

TMTH 3360 NOTES ON COMMON GRAPHS AND CHARTS To Describe Data, consider: Symmetry Skewness TMTH 3360 NOTES ON COMMON GRAPHS AND CHARTS Unimodal or bimodal or uniform Extreme values Range of Values and mid-range Most frequently occurring values In

More information

Unit 7 Statistics. AFM Mrs. Valentine. 7.1 Samples and Surveys

Unit 7 Statistics. AFM Mrs. Valentine. 7.1 Samples and Surveys Unit 7 Statistics AFM Mrs. Valentine 7.1 Samples and Surveys v Obj.: I will understand the different methods of sampling and studying data. I will be able to determine the type used in an example, and

More information

CHAPTER 2 DESCRIPTIVE STATISTICS

CHAPTER 2 DESCRIPTIVE STATISTICS CHAPTER 2 DESCRIPTIVE STATISTICS 1. Stem-and-Leaf Graphs, Line Graphs, and Bar Graphs The distribution of data is how the data is spread or distributed over the range of the data values. This is one of

More information

Section 1.2. Displaying Quantitative Data with Graphs. Mrs. Daniel AP Stats 8/22/2013. Dotplots. How to Make a Dotplot. Mrs. Daniel AP Statistics

Section 1.2. Displaying Quantitative Data with Graphs. Mrs. Daniel AP Stats 8/22/2013. Dotplots. How to Make a Dotplot. Mrs. Daniel AP Statistics Section. Displaying Quantitative Data with Graphs Mrs. Daniel AP Statistics Section. Displaying Quantitative Data with Graphs After this section, you should be able to CONSTRUCT and INTERPRET dotplots,

More information

Lecture 6: Chapter 6 Summary

Lecture 6: Chapter 6 Summary 1 Lecture 6: Chapter 6 Summary Z-score: Is the distance of each data value from the mean in standard deviation Standardizes data values Standardization changes the mean and the standard deviation: o Z

More information

Integrated Math B. Syllabus. Course Overview. Course Goals. Math Skills

Integrated Math B. Syllabus. Course Overview. Course Goals. Math Skills Syllabus Integrated Math B Course Overview Integrated Math is a comprehensive collection of mathematical concepts designed to give you a deeper understanding of the world around you. It includes ideas

More information

UNIT 3. Chapter 5 MEASURES OF CENTRAL TENDENCY. * A central tendency is a single figure that represents the whole mass of data.

UNIT 3. Chapter 5 MEASURES OF CENTRAL TENDENCY. * A central tendency is a single figure that represents the whole mass of data. UNIT 3 Chapter 5 MEASURES OF CENTRAL TENDENCY Points to Remember * A central tendency is a single figure that represents the whole mass of data. * Arithmetic mean or mean is the number which is obtained

More information

This lesson is designed to improve students

This lesson is designed to improve students NATIONAL MATH + SCIENCE INITIATIVE Mathematics g x 8 6 4 2 0 8 6 4 2 y h x k x f x r x 8 6 4 2 0 8 6 4 2 2 2 4 6 8 0 2 4 6 8 4 6 8 0 2 4 6 8 LEVEL Algebra or Math in a unit on function transformations

More information

IT 403 Practice Problems (1-2) Answers

IT 403 Practice Problems (1-2) Answers IT 403 Practice Problems (1-2) Answers #1. Using Tukey's Hinges method ('Inclusionary'), what is Q3 for this dataset? 2 3 5 7 11 13 17 a. 7 b. 11 c. 12 d. 15 c (12) #2. How do quartiles and percentiles

More information

No. of blue jelly beans No. of bags

No. of blue jelly beans No. of bags Math 167 Ch5 Review 1 (c) Janice Epstein CHAPTER 5 EXPLORING DATA DISTRIBUTIONS A sample of jelly bean bags is chosen and the number of blue jelly beans in each bag is counted. The results are shown in

More information

CCSSM Curriculum Analysis Project Tool 1 Interpreting Functions in Grades 9-12

CCSSM Curriculum Analysis Project Tool 1 Interpreting Functions in Grades 9-12 Tool 1: Standards for Mathematical ent: Interpreting Functions CCSSM Curriculum Analysis Project Tool 1 Interpreting Functions in Grades 9-12 Name of Reviewer School/District Date Name of Curriculum Materials:

More information

Getting to Know Your Data

Getting to Know Your Data Chapter 2 Getting to Know Your Data 2.1 Exercises 1. Give three additional commonly used statistical measures (i.e., not illustrated in this chapter) for the characterization of data dispersion, and discuss

More information

Applied Calculus. Lab 1: An Introduction to R

Applied Calculus. Lab 1: An Introduction to R 1 Math 131/135/194, Fall 2004 Applied Calculus Profs. Kaplan & Flath Macalester College Lab 1: An Introduction to R Goal of this lab To begin to see how to use R. What is R? R is a computer package for

More information

Chpt 3. Data Description. 3-2 Measures of Central Tendency /40

Chpt 3. Data Description. 3-2 Measures of Central Tendency /40 Chpt 3 Data Description 3-2 Measures of Central Tendency 1 /40 Chpt 3 Homework 3-2 Read pages 96-109 p109 Applying the Concepts p110 1, 8, 11, 15, 27, 33 2 /40 Chpt 3 3.2 Objectives l Summarize data using

More information

4.2 Data Distributions

4.2 Data Distributions NOTES Data Distribution: Write your questions here! Dotplots Histograms Find the mean number of siblings: Find the median number of siblings: Types of distributions: The mean on the move: Compare the mean

More information

Chapter 2 - Graphical Summaries of Data

Chapter 2 - Graphical Summaries of Data Chapter 2 - Graphical Summaries of Data Data recorded in the sequence in which they are collected and before they are processed or ranked are called raw data. Raw data is often difficult to make sense

More information

Lecture 3 Questions that we should be able to answer by the end of this lecture:

Lecture 3 Questions that we should be able to answer by the end of this lecture: Lecture 3 Questions that we should be able to answer by the end of this lecture: Which is the better exam score? 67 on an exam with mean 50 and SD 10 or 62 on an exam with mean 40 and SD 12 Is it fair

More information

To calculate the arithmetic mean, sum all the values and divide by n (equivalently, multiple 1/n): 1 n. = 29 years.

To calculate the arithmetic mean, sum all the values and divide by n (equivalently, multiple 1/n): 1 n. = 29 years. 3: Summary Statistics Notation Consider these 10 ages (in years): 1 4 5 11 30 50 8 7 4 5 The symbol n represents the sample size (n = 10). The capital letter X denotes the variable. x i represents the

More information

Ch6: The Normal Distribution

Ch6: The Normal Distribution Ch6: The Normal Distribution Introduction Review: A continuous random variable can assume any value between two endpoints. Many continuous random variables have an approximately normal distribution, which

More information

The first few questions on this worksheet will deal with measures of central tendency. These data types tell us where the center of the data set lies.

The first few questions on this worksheet will deal with measures of central tendency. These data types tell us where the center of the data set lies. Instructions: You are given the following data below these instructions. Your client (Courtney) wants you to statistically analyze the data to help her reach conclusions about how well she is teaching.

More information

Lecture 3 Questions that we should be able to answer by the end of this lecture:

Lecture 3 Questions that we should be able to answer by the end of this lecture: Lecture 3 Questions that we should be able to answer by the end of this lecture: Which is the better exam score? 67 on an exam with mean 50 and SD 10 or 62 on an exam with mean 40 and SD 12 Is it fair

More information

Sections Graphical Displays and Measures of Center. Brian Habing Department of Statistics University of South Carolina.

Sections Graphical Displays and Measures of Center. Brian Habing Department of Statistics University of South Carolina. STAT 515 Statistical Methods I Sections 2.1-2.3 Graphical Displays and Measures of Center Brian Habing Department of Statistics University of South Carolina Redistribution of these slides without permission

More information

YEAR 12 Core 1 & 2 Maths Curriculum (A Level Year 1)

YEAR 12 Core 1 & 2 Maths Curriculum (A Level Year 1) YEAR 12 Core 1 & 2 Maths Curriculum (A Level Year 1) Algebra and Functions Quadratic Functions Equations & Inequalities Binomial Expansion Sketching Curves Coordinate Geometry Radian Measures Sine and

More information

Applied Statistics for the Behavioral Sciences

Applied Statistics for the Behavioral Sciences Applied Statistics for the Behavioral Sciences Chapter 2 Frequency Distributions and Graphs Chapter 2 Outline Organization of Data Simple Frequency Distributions Grouped Frequency Distributions Graphs

More information

Displaying Distributions - Quantitative Variables

Displaying Distributions - Quantitative Variables Displaying Distributions - Quantitative Variables Lecture 13 Sections 4.4.1-4.4.3 Robb T. Koether Hampden-Sydney College Wed, Feb 8, 2012 Robb T. Koether (Hampden-Sydney College)Displaying Distributions

More information

Slide Copyright 2005 Pearson Education, Inc. SEVENTH EDITION and EXPANDED SEVENTH EDITION. Chapter 13. Statistics Sampling Techniques

Slide Copyright 2005 Pearson Education, Inc. SEVENTH EDITION and EXPANDED SEVENTH EDITION. Chapter 13. Statistics Sampling Techniques SEVENTH EDITION and EXPANDED SEVENTH EDITION Slide - Chapter Statistics. Sampling Techniques Statistics Statistics is the art and science of gathering, analyzing, and making inferences from numerical information

More information

6th Grade Vocabulary Mathematics Unit 2

6th Grade Vocabulary Mathematics Unit 2 6 th GRADE UNIT 2 6th Grade Vocabulary Mathematics Unit 2 VOCABULARY area triangle right triangle equilateral triangle isosceles triangle scalene triangle quadrilaterals polygons irregular polygons rectangles

More information

Chapter 6. THE NORMAL DISTRIBUTION

Chapter 6. THE NORMAL DISTRIBUTION Chapter 6. THE NORMAL DISTRIBUTION Introducing Normally Distributed Variables The distributions of some variables like thickness of the eggshell, serum cholesterol concentration in blood, white blood cells

More information

+ Statistical Methods in

+ Statistical Methods in 9/4/013 Statistical Methods in Practice STA/MTH 379 Dr. A. B. W. Manage Associate Professor of Mathematics & Statistics Department of Mathematics & Statistics Sam Houston State University Discovering Statistics

More information

2.1: Frequency Distributions and Their Graphs

2.1: Frequency Distributions and Their Graphs 2.1: Frequency Distributions and Their Graphs Frequency Distribution - way to display data that has many entries - table that shows classes or intervals of data entries and the number of entries in each

More information

Descriptive Statistics

Descriptive Statistics Chapter 2 Descriptive Statistics 2.1 Descriptive Statistics 1 2.1.1 Student Learning Objectives By the end of this chapter, the student should be able to: Display data graphically and interpret graphs:

More information

Density Curve (p52) Density curve is a curve that - is always on or above the horizontal axis.

Density Curve (p52) Density curve is a curve that - is always on or above the horizontal axis. 1.3 Density curves p50 Some times the overall pattern of a large number of observations is so regular that we can describe it by a smooth curve. It is easier to work with a smooth curve, because the histogram

More information

Chapter 3 Analyzing Normal Quantitative Data

Chapter 3 Analyzing Normal Quantitative Data Chapter 3 Analyzing Normal Quantitative Data Introduction: In chapters 1 and 2, we focused on analyzing categorical data and exploring relationships between categorical data sets. We will now be doing

More information

15 Wyner Statistics Fall 2013

15 Wyner Statistics Fall 2013 15 Wyner Statistics Fall 2013 CHAPTER THREE: CENTRAL TENDENCY AND VARIATION Summary, Terms, and Objectives The two most important aspects of a numerical data set are its central tendencies and its variation.

More information

Chapter 4: Analyzing Bivariate Data with Fathom

Chapter 4: Analyzing Bivariate Data with Fathom Chapter 4: Analyzing Bivariate Data with Fathom Summary: Building from ideas introduced in Chapter 3, teachers continue to analyze automobile data using Fathom to look for relationships between two quantitative

More information

Name: Date: Period: Chapter 2. Section 1: Describing Location in a Distribution

Name: Date: Period: Chapter 2. Section 1: Describing Location in a Distribution Name: Date: Period: Chapter 2 Section 1: Describing Location in a Distribution Suppose you earned an 86 on a statistics quiz. The question is: should you be satisfied with this score? What if it is the

More information

Mathematics. Algebra, Functions, and Data Analysis Curriculum Guide. Revised 2010

Mathematics. Algebra, Functions, and Data Analysis Curriculum Guide. Revised 2010 Mathematics Algebra, Functions, and Data Analysis Curriculum Guide Revised 010 This page is intentionally left blank. Introduction The Mathematics Curriculum Guide serves as a guide for teachers when planning

More information

Muskogee Public Schools Curriculum Map, Math, Grade 8

Muskogee Public Schools Curriculum Map, Math, Grade 8 Muskogee Public Schools Curriculum Map, 2010-2011 Math, Grade 8 The Test Blueprint reflects the degree to which each PASS Standard and Objective is represented on the test. Page1 1 st Nine Standard 1:

More information

Integrated Math I. IM1.1.3 Understand and use the distributive, associative, and commutative properties.

Integrated Math I. IM1.1.3 Understand and use the distributive, associative, and commutative properties. Standard 1: Number Sense and Computation Students simplify and compare expressions. They use rational exponents and simplify square roots. IM1.1.1 Compare real number expressions. IM1.1.2 Simplify square

More information

Chapter 2: Descriptive Statistics

Chapter 2: Descriptive Statistics Chapter 2: Descriptive Statistics Student Learning Outcomes By the end of this chapter, you should be able to: Display data graphically and interpret graphs: stemplots, histograms and boxplots. Recognize,

More information

AP Statistics Prerequisite Packet

AP Statistics Prerequisite Packet Types of Data Quantitative (or measurement) Data These are data that take on numerical values that actually represent a measurement such as size, weight, how many, how long, score on a test, etc. For these

More information