Chapter 3 - Displaying and Summarizing Quantitative Data

Similar documents
Vocabulary. 5-number summary Rule. Area principle. Bar chart. Boxplot. Categorical data condition. Categorical variable.

STA Module 2B Organizing Data and Comparing Distributions (Part II)

STA Learning Objectives. Learning Objectives (cont.) Module 2B Organizing Data and Comparing Distributions (Part II)

STA Rev. F Learning Objectives. Learning Objectives (Cont.) Module 3 Descriptive Measures

MATH& 146 Lesson 10. Section 1.6 Graphing Numerical Data

Chapter 2 Describing, Exploring, and Comparing Data

1.3 Graphical Summaries of Data

Chapter 1. Looking at Data-Distribution

TMTH 3360 NOTES ON COMMON GRAPHS AND CHARTS

Table of Contents (As covered from textbook)

Prepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order.

CHAPTER 3: Data Description

No. of blue jelly beans No. of bags

Chapter 2. Descriptive Statistics: Organizing, Displaying and Summarizing Data

Chapter 6: DESCRIPTIVE STATISTICS

CHAPTER 1. Introduction. Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data.

Chapter 5. Understanding and Comparing Distributions. Copyright 2012, 2008, 2005 Pearson Education, Inc.

STP 226 ELEMENTARY STATISTICS NOTES PART 2 - DESCRIPTIVE STATISTICS CHAPTER 3 DESCRIPTIVE MEASURES

Section 1.2. Displaying Quantitative Data with Graphs. Mrs. Daniel AP Stats 8/22/2013. Dotplots. How to Make a Dotplot. Mrs. Daniel AP Statistics

Understanding and Comparing Distributions. Chapter 4

Chapter 5. Understanding and Comparing Distributions. Copyright 2010, 2007, 2004 Pearson Education, Inc.

CHAPTER 2: SAMPLING AND DATA

Math 167 Pre-Statistics. Chapter 4 Summarizing Data Numerically Section 3 Boxplots

2.1 Objectives. Math Chapter 2. Chapter 2. Variable. Categorical Variable EXPLORING DATA WITH GRAPHS AND NUMERICAL SUMMARIES

Univariate Statistics Summary

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency

CHAPTER 2 DESCRIPTIVE STATISTICS

STA 570 Spring Lecture 5 Tuesday, Feb 1

AND NUMERICAL SUMMARIES. Chapter 2

UNIT 1A EXPLORING UNIVARIATE DATA

Section 3.1 Shapes of Distributions MDM4U Jensen

Organizing and Summarizing Data

To calculate the arithmetic mean, sum all the values and divide by n (equivalently, multiple 1/n): 1 n. = 29 years.

Averages and Variation

AP Statistics Summer Assignment:

Chapter 2 - Graphical Summaries of Data

Descriptive Statistics

Sections 2.3 and 2.4

Statistical Methods. Instructor: Lingsong Zhang. Any questions, ask me during the office hour, or me, I will answer promptly.

Displaying Distributions - Quantitative Variables

Name Date Types of Graphs and Creating Graphs Notes

DAY 52 BOX-AND-WHISKER

Measures of Central Tendency. A measure of central tendency is a value used to represent the typical or average value in a data set.

Measures of Central Tendency

The main issue is that the mean and standard deviations are not accurate and should not be used in the analysis. Then what statistics should we use?

2.1: Frequency Distributions and Their Graphs

Chapter 2 Modeling Distributions of Data

CHAPTER 2: DESCRIPTIVE STATISTICS Lecture Notes for Introductory Statistics 1. Daphne Skipper, Augusta University (2016)

Learning Log Title: CHAPTER 7: PROPORTIONS AND PERCENTS. Date: Lesson: Chapter 7: Proportions and Percents

Lecture 6: Chapter 6 Summary

Chapter 5snow year.notebook March 15, 2018

Middle Years Data Analysis Display Methods

Exploratory Data Analysis

Homework Packet Week #3

15 Wyner Statistics Fall 2013

Unit 7 Statistics. AFM Mrs. Valentine. 7.1 Samples and Surveys

Chapter2 Description of samples and populations. 2.1 Introduction.

Chapter 6: Comparing Two Means Section 6.1: Comparing Two Groups Quantitative Response

Chapter 3: Data Description - Part 3. Homework: Exercises 1-21 odd, odd, odd, 107, 109, 118, 119, 120, odd

Bar Graphs and Dot Plots

Chpt 3. Data Description. 3-2 Measures of Central Tendency /40

Chapter 2: Graphical Summaries of Data 2.1 Graphical Summaries for Qualitative Data. Frequency: Frequency distribution:

Understanding Statistical Questions

Overview. Frequency Distributions. Chapter 2 Summarizing & Graphing Data. Descriptive Statistics. Inferential Statistics. Frequency Distribution

2.3 Organizing Quantitative Data

Chapter 3 Analyzing Normal Quantitative Data

Center, Shape, & Spread Center, shape, and spread are all words that describe what a particular graph looks like.

Mean,Median, Mode Teacher Twins 2015

Probability and Statistics. Copyright Cengage Learning. All rights reserved.

AP Statistics Prerequisite Packet

Measures of Central Tendency:

a. divided by the. 1) Always round!! a) Even if class width comes out to a, go up one.

Day 4 Percentiles and Box and Whisker.notebook. April 20, 2018

+ Statistical Methods in

Section 9: One Variable Statistics

Chapter 3: Describing, Exploring & Comparing Data

Measures of Central Tendency

Basic Statistical Terms and Definitions

Measures of Position. 1. Determine which student did better

Frequency Distributions

Lecture Notes 3: Data summarization

Numerical Summaries of Data Section 14.3

M7D1.a: Formulate questions and collect data from a census of at least 30 objects and from samples of varying sizes.

The first few questions on this worksheet will deal with measures of central tendency. These data types tell us where the center of the data set lies.

3 Graphical Displays of Data

STP 226 ELEMENTARY STATISTICS NOTES

Chapter 2: Descriptive Statistics

Chapter 2 Organizing and Graphing Data. 2.1 Organizing and Graphing Qualitative Data

MATH NATION SECTION 9 H.M.H. RESOURCES

1.2. Pictorial and Tabular Methods in Descriptive Statistics

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 2.1- #

Boxplots. Lecture 17 Section Robb T. Koether. Hampden-Sydney College. Wed, Feb 10, 2010

Learning Log Title: CHAPTER 8: STATISTICS AND MULTIPLICATION EQUATIONS. Date: Lesson: Chapter 8: Statistics and Multiplication Equations

Section 2-2 Frequency Distributions. Copyright 2010, 2007, 2004 Pearson Education, Inc

Chapter 3 Understanding and Comparing Distributions

10.4 Measures of Central Tendency and Variation

10.4 Measures of Central Tendency and Variation

Visualizing Data: Freq. Tables, Histograms

STA Module 4 The Normal Distribution

STA /25/12. Module 4 The Normal Distribution. Learning Objectives. Let s Look at Some Examples of Normal Curves

Transcription:

Chapter 3 - Displaying and Summarizing Quantitative Data 3.1 Graphs for Quantitative Data (LABEL GRAPHS) August 25, 2014 Histogram (p. 44) - Graph that uses bars to represent different frequencies or relative frequencies in specific intervals called classes or bins. The intervals are the same width and can be chosen for convenience. Look similar to Bar Charts, but there are no spaces between bars unless there is no data in that interval. Stem and Leaf Displays (p. 46) - (Also called Stem and Leaf Plots or Stem Plots) Graph that is created by making a stem out of the left most digits and writing them in order in a vertical column. Then creating a leaf out of the right most digit and putting it in the row that corresponds to the appropriate stem. Leaves must be arranged in numerical order. This graph gives a quick picture of the distribution while including the numerical values. If a row does not have any leaves to put in the stems must still be included to show where gaps might be. If decimal values are used, the decimal values can be rounded. Example: Length of ownership (in months) of cars before they are traded in. 72 15 70 40 42 74 64 68 50 53 62 64 45

72 15 70 40 42 74 64 68 50 53 62 64 45 Split Stem and Leaf Plot (p. 46) - Plot that is created by splitting the stems into 2 or more rows. If 2 rows per stem than one will be used for digits 0-4 and the other for digits 5-9. This is useful for a larger sets or sets with large amounts of data in each row. We will see Back to Back Stem and Leaf Plots in Chapter 4.

Dotplots or Line Plot (p. 47) Graph constructed by placing a dot along the axis for each instance of a data value. Axis can be vertical or horizontal. Distribution for Graphs of Quantitative Variables Whenever a distribution of a graph for a quantitative variable is described it should include its shape, center, and spread. 3.2 Shape of a Distribution Is the graph symmetric, one side is close to the mirror image of the other or if folded along a vertical line through the middle the graph would match pretty closely. Is the graph skewed, which means one tail stretches out farther than the other. The graph is said to be skewed to the side of the longer tail Are there peaks or modes on the graph? This is where a large part of the data is occurring. Unimodal - one peak. Bimodal - two peaks. Multimodal - 3 or more peaks. Uniform if no peaks and fairly straight across graph. Are there any outliers, extreme values that are away from the rest of the data. Sometimes outliers will be ignored. Are there any gaps or spaces in the graph?

Example: Dotplot of Kentucky Derby winning times over the years. 3.3 The Center of the Distribution: The Midrange and The Median For a unimodal, symmetric distribution the center is the point on the graph where graph folds to give mirror image of each side. What if the distribution is skewed or multimodal? Maximum Value+ Minimum Value Midrange (or Midpoint) -. Easy to calculate, 2 but it is very sensitive to extreme outlying values. It is not usually used to summarize a distribution. Median (p. 51) - Score in the middle when values are arranged in numerical order st n+ 1 If n is odd then median is in the 2 position st n If n is even then median is the average of the numbers in the position and 2 st n the + 1 2 position.

Example: 10 25 30 35 40 60 60 65 100 10 25 30 35 40 60 60 65 80 100 The median is one of the many ways to find the center of the data. Another important way will be mentioned later. 3.4 The Spread of the Distribution: The Range and The Interquartile Range The spread measures how much the data varies from the center Range (p. 52) - Maximum Minimum. Sensitive to outlying values so does not always represent data properly. Instead of looking at the ends of the data the range of the middle of the data could be measured. Interquartile Range (p. 52) - IQR = Q3 Q1 Quartiles split the sorted data into Quarters. The median is the middle quartile or Q 2. The Third Quartile, Q 3or the Upper Quartile, is the median of the upper half of the numbers. The First Quartile, Q 1or the Lower Quartile, is the median of the lower half of the numbers The lower and upper quartiles are also known as the 25 th and 75 th percentiles, respectively. (If n is odd the book includes the median in both halves when figuring up the quartiles. I will not, since that is the way the calculator leaves it out if n is odd.)

Example: 10 25 30 35 40 60 60 65 100 10 25 30 35 40 60 60 65 80 100 3.5 Boxplots and 5-Number Summaries The 5-number summary (p. 54) is the following Minimum Score, Q, 1 Median, Q, 3 Maximum Score The Boxplot or Box and Whisker Plot (p. 54) is a graphical representation of the 5-number summary. Boxplots are useful when comparing different groups of data.

Creating a Boxplot: 1. Draw an axis and label appropriately. 2. Draw a box with the ends being the Q 1and Q 3and a line in the middle of the box for the median. The box shows where the data in the 25 th percentile to the 75 th percentile are located or the middle 50%. 3. Calculate Upper and Lower Fences or Limits to determine which points are outliers. Outliers will be any values outside interval formed by fences. Upper Fence = Q3 + 1.5 I.Q.R. Lower Fence = Q 1.5 I.Q.R. 1 4. Mark outliers with a special symbol (*). Draw whiskers out to smallest and largest values of data that are not outliers. (Sometimes a different symbol is used for Far Outliers data values farther than 3 IQRs from the quartiles.) Example: The following data represent the weight (in pounds) of 15 five year old girls. 30 32 34 35 36 36 37 38 41 42 44 45 47 62 65

Histogram and Boxplot for daily wind speeds (see also Fig 3.13, p. 55). How does each represent distribution? 3.6 The Center of Symmetric Distributions: The Mean If the distribution is symmetric then calculations for the center and spread can be used that include all data values. Mean(p. 57) The value found by summing up the numbers and dividing by n. Σx x1+ x2+ + xn x = = n n Typically will see mean rounded to one more digit than data has. Example: 10 25 30 35 40 60 60 65 100

Median is point where there are half of the scores above and half of the scores below. Median is not sensitive to outlying values. Mean is point where histogram would balance. Mean is more sensitive to outlying values. If distribution is symmetric then mean = median. If distribution is skewed the mean is pulled closer to the tail than the median. If distribution is symmetric and there are no outliers mean is preferred measure of center. If distribution is skewed or has outliers than median is preferred since the median is resistant to extreme large or small values. 3.7 The Spread of Symmetric Distributions: The Standard Deviation Variance and Standard Deviation (p. 59) IQR only uses two quartiles (or 50%) of the data, so it ignores how individual values vary. The Standard Deviation takes into account how far each value is from the mean. These differences are called deviations. Variance: s 2 ( x x) 2 Σ = n 1 Standard Deviation: s= ( x x) 2 Σ n 1 Why ( ) 2 each term? Why Standard Deviation instead of Variance? Like the mean the standard deviation is only appropriate for symmetric data. Also as with the mean typically round standard deviation to one more digit than original data contains.

Example: Find the Standard Deviation of 1, 3, 4, 8, 9. What to Tell about a Quantitative Variable (p. 61) What can go wrong? (p. 64-76)