Chapter 5. Understanding and Comparing Distributions. Copyright 2010, 2007, 2004 Pearson Education, Inc.


Chapter 5 Understanding and Comparing Distributions

The Big Picture We can answer much more interesting questions about variables when we compare distributions for different groups. Below is a histogram of the Average Wind Speed for every day in 1989. Slide 5-3

The Big Picture (cont.) The distribution is unimodal and skewed to the right. The high value may be an outlier. Median daily wind speed is about 1.90 mph and the IQR is reported to be 1.78 mph. Can we say more? Slide 5-4

The Five-Number Summary The five-number summary of a distribution reports its median, quartiles, and extremes (maximum and minimum). Example: The five-number summary for the daily wind speed (in mph) is: Max 8.67, Q3 2.93, Median 1.90, Q1 1.15, Min 0.20. Slide 5-5
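As a rough illustration of how a five-number summary can be computed, here is a minimal Python sketch. The function name five_number_summary and the sample speeds are hypothetical (they are not the 1989 wind data), and the "inclusive" quartile method is only one of several conventions, so a textbook or calculator may report slightly different quartiles.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    ordered = sorted(values)
    # Assumption: the "inclusive" quartile convention. Other conventions
    # can give slightly different values for Q1 and Q3.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

# Hypothetical wind speeds in mph, made up for this example.
speeds = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 9.0]
low, q1, median, q3, high = five_number_summary(speeds)
iqr = q3 - q1  # interquartile range: the width of the middle half of the data

print("Min", low, "Q1", q1, "Median", median, "Q3", q3, "Max", high, "IQR", iqr)
```

The IQR falls out directly as Q3 minus Q1, which is how the slide's 1.78 mph follows from 2.93 and 1.15.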

Daily Wind Speed: Making Boxplots A boxplot is a graphical display of the five-number summary. Boxplots are useful when comparing groups. Boxplots are particularly good at pointing out outliers. Slide 5-6

Constructing Boxplots 1. Draw a single vertical axis spanning the range of the data. Draw short horizontal lines at the lower and upper quartiles and at the median. Then connect them with vertical lines to form a box. Slide 5-7

Constructing Boxplots (cont.) 2. Put up fences around the main part of the data. The upper fence is 1.5 IQRs above the upper quartile: Upper fence = Q3 + 1.5(IQR). The lower fence is 1.5 IQRs below the lower quartile: Lower fence = Q1 - 1.5(IQR). Note: the fences only help with constructing the boxplot and should not appear in the final display. Slide 5-8

Constructing Boxplots (cont.) 3. Use the fences to make whiskers. Draw lines from the ends of the box up and down to the most extreme data values found within the fences. If a data value falls outside one of the fences, we do not connect it with a whisker. Slide 5-9

Constructing Boxplots (cont.) 4. Add the outliers by displaying any data values beyond the fences with special symbols. We often use a different symbol for far outliers that are farther than 3 IQRs from the quartiles. Slide 5-10
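The fence, whisker, and outlier steps above can be sketched in Python using the five-number summary reported earlier for daily wind speed. The helper names classify and whisker_ends, and the short sample list, are hypothetical and only for illustration; the full 1989 data set is not given on the slides.

```python
# Five-number summary reported on the slides for daily wind speed (mph).
minimum, q1, median, q3, maximum = 0.20, 1.15, 1.90, 2.93, 8.67

iqr = q3 - q1                   # about 1.78
lower_fence = q1 - 1.5 * iqr    # about -1.52
upper_fence = q3 + 1.5 * iqr    # about 5.60
far_lower = q1 - 3 * iqr        # beyond this, a "far outlier" on the low side
far_upper = q3 + 3 * iqr        # about 8.27 on the high side

def classify(value):
    """Label a value as inside the fences, an outlier, or a far outlier."""
    if value < far_lower or value > far_upper:
        return "far outlier"
    if value < lower_fence or value > upper_fence:
        return "outlier"
    return "inside fences"

def whisker_ends(values):
    """Whiskers reach the most extreme data values still inside the fences."""
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    return min(inside), max(inside)

# Hypothetical sample that includes the reported minimum and maximum.
sample = [0.20, 1.15, 1.90, 2.93, 5.1, 6.0, 8.67]
print(classify(maximum), whisker_ends(sample))
```

With these numbers the lower fence is negative, so no low value can be flagged, while the maximum of 8.67 mph lies beyond both the upper fence and the 3-IQR mark.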

Wind Speed: Making Boxplots (cont.) Compare the histogram and boxplot for daily wind speeds: How does each display represent the distribution? Slide 5-11

Comparing Groups It is almost always more interesting to compare groups. With histograms, note the shapes, centers, and spreads of the two distributions. What does this graphical display tell you? Slide 5-12

Comparing Groups (cont.) Boxplots offer an ideal balance of information and simplicity, hiding the details while displaying the overall summary information. We often plot them side by side for groups or categories we wish to compare. What do these boxplots tell you? Slide 5-13

Example (p. 96 #11) Slide 5-14

Example (p. 96 #11) Slide 5-15

Example (p. 98 #20 and 21) See book Slide 5-16

Homework p. 95 #5, 7, 12, 15, 22, 29, 36 due on Wednesday Read p. 87 end Slide 5-17

What About Outliers? If there are any clear outliers and you are reporting the mean and standard deviation, report them with the outliers present and with the outliers removed. The differences may be quite revealing. Note: The median and IQR are not likely to be affected by the outliers. Slide 5-18
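To see why reporting summaries both with and without outliers can be revealing, here is a small Python sketch on made-up data. The values and the helper name summarize are hypothetical, and the quartiles again assume the "inclusive" convention.

```python
import statistics

def summarize(values):
    """Return mean, standard deviation, median, and IQR for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),      # sample standard deviation
        "median": statistics.median(values),
        "iqr": q3 - q1,
    }

# Hypothetical data: nine ordinary values plus one clear outlier (50).
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
without_outlier = [v for v in with_outlier if v != 50]

print(summarize(with_outlier))
print(summarize(without_outlier))
```

In this example the mean drops from 9.5 to 5 when the outlier is removed and the standard deviation shrinks sharply, while the median only moves from 5.5 to 5 and the IQR from 4.5 to 4.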

Timeplots: Order, Please! For some data sets, we are interested in how the data behave over time. In these cases, we construct timeplots of the data. Timeplots show data measurements in time order. Timeplots help us to detect trends over time. You cannot always extrapolate from the data. (Don't predict the future.) It is probably okay to predict a less windy June next year compared to a more windy November. It would not be okay to predict another storm on November 21st (outlier). Slide 5-19

*Re-expressing Skewed Data to Improve Symmetry When the data are skewed it can be hard to summarize them simply with a center and spread, and hard to decide whether the most extreme values are outliers or just part of a stretched out tail. How can we say anything useful about such data? Slide 5-20

*Re-expressing Skewed Data to Improve Symmetry (cont.) Slide 5-21

*Re-expressing Skewed Data to Improve Symmetry (cont.) Re-expression changes the measuring scale for the variable by altering the distance between the values. All of the lines below represent the numbers 1 to 10, on a decimal, logarithmic, and squared scale. On our familiar decimal measuring scale, the distance between numbers is the same for all numbers. On a logarithmic scale, the distance between the numbers decreases as the numbers get larger. On a square scale, the distance between the numbers decreases as the values get smaller. All of the dots represent the same sequence of values from 1 to 10 on different measuring scales. Slide 5-22
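The effect of each scale on the spacing between values can be checked numerically. This is a minimal Python sketch; the helper name gaps is hypothetical, and base 10 is chosen only for convenience (any logarithm base shows the same pattern).

```python
import math

values = list(range(1, 11))                  # the numbers 1 to 10
log_scale = [math.log10(v) for v in values]  # positions on a logarithmic scale
square_scale = [v ** 2 for v in values]      # positions on a squared scale

def gaps(points):
    """Distances between consecutive positions on a scale."""
    return [b - a for a, b in zip(points, points[1:])]

print(gaps(values))                              # constant gaps on the decimal scale
print([round(g, 3) for g in gaps(log_scale)])    # gaps shrink as the numbers grow
print(gaps(square_scale))                        # gaps shrink as the numbers get smaller
```

Because a logarithm pulls large values closer together, a log re-expression often makes a right-skewed distribution look more symmetric.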

What Can Go Wrong? (cont.) Avoid inconsistent scales, either within the display or when comparing two displays. Label clearly so a reader knows what the plot displays. Good intentions, bad plot: Slide 5-23

What Can Go Wrong? (cont.) Beware of outliers. Be careful when comparing groups that have very different spreads. Consider these side-by-side boxplots of cotinine levels: Re-express... Slide 5-24

What have we learned? We've learned the value of comparing data groups and looking for patterns among groups and over time. We've seen that boxplots are very effective for comparing groups graphically. We've experienced the value of identifying and investigating outliers. We've graphed data that has been measured over time against a time axis and looked for long-term trends both by eye and with a data smoother. Slide 5-25

Example (p. 101 #21, 35, 37, 39) See book Slide 5-26

Homework p. 95 #5, 7, 12, 15, 22, 29, 36 Slide 5-27

Classwork Investigative Task HW: Read p. 104-107 Slide 5-28