Name: Stat 300: Intro to Probability & Statistics Textbook: Introduction to Statistical Investigations


Chapter P: Preliminaries
Section P.2: Exploring Data

Example 1: Think About It! What will it look like? [1]

Consider the following variables:

A. Point values of letters in the board game Scrabble
B. Prices of properties on the Monopoly game board
C. Jersey numbers of San Francisco 49ers football players in 2017
D. Weights of rowers on the 2016 U.S. men's Olympic team
E. Blood pressure measurements for a sample of healthy adults
F. Quiz percentages for a class of statistics students (quizzes were quite straightforward for most students)
G. Annual snowfall amounts for a sample of cities around the U.S.

a) Identify each of the variables above as categorical or quantitative.

b) Matching. The following dotplots display the distributions of these variables, but the variables are not shown in the same order as they are listed. Moreover, the scales have been intentionally left off the axes! For each dotplot, try to identify the variable displayed (by letter, from the previous list). Also, provide a brief explanation of your reasoning in each case. [Note: You might make different matches than other students; be prepared to justify your choices.]

One of the goals of this matching game is to illustrate that you can anticipate what the distribution of a set of data might look like by considering the context of the data.

[Dotplots 1 through 7, shown without axis scales]

[1] Rossman, A., & Chance, B., Workshop Statistics: Discovery with Data, 4th Ed. Wiley & Sons, 2011.

Stat 300 Text: Intro. to Statistical Investigations Section P2 Page 2 of 6

P2: Exploring Data

In this section, you will learn how to explore a data set by carrying out an Exploratory Data Analysis (EDA). Exploring your data set helps you examine the data in a way that supports interpretation and informed decisions. Often, you will need to look at several different exploratory features in order to get a good feel for your data. Do not limit yourself to considering only one graph or calculating only one statistic.

Exploratory Data Analysis (EDA) Guidelines

Shape (Graph): What does the data distribution look like?
  o Possible shapes: Symmetric (mound shape), Skewed Left, Skewed Right, Uniform, Bimodal
  o Graphs to consider: Dotplot, Histogram, Boxplot
Center: Where is the distribution centered? What is a typical or representative value?
  o Different ways to measure the center: Mean, Median, Mode, Midrange
Spread or Variability: How far does the data spread from the center?
  o Different ways to measure the spread: Standard Deviation, Range, Interquartile Range (IQR)
Outliers: Are there any unusual observations that deviate from the overall pattern of the distribution?

Example 2: Below you are given the heights (in feet) for a random sample of dwarf mandarin trees from your local Green Acres nursery. Use this data set to conduct an Exploratory Data Analysis (EDA). We will first create the graphs and calculate the values by hand; then we will use StatCrunch to expedite the process.

Tree Heights (in feet): 1, 8, 6.5, 5.5, 4, 1.5, 3, 2, 2.5, 4, 7.5, 3.5

Shape

The shape of a data set describes the overall picture or distribution.
→ Five main shapes or distributions: Symmetric (bell or mounded), Uniform, Skewed Left, Skewed Right, Bimodal (or symmetric with two mounds).
→ Warning: You may have to consult several different graphs to get a good idea of the shape.

Dotplot
One dot represents: ________
Label: ________

StatCrunch (SC) Dotplot:
  Enter your data into the first column.
  Graph > Dotplot
  Under Column, click on your variable.
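Outside StatCrunch, the idea behind a dotplot (one dot per observation, stacked over its value) can be sketched in a few lines of Python. This is only an illustration for the tree heights in Example 2; the function name `text_dotplot` is made up for this sketch and is not part of StatCrunch or the textbook.

```python
from collections import Counter

# Tree heights (in feet) from Example 2.
tree_heights = [1, 8, 6.5, 5.5, 4, 1.5, 3, 2, 2.5, 4, 7.5, 3.5]


def text_dotplot(values):
    """Return a sideways text dotplot: one row per distinct value, one dot per observation."""
    counts = Counter(values)
    rows = []
    for value in sorted(counts):
        # Right-align the value in 5 characters, then draw one dot per occurrence.
        rows.append(f"{value:>5} | " + "." * counts[value])
    return "\n".join(rows)


print(text_dotplot(tree_heights))
```

Each row shows a height and as many dots as there are trees with that height, so repeated values stand out as longer rows.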

Histogram
Each bar represents: ________

SC Histogram:
  Graph > Histogram
  Under Column, click on your variable.
  Under Type, choose Frequency (how many) or Relative Frequency (proportion or %).

Boxplot
Each box or whisker represents: ________
Label: ________

SC Boxplot:
  Graph > Boxplot
  Under Column, click on your variable.
  Under Other Options, click "Use fences to identify outliers" and "Draw boxes horizontally".

Center

The center of a data set is a typical or representative data value.
→ Three main ways to measure the center: Mean, Median, Mode

Definitions:
Mean: ________
Median: ________
Mode: ________

SC Summary Statistics:
  Enter your data into the first column.
  Stat > Summary Stats > Columns
  Under Statistics, use Control + Click to select the mean, median, and mode of the data set.
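As a cross-check on hand calculations, the three measures of center can also be computed with Python's standard `statistics` module. This is a minimal sketch for the Example 2 tree heights, not the StatCrunch procedure; the variable names are illustrative.

```python
import statistics

# Tree heights (in feet) from Example 2.
tree_heights = [1, 8, 6.5, 5.5, 4, 1.5, 3, 2, 2.5, 4, 7.5, 3.5]

# Mean: the sum of the values divided by how many values there are.
mean_height = statistics.mean(tree_heights)

# Median: the middle value of the sorted data (average of the two middle values when n is even).
median_height = statistics.median(tree_heights)

# Mode(s): the most frequent value(s); multimode returns every value tied for most frequent.
modes = statistics.multimode(tree_heights)

print("mean:", round(mean_height, 3))
print("median:", median_height)
print("mode(s):", modes)
```

`multimode` is used instead of `mode` so that a data set with several equally common values reports all of them.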

Spread

The spread of a distribution describes the variability of a data set. Are the values close together? Are they far apart? Do they cluster in one area? How far do the data range?
→ Five main ways to measure the spread: Min, Max, Range, Standard Deviation, Interquartile Range

Definitions:
Min: ________
Max: ________
Range: ________
Standard Deviation: ________
Quartiles: ________
Interquartile Range: ________

SC Summary Statistics:
  Enter your data into the first column.
  Stat > Summary Stats > Columns
  Under Statistics, use Control + Click to select the std. dev., min, max, Q1, Q3, and IQR.

Outliers

Outliers are data values that deviate markedly from the overall pattern of the other data values. Many data sets do not have any outliers.
→ Outliers are any values beyond the upper or lower fence:
  ABOVE the upper fence: Q3 + 1.5 * IQR
  BELOW the lower fence: Q1 - 1.5 * IQR

Find the outliers by creating the boxplot and asking StatCrunch to mark the outliers using fences.
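The spread measures and the fence rule above can be sketched in Python for the Example 2 tree heights. One assumption to note: quartiles are computed here as the medians of the lower and upper halves of the sorted data, a common textbook convention. StatCrunch and other software may use a slightly different quartile rule, so Q1, Q3, and the fences can differ a little. The helper name `quartiles_by_halves` is made up for this sketch.

```python
import statistics

# Tree heights (in feet) from Example 2.
tree_heights = [1, 8, 6.5, 5.5, 4, 1.5, 3, 2, 2.5, 4, 7.5, 3.5]


def quartiles_by_halves(values):
    """Q1 and Q3 as medians of the lower and upper halves (assumed convention).

    When the number of values is odd, the overall median is left out of both halves.
    """
    if len(values) < 2:
        raise ValueError("need at least two values to compute quartiles")
    ordered = sorted(values)
    half = len(ordered) // 2
    return statistics.median(ordered[:half]), statistics.median(ordered[-half:])


minimum = min(tree_heights)
maximum = max(tree_heights)
data_range = maximum - minimum
std_dev = statistics.stdev(tree_heights)  # sample standard deviation (divides by n - 1)

q1, q3 = quartiles_by_halves(tree_heights)
iqr = q3 - q1

# Fences from the worksheet: Q1 - 1.5 * IQR and Q3 + 1.5 * IQR.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in tree_heights if x < lower_fence or x > upper_fence]

print("min, max, range:", minimum, maximum, data_range)
print("std. dev.:", round(std_dev, 3))
print("Q1, Q3, IQR:", q1, q3, iqr)
print("fences:", lower_fence, upper_fence)
print("outliers:", outliers)
```

An empty `outliers` list means no value falls beyond either fence under this quartile convention.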

Example 3: Dataset: Old Faithful [2]

In the reading for Section P.2, you read about a statistical investigation in which park rangers at Yellowstone National Park were trying to predict how much time a person usually has to wait to see an eruption of the geyser Old Faithful. In order to predict the next eruption time, researchers collected data on 222 eruptions of Old Faithful over several days in August 1978. The times between eruptions for all 222 observations can be found in the dataset labeled Old Faithful in the StatCrunch group. Use this data set to conduct an Exploratory Data Analysis by completing the following:

a) Describe the shape of the distribution. Use StatCrunch to create the dotplot, histogram, and boxplot for the times between eruptions.

b) Find the measures of center of the distribution. Use StatCrunch to find the mean, median, and mode for the times between eruptions.

c) Find the measures of spread or variability of the distribution. Use StatCrunch to find the min, max, range, standard deviation, and interquartile range (IQR) for the times between eruptions.

d) Does this data set contain any outliers? Use the boxplot feature in StatCrunch to determine whether or not this data set contains any outliers.

Example 4: Which is the better measure of center: the mean or the median?

The center of a data set is meant to represent a typical data value. Below you are given the sales prices for seven homes that recently sold in the greater Sacramento area.

Home prices in dollars: $300,000 $285,000 $400,000 $385,000 $410,000 $325,000 $2,500,000

a) Use StatCrunch to find the mean and median of this data set.

b) Which value(s) best represent a typical value from the data set above: the mean, the median, or both? Explain your answer.

c) Describe the shape of the distribution. Are there any outliers? (Use StatCrunch.)

d) Change the last value from $2,500,000 to $250,000 and repeat steps a) through c) above.

[2] Tintle, N. et al., Introduction to Statistical Investigations, 1st ed. Wiley & Sons, 2016.
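To see how a single extreme value affects the two measures of center in Example 4, the mean and median can be computed for both versions of the home-price data. This is a rough Python sketch standing in for the StatCrunch steps; the variable names are illustrative.

```python
import statistics

# Home prices (in dollars) from Example 4.
prices_original = [300_000, 285_000, 400_000, 385_000, 410_000, 325_000, 2_500_000]

# Part d): the last value is changed from $2,500,000 to $250,000.
prices_changed = prices_original[:-1] + [250_000]

mean_original = statistics.mean(prices_original)
median_original = statistics.median(prices_original)

mean_changed = statistics.mean(prices_changed)
median_changed = statistics.median(prices_changed)

print(f"original: mean = {mean_original:,.2f}, median = {median_original:,}")
print(f"changed:  mean = {mean_changed:,.2f}, median = {median_changed:,}")
```

Comparing how far each statistic moves between the two runs is one way to approach part b).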

Example 5: For each of the data sets below, use StatCrunch to help you create a dotplot and find the mean, median, standard deviation, and outliers. Record your answers in the table provided.

Data Set #1: 4, 5, 5, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 9, 9, 10
  Dotplot: ________
  Spread / Variability, Std. Dev.: ________

Data Set #2: 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8
  Dotplot: ________
  Spread / Variability, Std. Dev.: ________

Data Set #3: 4, 4, 4, 4, 4, 5, 5, 5, 9, 9, 9, 10, 10, 10, 10, 10
  Dotplot: ________
  Spread / Variability, Std. Dev.: ________

Think About It / Write About It

The standard deviation can roughly be interpreted as the typical distance between the data values and the mean. Compare the standard deviations for these three data sets. Explain how the standard deviation is related to the shape of the distribution.
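The comparison in Example 5 can also be run in Python as a check on the StatCrunch output. This sketch assumes the sample standard deviation (dividing by n - 1), which is what `statistics.stdev` computes; the dictionary names are illustrative.

```python
import statistics

# The three data sets from Example 5.
data_sets = {
    "Data Set #1": [4, 5, 5, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 9, 9, 10],
    "Data Set #2": [6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
    "Data Set #3": [4, 4, 4, 4, 4, 5, 5, 5, 9, 9, 9, 10, 10, 10, 10, 10],
}

# For each data set, store (mean, median, sample standard deviation).
results = {
    name: (statistics.mean(values), statistics.median(values), statistics.stdev(values))
    for name, values in data_sets.items()
}

for name, (mean, median, std_dev) in results.items():
    print(f"{name}: mean = {mean}, median = {median}, std. dev. = {std_dev:.3f}")
```

Because the three sets have the same number of values, any difference in the standard deviations comes from how the values are arranged around the center.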