Raw Data is data before it has been arranged in a useful manner or analyzed using statistical techniques.

Similar documents
Actual Major League Baseball Salaries ( )

Organizing and Summarizing Data

CHAPTER 2: SAMPLING AND DATA

2.1 Objectives. Math Chapter 2. Chapter 2. Variable. Categorical Variable EXPLORING DATA WITH GRAPHS AND NUMERICAL SUMMARIES

Chapter 2 - Frequency Distributions and Graphs

Frequency Distributions

2.1: Frequency Distributions

AND NUMERICAL SUMMARIES. Chapter 2

CHAPTER 2. Objectives. Frequency Distributions and Graphs. Basic Vocabulary. Introduction. Organise data using frequency distributions.

Basic Statistical Terms and Definitions

At the end of the chapter, you will learn to: Present data in textual form. Construct different types of table and graphs

Chapter 2 Organizing and Graphing Data. 2.1 Organizing and Graphing Qualitative Data

STP 226 ELEMENTARY STATISTICS NOTES

Chapter 2 - Graphical Summaries of Data

Elementary Statistics

Chapter 2: Frequency Distributions

Parents Names Mom Cell/Work # Dad Cell/Work # Parent List the Math Courses you have taken and the grade you received 1 st 2 nd 3 rd 4th

MATH 117 Statistical Methods for Management I Chapter Two

Spell out your full name (first, middle and last)

CHAPTER 1. Introduction. Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data.

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 2.1- #

TMTH 3360 NOTES ON COMMON GRAPHS AND CHARTS

Chapter 3 Analyzing Normal Quantitative Data

Overview. Frequency Distributions. Chapter 2 Summarizing & Graphing Data. Descriptive Statistics. Inferential Statistics. Frequency Distribution

Chapter 2. Descriptive Statistics: Organizing, Displaying and Summarizing Data

Organizing and Summarizing Data

NOTES TO CONSIDER BEFORE ATTEMPTING EX 1A TYPES OF DATA

This chapter will show how to organize data and then construct appropriate graphs to represent the data in a concise, easy-to-understand form.

Chapter 5. Normal. Normal Curve. the Normal. Curve Examples. Standard Units Standard Units Examples. for Data

1.3 Graphical Summaries of Data

MAT 102 Introduction to Statistics Chapter 6. Chapter 6 Continuous Probability Distributions and the Normal Distribution

Section 2-2 Frequency Distributions. Copyright 2010, 2007, 2004 Pearson Education, Inc

MATH& 146 Lesson 10. Section 1.6 Graphing Numerical Data

Averages and Variation

2.3 Organizing Quantitative Data

Prepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order.

Measures of Central Tendency

Chpt 1. Functions and Graphs. 1.1 Graphs and Graphing Utilities 1 /19

Statistical Methods. Instructor: Lingsong Zhang. Any questions, ask me during the office hour, or me, I will answer promptly.

Section 3.2 Measures of Central Tendency MDM4U Jensen

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency

Name Date Types of Graphs and Creating Graphs Notes

Chapter 6: DESCRIPTIVE STATISTICS

Part I, Chapters 4 & 5. Data Tables and Data Analysis Statistics and Figures

a. divided by the. 1) Always round!! a) Even if class width comes out to a, go up one.

Univariate Statistics Summary

Frequency Distributions and Graphs

Courtesy :

Section 1.2. Displaying Quantitative Data with Graphs. Mrs. Daniel AP Stats 8/22/2013. Dotplots. How to Make a Dotplot. Mrs. Daniel AP Statistics

Descriptive Statistics

Data organization. So what kind of data did we collect?

2.1: Frequency Distributions and Their Graphs

Measures of Central Tendency. A measure of central tendency is a value used to represent the typical or average value in a data set.

Chapter 2 Describing, Exploring, and Comparing Data

Week 2: Frequency distributions

Chapter 2. Frequency Distributions and Graphs. Bluman, Chapter 2

B. Graphing Representation of Data

Chapter 1. Looking at Data-Distribution

CHAPTER 3: Data Description

3.2-Measures of Center

Chapter 1 Histograms, Scatterplots, and Graphs of Functions

Chapter 3 - Displaying and Summarizing Quantitative Data

DAY 52 BOX-AND-WHISKER

No. of blue jelly beans No. of bags

The basic arrangement of numeric data is called an ARRAY. Array is the derived data from fundamental data Example :- To store marks of 50 student

Organizing Data. Class limits (in miles) Tally Frequency Total 50

Exploratory Data Analysis

Unit 7 Statistics. AFM Mrs. Valentine. 7.1 Samples and Surveys

1.3 Box and Whisker Plot

The main issue is that the mean and standard deviations are not accurate and should not be used in the analysis. Then what statistics should we use?

Chapter Two: Descriptive Methods 1/50

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

Chapter 2: Understanding Data Distributions with Tables and Graphs

Applied Statistics for the Behavioral Sciences

1 Introduction. 1.1 What is Statistics?

Graphical Presentation for Statistical Data (Relevant to AAT Examination Paper 4: Business Economics and Financial Mathematics) Introduction

AP Statistics Summer Assignment:

Chapter 2: Graphical Summaries of Data 2.1 Graphical Summaries for Qualitative Data. Frequency: Frequency distribution:

Measures of Central Tendency

UNIT 1A EXPLORING UNIVARIATE DATA

Chapter 2: Descriptive Statistics

Chapter2 Description of samples and populations. 2.1 Introduction.

BUSINESS DECISION MAKING. Topic 1 Introduction to Statistical Thinking and Business Decision Making Process; Data Collection and Presentation

MATH 1070 Introductory Statistics Lecture notes Descriptive Statistics and Graphical Representation

Probability and Statistics. Copyright Cengage Learning. All rights reserved.

12. A(n) is the number of times an item or number occurs in a data set.

Chapter 2. Frequency distribution. Summarizing and Graphing Data

1.2. Pictorial and Tabular Methods in Descriptive Statistics

Table of Contents (As covered from textbook)

appstats6.notebook September 27, 2016

Section 3.1 Shapes of Distributions MDM4U Jensen

+ Statistical Methods in

Box It Up (A Graphical Look)

CHAPTER 2 DESCRIPTIVE STATISTICS

8: Statistics. Populations and Samples. Histograms and Frequency Polygons. Page 1 of 10

Chapter 2 Organizing Data

Learning Log Title: CHAPTER 7: PROPORTIONS AND PERCENTS. Date: Lesson: Chapter 7: Proportions and Percents

Box Plots. OpenStax College

CAMBRIDGE TECHNOLOGY IN MATHS Year 11 TI-89 User guide

Chapter 2 Scatter Plots and Introduction to Graphing

Transcription:

Section 2.1 - Introduction Graphs are commonly used to organize, summarize, and analyze collections of data. Using a graph to visually present a data set makes it easy to comprehend and to describe the data. Raw Data is data before it has been arranged in a useful manner or analyzed using statistical techniques. Section 2.2 - Classifications of Data A variable is a type of information, usually a property or characteristic of a person or thing that is measured or observed. (a type of information or property about the thing being observed) Value of the variable is a specific measurement or observation for a variable. VARIABLES Weight Height Eye color Sex GPA Temperature VALUES 163 pounds 5 11 Blue Male 3.5 32 A data set is a collection of several data pertaining to one or more variables. There are two types of variables: Categorical and Numerical: A categorical variable represents categories, and is non-numerical in nature. These values are known as categorical data. Categorical data: Non-numerical Color, nationality, vehicle type, met fan American Idol favorite A numerical variable is a variable where the value is a number that results from a measurement process. The specific values of numerical variables are called numerical data. Examples: height, age, credits per semester Numerical data can be further classified as either continuous or discrete depending on the numerical values it can assume. Continuous data are numerical measurements that can assume any value between two numbers. Measurements that can take any value between any two numbers Usually rounded numbers Height, weight, distance, temperature, annual salary, blood pressure 1

Discrete data are numerical values that can assume only a limited number of values. Represent precise counts at a point in time Full-time students at NCC The number of unemployed in a city Accidents in a state shoe size number of children in a family Example 2.2: For each of the following: a) Identify each variable and classify as either categorical or numerical, and b) If the variable is numerical, then further classify it as either continuous or discrete. 1) a person s height 2) an individual s political affiliation 3) the number of rainy days in July 4) the number of gallons of fuel used to travel from New York to Chicago 5) a person s hair color 6) an opinion to the question: Is the President of the United States doing a good job? 7) a shoe size 8) the number of inches of snowfall per year 9) the number of vowels in a paragraph 10) the amount of soda in a 2 liter bottle 11) a student s credit load for the semester Section 2.3 - Exploring Data Using the Stem-and-Leaf Display A distribution of a numerical variable represents the data values of the variable from the lowest to the highest value along with the number of times each data value occurs. The number of times each data value occurs is called its frequency. See page 36 Identifying Characteristics of a Distribution A stem-and-leaf display is a visual exploratory data analysis technique that shows the shape of a distribution. The display uses the actual data values of the variable to present the shape of the distribution of data values. At times, it may be of interest to compare two distributions using a stem-and-leaf display with a common stem. This type of display is called a back-to-back stem-and-leaf display. 2

Example on pg. 37 in the Text Cholesterol levels of 50 middle-aged men on a regular diet. (mg of Cholest/100ml of blood) 263 258 240 233 225 222 199 282 239 236 232 283 200 212 225 235 240 258 263 274 250 259 241 237 226 213 269 199 253 201 265 226 238 242 259 233 238 229 215 202 319 277 229 239 243 248 245 219 276 246 a) Construct a stem-and-leaf display. b) Describe the characteristics of the distribution. (See page 9 of this Handout) c) Determine the center of the distribution and then decide between which two values on a stem are the greatest concentration of data values located. See next the page of this handout for a template to use for this example. Procedure to Construct a Stem-and-Leaf Display: (SORT the data using your calculator first) 1. Identify the stem and leaf portion of your data values. Generally, the stem can have as many digits as needed to represent the beginning digit(s) of each data value. The leaf should only contain the last or terminating digit of the data values. Example: 263 has 26 as the stem, and 3 as the leaf 26 3 : the leaf is always the terminating (last) digit. The leaf should only contain the last or terminating digit of the data values. 2. Once you have all the stems of your data values, identify the smallest stem and the largest stem. 19 is the smallest stem and 31 is the largest stem List each possible stem once, in a vertical column starting with the smallest stem on top and ending with the largest stem at the bottom. List all numbers between 19 and 31, even if the stem contains no corresponding leaves. Draw a vertical line to the right of the column of stems. 3. For each data value, record the leaf within the corresponding stem row and to the right of the vertical line. Arrange the leaves within each stem row in increasing order from left to right. This provides a more informative stem-and-leaf display. 3

Example on pg. 37 in the Text a) Construct a stem-and-leaf display. b) Describe the characteristics of the distribution. (the shape) c) Decide between which two values on a stem are the greatest concentration of data values located. Review Example 2.3 on pg. 39 and Example 2.4 on pg. 40 in the Text SEE pg. 6 FOR DATA TO BE ENTERED BEFORE NEXT CLASS! 4

Section 2.4 - Frequency Distribution Tables A frequency distribution table is a table in which a data set has been divided into distinct groups, called classes, along with the number of data values that fall into each class, called the frequency. Frequency distribution tables are useful in summarizing large data sets. Procedure for Constructing a Frequency Distribution Table for Numerical Data: 1. Number of Classes (between 5 and 15): Given. If not given, use 7. 2. Class Width: largest data value smallest data value Find using, class width = number of classes Then go up to the next whole number. 3. Clasess (or Class Limits): First class limit starts at the smallest data value First class ends at the smallest data value plus one less than the class width. Class limits for each class after, add the class width to the lower and upper limit of the previous class. 4. Class Frequency: Find the frequencies by looking at the calculator list and counting how many data values fall between the lower and upper limits of each class. Class frequency - vertical axis on the Histogram 5. Class Mark: Find using, class mark = Upper Class Limit Lower Class Limit 2 6. Class Boundary (only if continuous data): Find by subtracting 0.5 to each lower class limit and adding 0.5 to each upper class limit. Class Boundary - the horizontal axis on the Histogram 7. Relative Frequency is the proportion (in decimal form) of data values within each class: First add up the Class Frequency column for the total number of data values. Then, for each class divide the Class Frequency by the total number of data values using, class frequency Relative Frequency of a Class total number of data values 8. Relative Percentage is the percentage of data values within each class: Find by multiplying each Relative Frequency by 100 which changes it to a percentage using, Relative Percentage = relative frequency 100% Note: Round relative frequencies and/or percentages according to the directions of the problem. 5

Example calculating class widths Find the class width of each of the following if there are 7 classes If lowest = 35 and highest = 89 If lowest = 96 and highest = 125 If lowest = 81 and highest = 137 Example on Frequency Distribution Table & Histogram A statistics instructor has recorded the amount of time 50 students need to complete the final examination. The times, stated to the nearest minute, are: (this data set is from the Chapter Review #76 on page 77 in the text) 50, 70, 62, 55, 38, 42, 49, 75, 80, 79, 48, 45, 53, 58, 64, 77, 48, 49, 50, 61, ENTER THIS DATA INTO YOUR 72, 74, 10, 95, 120, 79, 75, 48, 72, 37, CALCULATOR AS STATED IN 35, 79, 72, 75, 77, 32, 30, 77, 75, 79, YOUR HOMEWORK! 45, 70, 39, 75, 72, 45, 47, 73, 72, 71 BEFORE CLASS!! a) Construct a frequency distribution table using 7 classes. b) What percent of the students take between 26 and 73 minutes to complete the final exam? c) How many students take more than 90 minutes to complete the final exam? d) Construct a Histogram. (after reading Section 2.5 on the bottom of page 7) a) Classes Class Frequency Class Boundaries Class Mark Relative Frequency Relative Percentage b) What percent of the students take between 26 and 73 minutes to complete the final exam? c) How many students take more than 90 minutes to complete the final exam? 6

d) Histogram: Section 2.5 - Graphs A graph is a descriptive tool used to visualize the characteristics and the relationships of the data quickly and attractively. A well-constructed graph will reveal information that may not be apparent from a quick examination of a frequency distribution table. A bar graph is generally used to depict discrete or categorical data. A bar graph is displayed using two axes. One axis represents the discrete or categorical variable while the other axis usually represents the frequency, or percentage, for each category of the variable. A histogram is a vertical bar graph that represents a continuous variable. To depict the continuous nature of the data, the rectangles are connected to each other without any gaps or breaks between two adjacent rectangles. The width of each rectangle corresponds to the width of each class of the frequency distribution and the height of each rectangle corresponds to the frequency of the class. The vertical sides of each rectangle of a histogram correspond to the class boundaries of each class, and so, there are no breaks or gaps between the rectangles. Example of a Histogram see pg. 55 in the Text The six rectangles correspond to the 6 classes in the frequency distribution table. The width of each rectangle corresponds to the width of each class of the frequency distribution The height of each rectangle corresponds to the frequency of the class. The vertical sides of each rectangle of a histogram correspond to the class boundaries of each class, and so, there are no breaks or gaps between the rectangles. 7

Procedure to construct a Histogram with the TI 83 or TI 84: 1. Enter data into List 1 2. Press Window and set the window as follows: o Xmin = the smallest data value 0.5 o Xmax = the largest data value + the class width. o Xscl = the class width o Ymin = 0 o Ymax = your estimate of the frequency of the largest class o Keep both Yscl and Xres =1 3. Set the Graph type o 2 nd STAT PLOT choose 1 o x List: List 1 o Frequency: 1 4. Graph the histogram o Press Graph o Use the TRACE to get class limits and frequencies o Notice that as you move your cursor across the histogram, the frequency and class boundary of each class is displayed on the bottom of the screen 5. Copy the finished graph on paper. o Give the graph a title. o Label axes (variable on the horizontal and frequency on the vertical) o Mark axes with numbers (horizontally with class boundaries, vertically with frequencies) o Indicate with a jagged line the missing data values on the horizontal axis. o Create a rectangle for each class whose height is the class frequency. Note: If raw data is not given, instead a frequency of each data value is given, then change the above steps as specified below: 1. Enter data into List 1 and enter frequencies into List 2 3. Set the Graph type o 2 nd STAT PLOT choose 1 o x List: List 1 o Frequency: List 2 8

Example 2.12 on pg. 69 in the Text Use the information in the histogram below to answer the following questions. a) How many students are represented by this histogram? b) How many students are older than 26? c) What percent of the students are older than 26? d) What percent of the students are younger than 21? e) What percent of the students are older than 20 but younger than 30? Review Example 2.6 on pg. 48 in the Text 9

Section 2.7 - Identifying Shapes and Interpreting Graphs Symmetric bell-shaped distribution has it s highest frequency, or peak, at the center with the frequencies steadily decreasing but identically distributed on both sides of center. Skewed to the right distribution has a greater number of relatively low scores and a few extremely high scores. The shape of a skewed distribution has a longer tail on the right side which represents the few extremely high scores. Skewed to the left distribution is a distribution that has a larger number of relatively high scores and a few extremely low scores. Its shape has a longer tail on the left side which represents the few extremely low scores. A Uniform distribution is one where all the classes contain the same number of data values or frequencies. A U shaped distribution is U shaped. It has its two greatest frequencies occurring at each extreme end of the distribution with the lower frequencies of the distribution occurring at the center. A reverse J-shaped distribution has the greatest frequency of data values occurring at one end of the distribution and then tails off gradually in the opposite direction. A Bimodal distribution is a distribution that has two separate, distinct, and relatively high peaks with the greatest frequencies. It usually indicates the distribution represents two different populations. 10

Stem Leaf 11

Classes or Class Limits Class Frequency Class Boundaries Class Mark Relative Frequency Relative Percentage 12