+ Statistical Methods in Practice STA/MTH 3379

Dr. A. B. W. Manage, Associate Professor of Statistics, Department of Mathematics & Statistics, Sam Houston State University

Discovering Statistics, 2nd Edition, Daniel T. Larose. Chapter 2: Describing Data Using Graphs and Tables. Lecture PowerPoint Slides.

+ Chapter 2 Overview

2.1 Graphs and Tables for Categorical Data
2.2 Graphs and Tables for Quantitative Data
2.3 Further Graphs and Tables for Quantitative Data
2.4 Graphical Misrepresentations of Data

+ The Big Picture

Where we are coming from and where we are headed

In Chapter 1 we learned the basic concepts of statistics, such as population, sample, and types of variables, along with methods of collecting data.
In Chapter 2 we learn about graphs and tables for summarizing qualitative and quantitative data, and we examine how to prevent our graphics from being misleading.
In Chapter 3, we will learn how to describe a data set using numerical measures like statistics rather than graphs and tables.

+ 2.1: Graphs and Tables for Categorical Data

Objectives:
Construct and interpret a frequency distribution and a relative frequency distribution for qualitative data.
Construct and interpret bar graphs and Pareto charts.
Construct and interpret pie charts.
Construct crosstabulations to describe the relationship between two variables.
Construct a clustered bar graph to describe the relationship between two variables.

Frequency Distributions

Data sets are not always clear. We need ways to summarize the values in a data set.
The frequency, or count, of a category is the number of observations in that category.
A frequency distribution for a qualitative variable is a listing of all the values (e.g., categories) that the variable can take, together with the frequencies for each value.

Relative Frequency Distributions

Suppose you don't know the size of the sample in the survey. Comparing the frequency to the total sample size gives us the relative frequency.
The relative frequency of a particular category of a qualitative variable is its frequency divided by the sample size.
A relative frequency distribution for a qualitative variable is a listing of all values that the variable can take, together with the relative frequencies for each value.

Bar Graphs (Bar Charts)

Frequency distributions and relative frequency distributions are tabular. The graphical equivalent of these distributions is called a bar graph.
A bar graph (or bar chart) is used to represent the frequencies or relative frequencies for categorical data. It is constructed as follows.
1. On the horizontal axis, provide a label for each category.
2. Draw rectangles (bars) of equal width for each category. The height of each rectangle represents the frequency or relative frequency for that category. Ensure that the bars are not touching each other.
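The frequency and relative frequency distributions defined in this section can be tabulated with a few lines of Python. This is a minimal sketch: the `responses` list and the two helper functions are hypothetical illustrations, not data or code from the slides.

```python
from collections import Counter

# Hypothetical survey responses (not from the slides): one category per respondent.
responses = ["bus", "car", "car", "walk", "bus", "car", "bike", "car", "walk", "car"]


def frequency_distribution(values):
    """Return {category: count}, ordered by first appearance."""
    return dict(Counter(values))


def relative_frequency_distribution(values):
    """Return {category: count / sample size}."""
    n = len(values)
    return {category: count / n for category, count in Counter(values).items()}


freq = frequency_distribution(responses)
rel_freq = relative_frequency_distribution(responses)

# Print one row per category: name, frequency, relative frequency.
for category in freq:
    print(f"{category:<6}{freq[category]:>4}{rel_freq[category]:>8.2f}")
```

The relative frequencies always sum to 1, which is a quick check that no observation was dropped. The same counts are the bar heights of a bar graph.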

Pareto Charts

The bars in a bar graph may be presented horizontally or vertically.
A Pareto chart is a bar graph in which the rectangles are presented in decreasing order from left to right.

Pie Charts

Pie charts are a common graphical device for displaying the relative frequencies of a categorical variable.
A pie chart is a circle divided into sections, with each section representing a particular category. The size of the section is proportional to the relative frequency of the category.

Crosstabulations

Crosstabulation is a tabular method for simultaneously summarizing the data for two categorical variables. Crosstabulations are also known as two-way tables or contingency tables.

Steps for Constructing a Crosstabulation
1. Put the categories of one variable at the top of each column, and the categories of the other variable at the beginning of each row.
2. For each row and column combination, enter the number of observations that fall in the two categories.
3. The bottom of the table gives the column totals, and the right-hand column gives the row totals.

                                 Emotion
Gender   Sadness   Fear   Anger   Disbelief   Vulnerability   Not sure   Total
Female        94     21      87          80              28          4     314
Male          56     16     141          50              36          5     304
Total        150     37     228         130              64          9     618

Clustered Bar Graphs

Clustered bar graphs are useful for comparing two categorical variables and are often used in conjunction with crosstabulations.
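As an illustration of the crosstabulation steps and of Pareto ordering, the sketch below recomputes the row totals, column totals, and grand total from the emotion-by-gender counts in the slides, then sorts the emotions by decreasing frequency. The variable names are assumptions made for this example.

```python
# Counts copied from the emotion-by-gender crosstabulation in the slides.
emotions = ["Sadness", "Fear", "Anger", "Disbelief", "Vulnerability", "Not sure"]
counts = {
    "Female": [94, 21, 87, 80, 28, 4],
    "Male": [56, 16, 141, 50, 36, 5],
}

# Step 3 of the construction: right-hand column (row totals) and bottom row (column totals).
row_totals = {gender: sum(row) for gender, row in counts.items()}
column_totals = [sum(column) for column in zip(*counts.values())]
grand_total = sum(row_totals.values())

# Pareto ordering: categories sorted by decreasing frequency.
pareto_order = sorted(zip(emotions, column_totals), key=lambda pair: pair[1], reverse=True)

print(row_totals)
print(dict(zip(emotions, column_totals)))
print(grand_total)
for emotion, total in pareto_order:
    # The relative frequency is also the share of the circle in a pie chart.
    print(f"{emotion:<14}{total:>4}  {total / grand_total:.3f}")
```

The computed totals agree with the Total row and column of the table, and the Pareto order places Anger first and Not sure last.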

+ 2.2: Graphs and Tables for Quantitative Data

Objectives:
Construct and interpret a frequency distribution and a relative frequency distribution for discrete and continuous data.
Use histograms and frequency polygons to summarize quantitative data.
Construct and interpret stem-and-leaf displays and dotplots.
Recognize distribution shape, symmetry, and skewness.

Frequency Distributions and Relative Frequency Distributions

Section 2.1 introduced tables and graphs for summarizing qualitative data. Most of the data sets we will encounter are quantitative.
We can apply frequency and relative frequency distributions to quantitative data just as we did for qualitative data. Consider Table 2.13 on page 54.

Classes

We can combine several ages together into classes, in order to produce a more concise distribution.
Classes represent a range of data values and are used to group the elements in a data set.

Class Limits

We use the following to construct frequency distributions and histograms.
The lower class limit of a class equals the smallest value within that class.
The upper class limit of a class equals the largest value within that class.
The class width equals the difference between the lower class limits of two successive classes.
The class boundary of two successive classes is found by taking the sum of the upper class limit of a class and the lower class limit of the class to its right, and dividing the sum by two.
The lower class boundary of the left-most class equals its upper class boundary minus the class width.
The upper class boundary of the right-most class equals its lower class boundary plus the class width.

To construct a frequency distribution for continuous data:
1. Choose the number of classes.
2. Determine the class width.
3. Find the upper and lower class limits.
4. Calculate the class boundaries.
5. Find the frequencies of each class.
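The definitions of class width and class boundary above translate directly into code. The following is a sketch under stated assumptions: the five classes in `class_limits` are hypothetical (the slides do not list specific classes), and the function names are invented for this example.

```python
# Hypothetical class limits (lower, upper); the slides do not list specific classes.
class_limits = [(75, 79), (80, 84), (85, 89), (90, 94), (95, 99)]


def class_width(limits):
    """Difference between the lower limits of two successive classes."""
    return limits[1][0] - limits[0][0]


def class_boundaries(limits):
    """Return all class boundaries, from the left-most to the right-most."""
    width = class_width(limits)
    # Interior boundary: (upper limit of a class + lower limit of the next class) / 2.
    interior = [(limits[i][1] + limits[i + 1][0]) / 2 for i in range(len(limits) - 1)]
    lowest = interior[0] - width    # lower boundary of the left-most class
    highest = interior[-1] + width  # upper boundary of the right-most class
    return [lowest] + interior + [highest]


print(class_width(class_limits))
print(class_boundaries(class_limits))
```

Note that the class width is 5, not 79 - 75 = 4: it is measured between successive lower limits, not between the limits of a single class.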

Histograms

One example of a graphical summary for quantitative data is a histogram.
A histogram is constructed using rectangles for each class of data. The heights of the rectangles represent the frequencies or relative frequencies of the class. The widths of the rectangles represent the class widths of the corresponding distribution. The class boundaries are placed on the horizontal axis, so that the rectangles are touching each other.

Twenty management students, in preparation for graduation, took a course to prepare them for a management aptitude test. A simulated test provided the following scores:

77 89 84 83 80 80 83 82 85 92
87 88 87 86 99 93 79 83 81 78

To construct a histogram:
1. Find the class limits and draw the horizontal axis.
2. Determine the frequencies and draw the vertical axis.
3. Draw the rectangles.

Frequency Polygons

Frequency polygons provide the same information as histograms, but in a slightly different format.
A frequency polygon is constructed as follows:
1. For each class, plot a point at the class midpoint, at a height equal to the frequency for that class.
2. Join each consecutive pair of points with a line segment.

Stem-and-Leaf Displays

Stem-and-leaf displays contain more information than frequency distributions and histograms.
Consider the final-exam scores of 20 psychology students below:

75 81 82 70 60 59 94 77 68 98
86 68 85 72 70 91 78 86 51 67

Find the leading digits of the numbers. Place these five numbers, called the stems, in a column. Now consider the ones place of each data value. Place this number, called the leaf, next to its stem.

5 | 9 1
6 | 0 8 8 7
7 | 5 0 7 2 0 8
8 | 1 2 6 5 6
9 | 4 8 1
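The histogram, frequency polygon, and stem-and-leaf steps above can be sketched in Python using the two score lists from the slides. The five classes are an assumption (the slides do not say which classes were used), and the function names are hypothetical.

```python
from collections import defaultdict

# Scores copied from the slides.
management_scores = [77, 89, 84, 83, 80, 80, 83, 82, 85, 92,
                     87, 88, 87, 86, 99, 93, 79, 83, 81, 78]
psychology_scores = [75, 81, 82, 70, 60, 59, 94, 77, 68, 98,
                     86, 68, 85, 72, 70, 91, 78, 86, 51, 67]

# Assumed classes (lower limit, upper limit); the slides do not specify them.
classes = [(75, 79), (80, 84), (85, 89), (90, 94), (95, 99)]


def class_frequencies(values, limits):
    """Count how many values fall within each class's limits (histogram heights)."""
    return [sum(lower <= v <= upper for v in values) for lower, upper in limits]


def class_midpoints(limits):
    """Midpoints used as x coordinates in a frequency polygon."""
    return [(lower + upper) / 2 for lower, upper in limits]


def stem_and_leaf(values):
    """Map each tens digit (stem) to its ones digits (leaves), in increasing order."""
    display = defaultdict(list)
    for v in sorted(values):
        display[v // 10].append(v % 10)
    return dict(display)


frequencies = class_frequencies(management_scores, classes)
polygon_points = list(zip(class_midpoints(classes), frequencies))
print(frequencies)
print(polygon_points)
for stem, leaves in stem_and_leaf(psychology_scores).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Unlike the display in the slides, which lists leaves in the order the data were given, this sketch sorts the leaves within each stem. The sorted form is a common final presentation, but either keeps every original data value recoverable.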

Dotplots

A simple but effective graphical display is a dotplot. In a dotplot, each data point is represented by a dot above the number line. Below is a dotplot of the 20 management aptitude test scores.
Dotplots are useful for comparing two variables. Suppose an instructor taught two sections of a management course and gave a simulated MAT exam in each section. The two groups could be compared using dotplots.

Distribution Shape

Frequency distributions are tabular summaries of the set of values that a variable takes.
The distribution of a variable is a table, graph, or formula that identifies the variable values and frequencies for all elements in the data set.
The shape of a distribution is the overall form of a graphical summary, approximated by a smooth curve.
A distribution is symmetric if there is a line (axis of symmetry) that splits the image in half so that one side is the mirror image of the other.
A distribution is skewed if it has a longer tail on one side of the image.

[Figure: three distribution shapes: symmetric and bell-shaped, right-skewed, left-skewed]

+ 2.3: Further Graphs and Tables for Quantitative Data

Objectives:
Build cumulative frequency distributions and cumulative relative frequency distributions.
Create frequency ogives and relative frequency ogives.
Construct and interpret time series graphs.
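Since the dotplot image did not survive in this transcript, a rough text version can stand in for it. The sketch below draws the 20 management aptitude test scores as a sideways dotplot, with the number line running down the page and one asterisk per observation. The `text_dotplot` helper is a hypothetical name, and the rotated layout is a simplification of the usual dots-above-a-line display.

```python
from collections import Counter

# The 20 management aptitude test scores from the slides.
scores = [77, 89, 84, 83, 80, 80, 83, 82, 85, 92,
          87, 88, 87, 86, 99, 93, 79, 83, 81, 78]


def text_dotplot(values):
    """Return lines of a sideways dotplot: one row per value, one '*' per observation."""
    counts = Counter(values)
    return [f"{value:>3} {'*' * counts[value]}"
            for value in range(min(values), max(values) + 1)]


for line in text_dotplot(scores):
    print(line)
```

Rows with no asterisks show gaps in the data, and the thinning of dots toward the high scores hints at a longer right tail.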

Cumulative Frequency Distributions

Since quantitative data can be put in ascending order, we can keep track of the accumulated counts at or below a certain value using a cumulative frequency distribution or cumulative relative frequency distribution.
For a discrete variable, a cumulative frequency distribution shows the total number of observations less than or equal to the category value.
For a continuous variable, a cumulative frequency distribution shows the total number of observations less than or equal to the upper class limit.
A cumulative relative frequency distribution shows the proportion of observations less than or equal to the category value (for a discrete variable) or the proportion of observations less than or equal to the upper class limit (for a continuous variable).

The frequency distribution below displays the total 2007 attendance for 25 Major League Baseball teams. We can use this to construct a cumulative relative frequency distribution.

Ogives

Histograms and frequency polygons are the graphical equivalent of frequency distributions. Ogives are the graphical equivalent of cumulative frequency distributions.
An ogive (pronounced "oh jive") is the graphical equivalent of a cumulative frequency distribution or a cumulative relative frequency distribution. Like a frequency polygon, an ogive consists of a set of plotted points connected by line segments. The x coordinates of these points are the upper class limits; the y coordinates are the cumulative frequencies or cumulative relative frequencies.

Time Series Graphs

Data analysts are often interested in how the value of a variable changes over time. Data that are analyzed with respect to time are called time series data.
A graph of time series data is called a time series plot. The horizontal axis of a time series plot represents time (e.g., hours, days, months, years). The values of the time series data are plotted on the vertical axis, and line segments are drawn to connect the points.

[Figure: Atmospheric CO2 at Mauna Loa]
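A cumulative frequency distribution and the points of its ogive follow from a running sum over the class frequencies. Because the attendance table is not reproduced in this transcript, the sketch below instead uses the 20 management aptitude scores from Section 2.2 grouped into five assumed classes (75-79 through 95-99); both the classes and the variable names are assumptions made for illustration.

```python
from itertools import accumulate

# Upper class limits and frequencies for five assumed classes (75-79, ..., 95-99)
# applied to the 20 management aptitude scores; the classes are an assumption.
upper_limits = [79, 84, 89, 94, 99]
frequencies = [3, 8, 6, 2, 1]

# Running total of observations at or below each upper class limit.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]
cumulative_relative = [c / total for c in cumulative]

# Ogive points: x = upper class limit, y = cumulative frequency.
ogive_points = list(zip(upper_limits, cumulative))

print(cumulative)
print(cumulative_relative)
print(ogive_points)
```

The last cumulative frequency equals the sample size and the last cumulative relative frequency equals 1, which is a useful check on the arithmetic.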

+ 2.4: Graphical Misrepresentations of Data

Objectives:
Understand what can make a graph misleading, confusing, or deceptive.

In the Information Age, when our world is awash in data, it is important for citizens to understand how graphics may be misleading, confusing, or deceptive. Such an understanding enhances our statistical literacy and makes us less prone to be deceived by misleading graphics.

Eight Common Methods for

1. Graphing/selecting an inappropriate statistic.
2. Omitting the zero on the relevant scale.
3. Manipulating the scale.
4. Using two dimensions (area) to emphasize a one-dimensional difference.
5. Careless combination of categories in a bar graph.
6. Inaccuracy in relative lengths of bars in a bar graph.
7. Biased distortion or embellishment.
8. Unclear labeling.

Example 2.19 Inappropriate choice of statistic

Example 2.20 Omitting the zero

MediaMatters.com reported that CNN.com used a misleading graph to exaggerate the difference between the percentages of Democrats and Republicans who agreed with the Florida court's decision to remove the feeding tube from Terri Schiavo in 2005.
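To see why omitting the zero exaggerates a difference, compare the ratio of the drawn bar heights with the ratio of the underlying values. The sketch below is illustrative only: the two percentages are hypothetical and are not the figures from the CNN.com graph, and `apparent_ratio` is a name invented for this example.

```python
def apparent_ratio(value_a, value_b, axis_start=0.0):
    """Ratio of the drawn bar heights when the vertical axis begins at axis_start."""
    return (value_a - axis_start) / (value_b - axis_start)


# Hypothetical percentages, not the figures from the CNN.com graph.
a, b = 62.0, 54.0

honest = apparent_ratio(a, b)                    # axis starts at zero
truncated = apparent_ratio(a, b, axis_start=50)  # axis starts at 50

print(round(honest, 2))
print(round(truncated, 2))
```

With the axis starting at zero the taller bar is only about 15% taller than the shorter one, but with the axis starting at 50 it is drawn three times as tall, even though the data have not changed.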

Example 2.21 Manipulating the scale

This figure shows a Minitab relative frequency bar graph of the majors chosen by 25 business school students. If we wanted to de-emphasize the differences, we could extend the vertical scale up to its maximum, 1.0 = 100%.

Example 2.22 Using two dimensions for a one-dimensional difference

This graphic compares the leaders in career points scored in the NBA All-Star Game among players active in 2007. The height of the players is supposed to represent the total points, but this is not clearly labeled. Points should be indicated using a vertical axis, but there is no vertical axis at all.

Example 2.23 Careless combination of categories and biased embellishment

This figure shows a graphic of how often people have observed drivers running red lights.

Example 2.24 Inaccuracy in relative lengths of bars and unclear labeling

This figure is a horizontal bar graph of the three teams with the most World Series victories in baseball history. Note that 127 is more than twice as many as 52, and so the Yankees bar should be more than twice as long as the Cardinals bar, which it is not.

Example 2.25 Presenting the same data set as symmetric and skewed

The table below displays scores on the TIMSS Science test, administered to eighth-grade students in different countries.