LAB 1 INSTRUCTIONS DESCRIBING AND DISPLAYING DATA


This lab will assist you in learning how to summarize and display categorical and quantitative data in StatCrunch. In particular, you will learn how to obtain frequency and contingency tables for categorical data and display the data with bar charts and pie charts. You will also learn how to obtain the appropriate measures of center and spread for quantitative data and display the data with histograms and boxplots. Finally, you will study how to display data over time with a time plot. This document should be used as a reference in your work on the Lab 1 assignment.

1. Summarizing and Displaying Categorical Data

Categorical variables such as gender (possible values: male, female) or marital status (possible values: never married, married, divorced) can be summarized by providing the counts (frequencies) or proportions (relative frequencies) of observations falling into each category (distinct value of the categorical variable).

In order to demonstrate the graphical and numerical tools in StatCrunch, we will use the Framingham Heart Study data file introduced in the Introductory Lab. However, we will add one more column, Smoker (column 4), to the introlabdata.txt data file. The new variable is defined below. For your convenience, we also provide the definitions of the other three variables in the data file:

Column  Variable  Description of Variable
1       Gender    M-Male, F-Female
2       Age       years
3       Systolic  Systolic blood pressure (mm)
4       Smoker    0 if not a current smoker, 1 if current smoker

The extended data file is given in the table below:

Gender Age Systolic Smoker
F M M F M M M M F F M M F M

Add the entries in the last column (Smoker) to the introlabdata.txt data file used in the Introductory Lab.
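The counts and proportions described in this section can also be checked by hand outside StatCrunch. The following is a minimal Python sketch, not part of the lab itself. It uses only the Gender values listed in the data table; the function name frequency_table is a hypothetical helper introduced for illustration.

```python
from collections import Counter

# Gender column as listed in the data table (14 observations).
gender = ["F", "M", "M", "F", "M", "M", "M", "M", "F", "F", "M", "M", "F", "M"]


def frequency_table(values):
    """Return {category: (frequency, relative frequency)} for one categorical column."""
    counts = Counter(values)
    n = len(values)
    # Relative frequency = count in the category divided by the total count.
    return {category: (count, count / n) for category, count in sorted(counts.items())}


table = frequency_table(gender)
for category, (freq, rel_freq) in table.items():
    print(f"{category}: frequency={freq}, relative frequency={rel_freq:.3f}")
```

The relative frequencies always add up to 1, which is a quick sanity check on any frequency table.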

(a) Summaries for Categorical Data: Frequency and Contingency Tables

Select the Tables option in the Stat menu.

Frequency and Relative Frequency Table

Then click the Frequency option. The feature (in its default setting) provides the frequency and relative frequency for each distinct value within the selected columns. One frequency and relative frequency table will be produced for each column (variable) selected. For example, to obtain the frequencies and relative frequencies of females and males with systolic blood pressure exceeding 135, you should fill in the dialog box as follows:

Select the columns to be used in the analysis
Specify the data rows to be included in the analysis

The frequency and relative frequency table will be displayed for the two gender groups. Notice that if you ignore the Next button in the above dialog box and click the Calculate button directly, the frequency table for the default options will be obtained.

Contingency Table

The association between two categorical variables can be summarized with a contingency table. The rows in the table list the categories of one variable and the columns list the categories of the other variable. Each cell in the table is the frequency of observations for the particular combination of values of the two variables. The contingency table can be obtained using raw data (Contingency Table with data) or summary data (Contingency Table with summary).

Select the Contingency with data option in the Tables menu. Fill in the corresponding dialog box as follows:

Select the column whose values will be categorized across the rows
Select the column whose values will be categorized across the columns
Specify the data rows to be included in the computation in the Where entry box (optional)
Select an optional Group By column. A separate contingency table will be obtained for each distinct value of the column

Click the Next button to specify the information to be displayed in each cell of the contingency table: Row Percent and Column Percent. Leave the remaining boxes unchecked. Click Next. The following output will be generated:

The contingency table of Gender and Smoker variables

In order to obtain a contingency table when summaries are available (the above instructions apply to the situation when data are available), select the columns that contain the summary counts (0 and 1 in the above example) and the column that contains the row labels (Gender). Then enter the name for the column variable (Smoker). Click the Next button, specify the additional information (row, column, and total percents) and click the Calculate button to display the results.

(b) Graphs for Categorical Data

Bar Plot

A bar chart uses vertical bars to display the frequency or relative frequency for all distinct values (categories) of selected columns. The length of each bar is equal to the frequency or relative frequency for the corresponding value (category). Bar charts can be used to examine the association between two categorical variables like gender and smoking status. You may either obtain a bar chart when data are available (bar plot with data) or when counts for each category are provided (bar plot with summary).

For example, in order to explore the association between the Gender and Smoker variables in the Framingham Heart Study data file, we can obtain a bar plot for the variable Smoker for each gender category.

A separate bar plot will be generated for each column selected
Use an optional Where clause to specify the data rows to be included in the analysis
Select an optional Group By column. The frequency or relative frequency of each distinct value of the selected column will be displayed with a bar. You may choose Split bars if you wish to obtain two bar graphs back-to-back, one for each gender.

If you ignore the Next button and click the Create Graph! button directly, the default options will be applied to your graph. Click the Next button to obtain the following dialog box:

Choose between plotting the frequency or the relative frequency for each distinct value. Click the Next button. The following dialog box, which allows you to specify axis labels and the title of the bar plot, will appear:

Click the Next button to obtain a dialog box that allows you to customize the appearance of the bar plot. This dialog box is common to all graphical procedures in StatCrunch and usually appears as the last dialog screen before producing the graph. In particular, you can specify the number of rows and columns per page. A page is defined here by the visible width and height of a browser window. By default, the number of rows and the number of columns per page is one, so one graph per page is produced.

You may obtain two graphs side by side by entering 1 as the number of rows per page and 2 as the number of columns. In a similar way, you can obtain two graphs in one column (one below the other) by entering the two parameters as 2 and 1, respectively.

With the default settings, one graph per page will be produced
You may also change the colour scheme if required.

Now click the Create Graph button to obtain the following graph:

Click the Options button in the above output window.

Click the option to edit the above graph (change the graph layout, change the axes labels or the graph title)
To copy the graph into the Clipboard so it can be easily pasted into your report without saving
To save the graph as a GIF file

Like most of the graphics in StatCrunch, the bar plot is interactive. To interact with the bar plot (in general, with any StatCrunch graph), draw a rectangle within a desired object (for example, a bar in the bar chart) or around the desired object (a point in a scatterplot) in the graph by clicking and dragging the mouse. The objects will be highlighted in the graph as well as in all other interactive graphs obtained for the data. Moreover, the corresponding observations in the data table will also be highlighted. Draw a small rectangle in any of the four bars in the above plot to explore the interactivity.

Now we will demonstrate how to use bar charts to compare the proportions of smokers among females and males. Click the Bar plot with data option and fill in the feature dialog box as follows:

The Relative Frequency option should be selected in the subsequent dialog box. Moreover, 2 columns per page and 1 row per page should be requested in the graph layout dialog box. The following output will be obtained:

Pie Chart

A pie chart consists of several slices corresponding to all distinct values of a categorical variable, and the size of each slice corresponds to the percentage (relative frequency) of observations in the category. You may either obtain a pie chart when data are available (pie chart with data) or when counts for each category are provided (pie chart with summary).

Select the Pie Chart with data option in the menu and fill in the corresponding dialog box as follows:

A separate chart will be obtained for each column selected; the slices in each chart correspond to the distinct values of the column
You may enter an optional Where statement to specify the data rows to be included in the analysis
Select an optional Group By column to obtain a separate pie chart for each distinct value of this column

Click the Next> button. The following dialog box will appear:

For each category (displayed as a slice in the pie chart), the following three items will be provided (separated by commas), respectively: category name, number of observations in the category, and the percentage of observations falling into this category. If you ignore the Next> button and click the Create Graph! button directly, the default options will be applied to your pie chart.

In the 0 smoker category (nonsmokers), there are 3 females and they constitute 60% of females (three out of five). If you click the Next> button, you will obtain a pie chart for the variable Smoker for males.
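The 60% figure quoted for female nonsmokers is a row percent in the sense of the contingency table: the cell count divided by the total for that gender. The following Python sketch illustrates the cell counts, row percents and column percents. The Gender values are those listed in the data table. The Smoker values are hypothetical, chosen only so that 3 of the 5 females are nonsmokers as stated in the text; they are not the lab's actual entries. The helper names are also introduced here for illustration.

```python
from collections import Counter

# Gender column as listed in the data table.
gender = ["F", "M", "M", "F", "M", "M", "M", "M", "F", "F", "M", "M", "F", "M"]
# HYPOTHETICAL Smoker values (0 = not a current smoker, 1 = current smoker).
# Assumed for illustration only; they match the text in one respect:
# 3 of the 5 females are nonsmokers.
smoker = [0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0]


def contingency_table(rows, cols):
    """Count the observations for each (row category, column category) pair."""
    return Counter(zip(rows, cols))


def row_percent(table, row, col):
    """Percentage of the observations in `row` that fall into `col`."""
    row_total = sum(count for (r, _), count in table.items() if r == row)
    return 100 * table[(row, col)] / row_total


def column_percent(table, row, col):
    """Percentage of the observations in `col` that fall into `row`."""
    col_total = sum(count for (_, c), count in table.items() if c == col)
    return 100 * table[(row, col)] / col_total


table = contingency_table(gender, smoker)
print("Female nonsmokers:", table[("F", 0)])
print("Row percent:", row_percent(table, "F", 0))
print("Column percent:", column_percent(table, "F", 0))
```

Row percents answer "what share of females are nonsmokers", while column percents answer "what share of nonsmokers are female". The two generally differ, so it matters which box is checked in the dialog.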

2. Summarizing and Displaying Quantitative Data

Now you will learn how to obtain the measures of center and spread for quantitative data and how to display the data with histograms and boxplots.

(a) Summaries for Quantitative Data

StatCrunch provides several descriptive statistics for single variables (the columns selected) as well as measures that indicate the extent to which two variables co-vary (tend to rise or fall together). The former are produced by the Columns and Rows options, the latter by the Correlation and Covariance options in the Summary Stats submenu.

Columns

The Columns option provides the following descriptive statistics for the columns selected: sample size, mean, variance, standard deviation (Std. Dev.), standard error (Std. Err.), median, range, minimum, maximum, first quartile (Q1) and third quartile (Q3). Additional percentiles can also be requested by the user.

Click the Columns option. The following dialog box will be displayed:

Select the columns for which summary statistics will be computed
Enter an optional Where clause to specify the data rows to be included in the computation
Select an optional Group By column to group results. If a Group By column is selected, the output will be displayed in separate tables for each column selected (default). If you wish to have the output displayed for each group, choose the other radio button.

Notice that the two radio buttons in the dialog box are provided to allow the user to organize the output in the most desirable way; they do not affect the data analysis process.
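To make the list of statistics produced by the Columns option concrete, here is a minimal Python sketch that computes the same quantities with the standard library. The data values are hypothetical and are not taken from the lab's data file. The sketch assumes the sample (n - 1) definitions of variance and standard deviation, and it uses the default quartile rule of Python's statistics.quantiles, which may differ slightly from the rule StatCrunch applies.

```python
import math
import statistics

# HYPOTHETICAL systolic blood pressure readings, for illustration only.
systolic = [110, 120, 125, 130, 140, 150, 160]


def column_summary(values):
    """Compute the descriptive statistics listed for the Columns option."""
    n = len(values)
    std_dev = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    # Quartiles by Python's default "exclusive" method (an assumption;
    # other software may use a different convention).
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "n": n,
        "mean": statistics.mean(values),
        "variance": statistics.variance(values),  # sample variance
        "std_dev": std_dev,
        "std_err": std_dev / math.sqrt(n),  # standard error of the mean
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "min": min(values),
        "max": max(values),
        "Q1": q1,
        "Q3": q3,
    }


summary = column_summary(systolic)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The standard error is simply the standard deviation divided by the square root of the sample size, so it shrinks as more observations are collected.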

Suppose that we wish to obtain the summaries for the variable Systolic for non-smokers for each gender. In this case, the dialog box should be filled in as shown below:

Notice that because Table groups for each column is selected, the summaries for males and females will be provided in separate tables. Click the Next button. The following dialog box will appear:

Enter the requested percentiles separated by commas or spaces, e.g. 90, 99
Check the option to have the output placed in the data table

All summary statistics to be computed are selected by default (all entries in the left pane are selected). If you do not wish to compute some of the statistics, click the statistics to be removed in the left pane and they will be dropped from the list in the right pane. The statistics in the right pane will be displayed in the output (from right to left) in the order in which they are listed in the right pane. Finally, click the Calculate button to obtain the summaries.

Rows

The Rows option can be very useful when the entries in the columns in the data table for each row refer to the same object or subject. Examples include the sales data for each of the four salespersons in a sales department over the last six months (the columns represent the sales figures for each of the six months) or the number of customers on each of the 30 days for several postal outlets in a city. Consider the sales data example. Copy the following data into your StatCrunch data table.

Suppose now that we wish to obtain the summaries of sales for each of the four salespersons over the January-March period. We may wish to compare the summaries with those for the April-June period for each of the four salespersons. Click the Rows option in the Summary Stats submenu. The following dialog box will appear:

Now click the Next> button. The next dialog box will allow you to specify the summary statistics to be computed for each row. The default summaries are: count, sum, mean, variance, standard deviation, minimum, median and maximum. Finally, the output will be displayed in the following form:

(b) Displaying Quantitative Data: Histograms and Boxplots

Now we will discuss the graphical tools to display quantitative data.

Histogram

The histogram is the most important statistical tool for displaying quantitative data. In order to obtain a histogram, we divide the range of data into non-overlapping intervals of equal width (called class intervals), count the number of observations falling into each class interval and erect a bar with height equal to the frequency (frequency histogram) or relative frequency (relative frequency histogram) over each class interval. The heights of bars in the density histogram are calculated by dividing relative frequency by class width so that the total area of the bars equals 1. We assume that the left endpoint of each class interval is included and the right endpoint is excluded. The endpoints of class intervals are called bins. The bins uniquely specify the class intervals if the starting bin and the common class interval width are provided.

Suppose we would like to compare the distributions of systolic blood pressure for the two gender groups in the Framingham Heart Study example. Click the Histogram option in the Graphics menu and fill in the dialog box as follows:

Select the columns (variables) to be displayed in the plot. A separate histogram will be produced for each column selected.
Specify the data rows to be included in the analysis. The clause is optional: if you do not enter anything into the box, the histogram will be obtained for all rows.
Select an optional Group By column to obtain a histogram for each distinct value of the column (variable)

Click the Next> button. In the next dialog box you will specify your class intervals by specifying the starting bin and the class interval width.

Select the Frequency, Relative Frequency or Density histogram
The two entries are optional. If you don't enter anything, the bins will be generated automatically
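The construction of a histogram described above (equal-width class intervals, left endpoint included, right endpoint excluded, density equal to relative frequency divided by class width) can be sketched in a few lines of Python. The data values, the starting bin of 100 and the width of 20 are hypothetical choices made for illustration, and the function name histogram is likewise an assumption of this sketch.

```python
import math

# HYPOTHETICAL systolic blood pressure readings, for illustration only.
systolic = [110, 120, 125, 130, 140, 150, 160]


def histogram(values, start, width):
    """Return one (left, right, frequency, relative frequency, density) tuple per class interval.

    Each class interval is [left, right): left endpoint included, right endpoint excluded.
    """
    if min(values) < start:
        raise ValueError("the starting bin must not exceed the smallest value")
    n = len(values)
    n_bins = math.floor((max(values) - start) / width) + 1
    counts = [0] * n_bins
    for v in values:
        # Index of the class interval that contains v.
        counts[math.floor((v - start) / width)] += 1
    rows = []
    for k, freq in enumerate(counts):
        left = start + k * width
        rel_freq = freq / n
        density = rel_freq / width  # bar height in the density histogram
        rows.append((left, left + width, freq, rel_freq, density))
    return rows


rows = histogram(systolic, start=100, width=20)
for left, right, freq, rel_freq, density in rows:
    print(f"[{left}, {right}): freq={freq}, rel={rel_freq:.3f}, density={density:.4f}")

# Total area of the density bars (height times width, summed over intervals).
total_area = sum(density * (right - left) for left, right, _, _, density in rows)
```

Note that the value 140 is counted in the interval [140, 160) and not in [120, 140), which is exactly the left-endpoint-included convention stated in the text.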

Click the Next> button to open the dialog box displayed below. This dialog box allows you to superimpose one of the well-known statistical density functions upon the histogram of the data. For example, you might wish to see how well your data fit the density of a normal distribution. If you select the option, you will be required to enter the parameters of the density function. Leave the setting as optional in our case and proceed to the next dialog box to specify histogram layout options.

Leave optional in this entry box for our data

In order to obtain the two histograms (for females and males) displayed side by side, 2 columns per page and 1 row per page should be requested in the graph layout dialog box. The histograms are displayed below.

Boxplot

A boxplot is a graph of the five-number summary: minimum value, first quartile Q1, median, third quartile Q3, and the maximum value. The distance from Q1 to Q3 is called the interquartile range (IQR). We will demonstrate the feature using the Framingham Heart Study data.

Click the Boxplot option in the Graphics menu. Suppose we wish to obtain side-by-side boxplots of systolic blood pressure for males and females.

To obtain side-by-side boxplots for males and females
To obtain separate boxplots for the two gender groups

Click the Next> button. You may choose to use fences when constructing the boxplots (optional). The inner fences are located a distance of 1.5 times the IQR to the left and right of Q1 and Q3, respectively, that is, 1.5 times the IQR below Q1 and above Q3. The outer fences are located a distance of 3 times the IQR to the left and right of Q1 and Q3, respectively. A point beyond an inner fence on either side is considered an outlier. A point beyond an outer fence is considered an extreme outlier.

You may choose to have the boxes corresponding to the two genders (groups) displayed vertically (default option) or horizontally.
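The fence rules above translate directly into arithmetic. The Python sketch below computes the inner and outer fences from given quartiles and classifies a point accordingly. The quartile values Q1 = 120 and Q3 = 150 are hypothetical, and the function names are introduced here only for illustration.

```python
def fences(q1, q3):
    """Return the inner and outer fences as (lower, upper) pairs."""
    iqr = q3 - q1  # interquartile range
    return {
        "inner": (q1 - 1.5 * iqr, q3 + 1.5 * iqr),
        "outer": (q1 - 3 * iqr, q3 + 3 * iqr),
    }


def classify(value, q1, q3):
    """Label a point as 'extreme outlier', 'outlier' or 'regular'."""
    f = fences(q1, q3)
    inner_low, inner_high = f["inner"]
    outer_low, outer_high = f["outer"]
    if value < outer_low or value > outer_high:
        return "extreme outlier"
    if value < inner_low or value > inner_high:
        return "outlier"
    return "regular"


# HYPOTHETICAL quartiles, for illustration only.
q1, q3 = 120, 150
print(fences(q1, q3))
for x in (130, 200, 250):
    print(x, classify(x, q1, q3))
```

With these assumed quartiles the IQR is 30, so the inner fences fall at 75 and 195 and the outer fences at 30 and 240.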

Remember that you may always change the appearance of your graph by clicking the Options button and editing the appropriate dialog boxes.

3. Displaying Data over Time: Time Plots

A data set collected over time is called a time series. A time plot is a graph of a time series. It plots each observation, on the vertical scale, against the time it was measured, on the horizontal scale. Time plots show trends or changes in data over a period of time. The information obtained by examining a time plot is especially meaningful when the time points at which the variable of interest is being measured are equally spaced. In this case we may label them with the consecutive integers 1, 2, 3, and so on.

(a) Index Plots

An index plot displays the values of a column (variable) versus the corresponding row index number. The row index numbers usually reflect the order in which the data have been collected. Consecutive points in the plot are connected with lines. In order to illustrate the tool we consider the following example. The sales for two department stores (in millions) from 1998 to 2005 are shown in the following table:

Year Department 1 Department

We will compare the sales for the two department stores with a time plot using the Index Plot feature. Click the Index Plot option in the Graphics pull-down menu.

Select the columns (variables) to be displayed in the plot. If more than one is selected, each of them will be color-coded and displayed in a single plot (default)
Check the box if each column is to be displayed in a separate plot.

As we wish to compare the sales data for the two department stores, we leave the Separate graph for each column check box unchecked. Click the Next> button and define labels for the axes, assign the title and specify the axis options. In the next dialog box you will specify the graph layout options. Finally, you will obtain the following graph:

The first tick on the horizontal axis is at the 1 point. Notice that you cannot change the labels (consecutive integers) below the ticks. You will be able to specify any axis labels you wish by using the Scatter Plot tool discussed below.

(b) Scatter Plot with Lines

The Scatter Plot tool available in the Graphics pull-down menu allows the user to obtain a plot of one quantitative variable versus another quantitative variable. We will discuss scatterplots in StatCrunch in detail in the Lab 2 Instructions. Here we will use the Scatter Plot tool in the special case when the variable plotted on the horizontal axis is time (in various units like minutes, hours, days, months, or years) and the variable plotted on the vertical axis is any quantitative variable varying over time. The points in this kind of scatterplot are connected by lines. The axis labels below the tick marks on the horizontal axis correspond to the values (numerical or categorical) specified in the appropriate column in the data.

We will demonstrate how to construct a time plot using the sales data for the two departments. However, to obtain a single time plot with the two variables, we have to rearrange the data as follows:

Click the Scatter Plot option in the Graphics pull-down menu.

Time is plotted on the horizontal (X) axis
Time series values are plotted on the vertical (Y) axis.
Enter an optional Where statement to specify the data rows to be included in the time plot. You may exclude some observations using the option.

Click the Next> button and select the Lines option (the consecutive points in the plot will be connected by straight lines). If you wish to have the points in the plot marked clearly, you may select both the Points and Lines options. Click the Next> button again and specify the graph layout, title and the axes titles.

Finally, the following plot will be obtained:

(c) Multi Plot

The Multi Plot tool available in the Graphics menu allows you to plot multiple pairs of points on the same graph or on separate graphs. Pairs may be plotted as points, connected with lines, or both.

Click the Add button to add the pairing to the plot. The pairing will then be displayed in the selection box. To delete the pairing, select it and click on Delete.

In the next dialog box you will specify the graph layout options. Finally, you will obtain a graph similar to the one on page


More information

Brief Guide on Using SPSS 10.0

Brief Guide on Using SPSS 10.0 Brief Guide on Using SPSS 10.0 (Use student data, 22 cases, studentp.dat in Dr. Chang s Data Directory Page) (Page address: http://www.cis.ysu.edu/~chang/stat/) I. Processing File and Data To open a new

More information

Math 227 EXCEL / MEGASTAT Guide

Math 227 EXCEL / MEGASTAT Guide Math 227 EXCEL / MEGASTAT Guide Introduction Introduction: Ch2: Frequency Distributions and Graphs Construct Frequency Distributions and various types of graphs: Histograms, Polygons, Pie Charts, Stem-and-Leaf

More information

Chapter 6: DESCRIPTIVE STATISTICS

Chapter 6: DESCRIPTIVE STATISTICS Chapter 6: DESCRIPTIVE STATISTICS Random Sampling Numerical Summaries Stem-n-Leaf plots Histograms, and Box plots Time Sequence Plots Normal Probability Plots Sections 6-1 to 6-5, and 6-7 Random Sampling

More information

Introduction to Minitab 1

Introduction to Minitab 1 Introduction to Minitab 1 We begin by first starting Minitab. You may choose to either 1. click on the Minitab icon in the corner of your screen 2. go to the lower left and hit Start, then from All Programs,

More information

1 Introduction to Using Excel Spreadsheets

1 Introduction to Using Excel Spreadsheets Survey of Math: Excel Spreadsheet Guide (for Excel 2007) Page 1 of 6 1 Introduction to Using Excel Spreadsheets This section of the guide is based on the file (a faux grade sheet created for messing with)

More information

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency Math 1 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency lowest value + highest value midrange The word average: is very ambiguous and can actually refer to the mean,

More information

The main issue is that the mean and standard deviations are not accurate and should not be used in the analysis. Then what statistics should we use?

The main issue is that the mean and standard deviations are not accurate and should not be used in the analysis. Then what statistics should we use? Chapter 4 Analyzing Skewed Quantitative Data Introduction: In chapter 3, we focused on analyzing bell shaped (normal) data, but many data sets are not bell shaped. How do we analyze quantitative data when

More information

Summarising Data. Mark Lunt 09/10/2018. Arthritis Research UK Epidemiology Unit University of Manchester

Summarising Data. Mark Lunt 09/10/2018. Arthritis Research UK Epidemiology Unit University of Manchester Summarising Data Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 09/10/2018 Summarising Data Today we will consider Different types of data Appropriate ways to summarise these

More information

How individual data points are positioned within a data set.

How individual data points are positioned within a data set. Section 3.4 Measures of Position Percentiles How individual data points are positioned within a data set. P k is the value such that k% of a data set is less than or equal to P k. For example if we said

More information

Numerical Descriptive Measures

Numerical Descriptive Measures Chapter 3 Numerical Descriptive Measures 1 Numerical Descriptive Measures Chapter 3 Measures of Central Tendency and Measures of Dispersion A sample of 40 students at a university was randomly selected,

More information

CHAPTER 1. Introduction. Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data.

CHAPTER 1. Introduction. Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data. 1 CHAPTER 1 Introduction Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data. Variable: Any characteristic of a person or thing that can be expressed

More information

How to Make Graphs in EXCEL

How to Make Graphs in EXCEL How to Make Graphs in EXCEL The following instructions are how you can make the graphs that you need to have in your project.the graphs in the project cannot be hand-written, but you do not have to use

More information

Section 2-2 Frequency Distributions. Copyright 2010, 2007, 2004 Pearson Education, Inc

Section 2-2 Frequency Distributions. Copyright 2010, 2007, 2004 Pearson Education, Inc Section 2-2 Frequency Distributions Copyright 2010, 2007, 2004 Pearson Education, Inc. 2.1-1 Frequency Distribution Frequency Distribution (or Frequency Table) It shows how a data set is partitioned among

More information

Part I, Chapters 4 & 5. Data Tables and Data Analysis Statistics and Figures

Part I, Chapters 4 & 5. Data Tables and Data Analysis Statistics and Figures Part I, Chapters 4 & 5 Data Tables and Data Analysis Statistics and Figures Descriptive Statistics 1 Are data points clumped? (order variable / exp. variable) Concentrated around one value? Concentrated

More information

Statistics Lecture 6. Looking at data one variable

Statistics Lecture 6. Looking at data one variable Statistics 111 - Lecture 6 Looking at data one variable Chapter 1.1 Moore, McCabe and Craig Probability vs. Statistics Probability 1. We know the distribution of the random variable (Normal, Binomial)

More information

Chapter 2 Assignment (due Thursday, April 19)

Chapter 2 Assignment (due Thursday, April 19) (due Thursday, April 19) Introduction: The purpose of this assignment is to analyze data sets by creating histograms and scatterplots. You will use the STATDISK program for both. Therefore, you should

More information

Statistical Methods. Instructor: Lingsong Zhang. Any questions, ask me during the office hour, or me, I will answer promptly.

Statistical Methods. Instructor: Lingsong Zhang. Any questions, ask me during the office hour, or  me, I will answer promptly. Statistical Methods Instructor: Lingsong Zhang 1 Issues before Class Statistical Methods Lingsong Zhang Office: Math 544 Email: lingsong@purdue.edu Phone: 765-494-7913 Office Hour: Monday 1:00 pm - 2:00

More information
