LAB 1 INSTRUCTIONS DESCRIBING AND DISPLAYING DATA
- Liliana Small
- 5 years ago
This lab will help you learn how to summarize and display categorical and quantitative data in StatCrunch. In particular, you will learn how to obtain frequency and contingency tables for categorical data and display the data with bar charts and pie charts. You will also learn how to obtain the appropriate measures of center and spread for quantitative data and display the data with histograms and boxplots. Finally, you will study how to display data over time with a time plot. Use this document as a reference while working on the Lab 1 assignment.

1. Summarizing and Displaying Categorical Data

Categorical variables such as gender (possible values: male, female) or marital status (possible values: never married, married, divorced) can be summarized by the counts (frequencies) or proportions (relative frequencies) of observations falling into each category (distinct value of the categorical variable).

To demonstrate the graphical and numerical tools in StatCrunch, we will use the Framingham Heart Study data file introduced in the Introductory Lab. However, we will add one more column, Smoker (column 4), to the introlabdata.txt data file. The new variable is defined below. For your convenience, the definitions of the other three variables in the data file are also provided:

Column   Variable   Description of Variable
1        Gender     M = male, F = female
2        Age        years
3        Systolic   systolic blood pressure (mm)
4        Smoker     0 if not a current smoker, 1 if current smoker

The extended data file is given in the table below:

Gender Age Systolic Smoker
F M M F M M M M F F M M F M

Add the entries in the last column (Smoker) to the introlabdata.txt data file used in the Introductory Lab.
(a) Summaries for Categorical Data: Frequency and Contingency Tables

Select the Tables option in the Stat menu.

Frequency and Relative Frequency Table

Then click the Frequency option. In its default setting, the feature provides the frequency and relative frequency for each distinct value within the selected columns. One frequency and relative frequency table is produced for each column (variable) selected. For example, to obtain the frequencies and relative frequencies of females and males with systolic blood pressure exceeding 135, fill in the dialog box as follows:

Select the columns to be used in the analysis.
Specify the data rows to be included in the analysis.

The frequency and relative frequency table will be displayed for the two gender groups. Notice that if you skip the Next button in the above dialog box and click the Calculate button directly, the frequency table for the default options will be obtained.

Contingency Table

The association between two categorical variables can be summarized with a contingency table. The rows of the table list the categories of one variable and the columns list the categories of the other variable. Each cell in the table contains the frequency of observations for the particular combination of values of the two variables. The contingency table can be obtained from raw data (Contingency Table with data) or from summary data (Contingency Table with summary). Select the Contingency with data option in the Tables menu. Fill in the corresponding dialog box as follows:
Select the column whose values will be categorized across the rows.
Select the column whose values will be categorized across the columns.
Specify the data rows to be included in the computation in the Where entry box (optional).
Select an optional Group By column. A separate contingency table will be obtained for each distinct value of the column.

Click the Next button to specify the information to be displayed in each cell of the contingency table: Row Percent and Column Percent. Leave the remaining boxes unchecked. Click Next. The following output will be generated:

The contingency table of Gender and Smoker variables

To obtain a contingency table when summaries are available (the above instructions apply to the situation when data are available), select the columns that contain the summary counts (0 and 1 in the above
example) and the column that contains the row labels (Gender). Then enter the name of the column variable (Smoker). Click the Next button, specify the additional information (row, column, and total percents), and click the Calculate button to display the results.

(b) Graphs for Categorical Data

Bar Plot

A bar chart uses vertical bars to display the frequency or relative frequency for all distinct values (categories) of the selected columns. The length of each bar is equal to the frequency or relative frequency of the corresponding value (category). Bar charts can be used to examine the association between two categorical variables such as gender and smoking status. You may obtain a bar chart either when data are available (bar plot with data) or when counts for each category are provided (bar plot with summary).

For example, to explore the association between the Gender and Smoker variables in the Framingham Heart Study data file, we can obtain a bar plot of the variable Smoker for each gender category.

A separate bar plot will be generated for each column selected.
Use an optional Where clause to specify the data rows to be included in the analysis.
Select an optional Group By column. The frequency or relative frequency of each distinct value of the selected column will be displayed with a bar. You may choose Split bars if you wish to obtain two bar graphs back-to-back, one for each gender.

If you skip the Next button and click the Create Graph! button directly, the default options will be applied to your graph. Click the Next button to obtain the following dialog box:
Choose between plotting the frequency or the relative frequency for each distinct value. Click the Next button. A dialog box will appear that allows you to specify the axis labels and the title of the bar plot.

Click the Next button to obtain a dialog box that allows you to customize the appearance of the bar plot. This dialog box is common to all graphical procedures in StatCrunch and usually appears as the last dialog screen before the graph is produced. In particular, you can specify the number of rows and columns per page. A page is defined here by the visible width and height of the browser window. By default, the number of rows and the number of columns per page are both one, so one graph per page is produced.
You may obtain two graphs side by side by entering 1 as the number of rows per page and 2 as the number of columns. Similarly, you can obtain two graphs in one column (one below the other) by entering the two parameters as 2 and 1, respectively. With the default settings, one graph per page will be produced. You may also change the colour scheme if required. Now click the Create Graph button to obtain the following graph:
Click the Options button in the above output window. The options allow you to:

edit the above graph (change the graph layout, the axis labels, or the graph title);
copy the graph to the Clipboard so it can be easily pasted into your report without saving;
save the graph as a GIF file.

Like most graphics in StatCrunch, the bar plot is interactive. To interact with the bar chart (or, in general, with any StatCrunch graph), click and drag the mouse to draw a rectangle within a desired object (for example, a bar in the bar chart) or around the desired object (a point in a scatterplot). The objects will be highlighted in the graph as well as in all other interactive graphs obtained for the data. Moreover, the corresponding observations in the data table will also be highlighted. Draw a small rectangle in any of the four bars in the above plot to explore the interactivity.

Now we will demonstrate how to use bar charts to compare the proportions of smokers among females and males. Click the Bar plot with data option and fill in the dialog box as follows:

The Relative Frequency option should be selected in the subsequent dialog box. Moreover, 2 columns per page and 1 row per page should be requested in the graph layout dialog box. The following output will be obtained:
Pie Chart

A pie chart consists of several slices, one for each distinct value of a categorical variable. The size of each slice corresponds to the percentage (relative frequency) of observations in the category. You may obtain a pie chart either when data are available (pie chart with data) or when counts for each category are provided (pie chart with summary). Select the Pie Chart with data option in the menu and fill in the corresponding dialog box as follows:

A separate chart will be obtained for each column selected; the slices in each chart correspond to the distinct values of the column.
You may enter an optional Where statement to specify the data rows to be included in the analysis.
Select an optional Group By column to obtain a separate pie chart for each distinct value of this column.

Click the Next> button. The following dialog box will appear:
For each category (displayed as a slice in the pie chart), three values are provided, separated by commas: the category name, the number of observations in the category, and the percentage of observations falling into the category. If you skip the Next> button and click the Create Graph! button directly, the default options will be applied to your pie chart.

In the 0 smoker category (nonsmokers), there are 3 females and they constitute 60% of females (three out of five). If you click the Next> button, you will obtain a pie chart for the variable Smoker for males.
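The frequency table, relative frequencies, and contingency table with row percents described in this section can be sketched in a few lines of Python. This is only an illustration outside StatCrunch: the Gender values are those listed in the data table above, but the Smoker values are hypothetical placeholders (chosen so that 3 of the 5 females are nonsmokers, as stated above), and the function names are made up for this sketch.

```python
from collections import Counter

# Gender values as listed in the lab's data table. The Smoker values are
# HYPOTHETICAL placeholders; the original column is not reproduced here.
gender = list("FMMFMMMMFFMMFM")
smoker = [0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0]


def frequency_table(values):
    """Return {category: (frequency, relative frequency)}."""
    counts = Counter(values)
    n = len(values)
    return {cat: (k, k / n) for cat, k in sorted(counts.items())}


def contingency_table(rows, cols):
    """Return {row category: {column category: count}}."""
    cells = Counter(zip(rows, cols))
    row_cats, col_cats = sorted(set(rows)), sorted(set(cols))
    return {r: {c: cells[(r, c)] for c in col_cats} for r in row_cats}


def row_percents(table):
    """Convert each row of counts to percentages of that row's total."""
    return {
        r: {c: 100 * k / sum(row.values()) for c, k in row.items()}
        for r, row in table.items()
    }


gender_freq = frequency_table(gender)
table = contingency_table(gender, smoker)
percents = row_percents(table)

print(gender_freq)   # frequency and relative frequency per gender
print(table)         # Gender (rows) by Smoker (columns) counts
print(percents)      # row percents, as in the Row Percent option
```

The row percents for females correspond to the slices of the pie chart for females: each slice is a category count divided by the number of females.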
2. Summarizing and Displaying Quantitative Data

Now you will learn how to obtain the measures of center and spread for quantitative data and how to display the data with histograms and boxplots.

(a) Summaries for Quantitative Data

StatCrunch provides several descriptive statistics for single variables (the columns selected) as well as measures that indicate the extent to which two variables co-vary (tend to rise or fall together). The former are produced by the Columns and Rows options, the latter by the Correlation and Covariance options in the Summary Stats submenu.

Columns

The Columns option provides the following descriptive statistics for the columns selected: sample size, mean, variance, standard deviation (Std. Dev.), standard error (Std. Err.), median, range, minimum, maximum, first quartile (Q1), and third quartile (Q3). Additional percentiles can also be requested by the user. Click the Columns option. The following dialog box will be displayed:

Select the columns for which summary statistics will be computed.
Enter an optional Where clause to specify the data rows to be included in the computation.
Select an optional Group By column to group results.

If a Group By column is selected, the output will be displayed in separate tables for each column selected (default). If you wish to have the output displayed for each group, choose the other radio button. Notice that the two radio buttons in the dialog box only allow the user to organize the output in the most convenient way; they do not affect the data analysis itself.
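For reference, the statistics listed for the Columns option can be sketched with Python's standard library. This is an illustration under stated assumptions, not StatCrunch's implementation: the data values are hypothetical, the variance and standard deviation are the sample versions (divisor n - 1), the standard error is taken as the standard deviation divided by the square root of n, and the quartile convention is Python's default, which may differ from the one StatCrunch uses.

```python
import math
import statistics


def column_summary(values):
    """Compute the summary statistics listed for the Columns option."""
    n = len(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    # Quartile conventions differ between packages; this uses Python's
    # default ("exclusive") method, which may not match StatCrunch exactly.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "n": n,
        "mean": statistics.mean(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "std_dev": sd,
        "std_err": sd / math.sqrt(n),
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "min": min(values),
        "max": max(values),
        "Q1": q1,
        "Q3": q3,
    }


# HYPOTHETICAL systolic readings, not the lab's data.
systolic = [110, 120, 130, 140, 150]
summary = column_summary(systolic)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A Where clause corresponds to filtering the list before calling the function, and a Group By column corresponds to calling it once per group.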
Suppose that we wish to obtain the summaries of the variable Systolic for non-smokers of each gender. In this case, the dialog box should be filled in as shown below:

Notice that because Table groups for each column is selected, the summaries for males and females will be provided in separate tables. Click the Next button. The following dialog box will appear:
Enter the requested percentiles separated by commas or spaces, e.g. 90, 99.
Check the option to have the output placed in the data table.

All summary statistics are selected by default (all entries in the left pane are selected). If you do not wish to compute some of the statistics, click the statistics to be removed in the left pane and they will be dropped from the list in the right pane. The statistics in the right pane will be displayed in the output (from right to left) in the order in which they are listed in the right pane. Finally, click the Calculate button to obtain the summaries.

Rows

The Rows option can be very useful when the entries in the columns of the data table for each row refer to the same object or subject. Examples are the sales data for each of four salespersons in a sales department over the last six months (the columns represent the sales figures for each of the six months), or the number of customers on each of 30 days for several postal outlets in a city. Consider the sales data example. Copy the following data into your StatCrunch data table.
Suppose now that we wish to obtain the summaries of sales for each of the four salespersons over the January-March period. We may wish to compare the summaries with those for the April-June period for each of the four salespersons. Click the Rows option in the Summary Stats submenu. The following dialog box will appear:

Now click the Next> button. The next dialog box will allow you to specify the summary statistics to be computed for each row. The default summaries are: count, sum, mean, variance, standard deviation, minimum, median, and maximum. Finally, the output will be displayed in the following form:

(b) Displaying Quantitative Data: Histograms and Boxplots

Now we will discuss the graphical tools to display quantitative data.

Histogram

The histogram is the most important statistical tool for displaying quantitative data. To obtain a histogram, we divide the range of the data into non-overlapping intervals of equal width (called class intervals), count the number of observations falling into each class interval, and erect a bar with height equal to the frequency (frequency histogram) or relative frequency (relative frequency histogram) over each class
interval. The heights of the bars in the density histogram are calculated by dividing the relative frequency by the class width, so that the total area of the bars equals 1. We assume that the left endpoint of each class interval is included and the right endpoint is excluded. The endpoints of the class intervals are called bins. The bins uniquely specify the class intervals if the starting bin and the common class interval width are provided.

Suppose we would like to compare the distributions of systolic blood pressure for the two gender groups in the Framingham Heart Study example. Click the Histogram option in the Graphics menu and fill in the dialog box as follows:

Select the columns (variables) to be displayed in the plot. A separate histogram will be produced for each column selected.
Specify the data rows to be included in the analysis. The clause is optional: if you do not enter anything into the box, the histogram will be obtained for all rows.
Select an optional Group By column to obtain a histogram for each distinct value of the column (variable).

Click the Next> button. In the next dialog box you will specify your class intervals by specifying the starting bin and the bin width.

Select the Frequency, Relative Frequency, or Density histogram.
The two bin entries are optional. If you don't enter anything, the bins will be generated automatically.
Click the Next> button to open the next dialog box. This dialog box allows you to superimpose one of the well-known statistical density functions upon the histogram of the data. For example, you might wish to see how well your data fit the density of a normal distribution. If you select the option, you will be required to enter the parameters of the density function. For our data, leave this selection set to optional and proceed to the next dialog box to specify the histogram layout options.

To obtain the two histograms (for females and males) displayed side by side, 2 columns per page and 1 row per page should be requested in the graph layout dialog box. The histograms are displayed below.
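The binning rule described in this section (equal-width class intervals, left endpoint included, right endpoint excluded, density equal to relative frequency divided by class width) can be sketched as follows. The data values, the starting bin, and the bin width are hypothetical choices for illustration, and the function name is made up for this sketch.

```python
import math
from collections import Counter


def histogram(values, start, width):
    """Bin values into [left, right) class intervals of equal width.

    Returns a list of tuples:
    (left, right, frequency, relative frequency, density).
    """
    n = len(values)
    # Index k of the class interval [start + k*width, start + (k+1)*width).
    counts = Counter(math.floor((v - start) / width) for v in values)
    rows = []
    for k in range(min(counts), max(counts) + 1):
        left = start + k * width
        freq = counts.get(k, 0)
        rel = freq / n
        rows.append((left, left + width, freq, rel, rel / width))
    return rows


# HYPOTHETICAL systolic readings, starting bin, and bin width.
systolic = [110, 118, 120, 125, 129, 130, 134, 142, 150, 158]
rows = histogram(systolic, start=110, width=10)

for left, right, freq, rel, dens in rows:
    print(f"[{left}, {right}): freq={freq} rel={rel:.2f} density={dens:.3f}")

# The bar areas (density times width) of a density histogram add up to 1.
total_area = sum(dens * (right - left) for left, right, _, _, dens in rows)
print(total_area)
```

Note that the value 120 falls into the interval [120, 130), not [110, 120), because the right endpoint of each class interval is excluded.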
Boxplot

A boxplot is a graph of the five-number summary: minimum value, first quartile Q1, median, third quartile Q3, and maximum value. The distance from Q1 to Q3 is called the interquartile range (IQR). We will demonstrate the feature using the Framingham Heart Study data. Click the Boxplot option in the Graphics menu. Suppose we wish to obtain side-by-side boxplots of systolic blood pressure for males and females. The dialog box offers one setting to obtain side-by-side boxplots for males and females and another to obtain separate boxplots for the two gender groups.

Click the Next> button. You may choose to use fences when constructing the boxplots (optional). The inner fences are located at a distance of 1.5 times the IQR below Q1 and above Q3. The outer fences are located at a distance of 3 times the IQR below Q1 and above Q3. A point beyond an inner fence on either side is considered an outlier. A point beyond an outer fence is considered an extreme outlier.

You may choose to have the boxes corresponding to the two genders (groups) displayed vertically (default option) or horizontally.
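The fence rule above can be written out directly. The sketch below is an illustration only: it takes Q1 and Q3 as given (so it does not depend on any particular quartile convention), and the quartile values and function names are hypothetical.

```python
def fences(q1, q3):
    """Return the inner and outer fences for given quartiles."""
    iqr = q3 - q1
    return {
        "inner": (q1 - 1.5 * iqr, q3 + 1.5 * iqr),
        "outer": (q1 - 3 * iqr, q3 + 3 * iqr),
    }


def classify(x, q1, q3):
    """Label a point as 'regular', 'outlier', or 'extreme outlier'."""
    f = fences(q1, q3)
    inner_low, inner_high = f["inner"]
    outer_low, outer_high = f["outer"]
    if x < outer_low or x > outer_high:
        return "extreme outlier"
    if x < inner_low or x > inner_high:
        return "outlier"
    return "regular"


# HYPOTHETICAL quartiles of systolic blood pressure: IQR = 140 - 120 = 20.
q1, q3 = 120, 140
print(fences(q1, q3))
for reading in (130, 175, 205):
    print(reading, classify(reading, q1, q3))
```

With these quartiles the inner fences are 90 and 170 and the outer fences are 60 and 200, so a reading of 175 lies between an inner and an outer fence and a reading of 205 lies beyond the outer fence.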
Remember that you may always change the appearance of your graph by clicking the Options button and editing the appropriate dialog boxes.

3. Displaying Data over Time: Time Plots

A data set collected over time is called a time series. A time plot is a graph of a time series. It plots each observation, on the vertical scale, against the time at which it was measured, on the horizontal scale. Time plots show trends or changes in data over a period of time. The information obtained by examining a time plot is especially meaningful when the time points at which the variable of interest is measured are equally spaced. In this case we may label them with the consecutive integers 1, 2, 3, ...

(a) Index Plots

An index plot displays the values of a column (variable) versus the corresponding row index number. The row index numbers usually reflect the order in which the data have been collected. Consecutive points in the plot are connected with lines. To illustrate the tool we consider the following example. The sales for two department stores (in millions) from 1998 to 2005 are shown in the following table:

Year Department 1 Department
We will compare the sales of the two department stores with a time plot using the Index Plot feature. Click Index Plot in the Graphics pull-down menu.

Select the columns (variables) to be displayed in the plot. If more than one is selected, each of them will be color-coded and displayed in a single plot (default).
Check the box if each column is to be displayed in a separate plot.

As we wish to compare the sales data for the two department stores, we leave the Separate graph for each column check box unchecked. Click the Next> button to define the axis labels, assign the title, and specify the axis options. In the next dialog box you will specify the graph layout options. Finally, you will obtain the following graph:
The first tick on the horizontal axis is at the point 1. Notice that you cannot change the labels (consecutive integers) below the ticks. You will be able to specify any axis labels you wish by using the Scatter Plot tool discussed below.

(b) Scatter Plot with Lines

The Scatter Plot tool available in the Graphics pull-down menu allows the user to obtain a plot of one quantitative variable versus another quantitative variable. We will discuss scatterplots in StatCrunch in detail in the Lab 2 Instructions. Here we will use the Scatter Plot tool in the special case when the variable plotted on the horizontal axis is time (in units such as minutes, hours, days, months, or years) and the variable plotted on the vertical axis is any quantitative variable varying over time. The points in this kind of scatterplot are connected by lines. The axis labels below the tick marks on the horizontal axis correspond to the values (numerical or categorical) specified in the appropriate column of the data.

We will demonstrate how to construct a time plot using the sales data for the two departments. However, to obtain a single time plot with the two variables, we had to rearrange the data as follows:
Click the Scatter Plot option in the Graphics pull-down menu.

Time is plotted on the horizontal (X) axis.
Time series values are plotted on the vertical (Y) axis.
Enter an optional Where statement to specify the data rows to be included in the time plot. You may exclude some observations using this option.

Click the Next> button and select the Lines option (the consecutive points in the plot will be connected by straight lines). If you wish to have the points in the plot marked clearly, you may select both the Points and Lines options. Click the Next> button again and specify the graph layout, the title, and the axis titles.
Finally, the following plot will be obtained:
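The rearranged data table is not reproduced in this transcript. One plausible layout, shown here purely as an assumption, stacks the two sales columns into a single column and adds a column that identifies the department, so that each row is one (year, sales, department) observation. The sales figures and the function name below are hypothetical; only the years 1998 to 2005 come from the text.

```python
# Years from the example; the sales figures are HYPOTHETICAL placeholders.
years = list(range(1998, 2006))
wide = {
    "Department 1": [2.1, 2.4, 2.6, 2.5, 2.9, 3.1, 3.0, 3.4],
    "Department 2": [1.8, 1.9, 2.3, 2.7, 2.6, 2.8, 3.2, 3.3],
}


def to_long(years, wide):
    """Stack one column per department into (year, sales, department) rows."""
    rows = []
    for name, series in wide.items():
        for year, value in zip(years, series):
            rows.append((year, value, name))
    return rows


long_rows = to_long(years, wide)
for row in long_rows:
    print(row)
```

In this stacked form the year column can serve as the X variable, the sales column as the Y variable, and the department column as a grouping variable, which is one way a single plot can show both series.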
(c) Multi Plot

The Multi Plot tool available in the Graphics menu allows you to plot multiple pairs of columns on the same graph or on separate graphs. Pairs may be plotted as points, connected with lines, or both plotted as points and connected with lines. Click the Add button to add a pairing to the plot. The pairing will then be displayed in the selection box. To delete a pairing, select it and click Delete. In the next dialog box you will specify the graph layout options. Finally, you will obtain a graph similar to the one on page
More informationNumerical Descriptive Measures
Chapter 3 Numerical Descriptive Measures 1 Numerical Descriptive Measures Chapter 3 Measures of Central Tendency and Measures of Dispersion A sample of 40 students at a university was randomly selected,
More informationCHAPTER 1. Introduction. Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data.
1 CHAPTER 1 Introduction Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data. Variable: Any characteristic of a person or thing that can be expressed
More informationHow to Make Graphs in EXCEL
How to Make Graphs in EXCEL The following instructions are how you can make the graphs that you need to have in your project.the graphs in the project cannot be hand-written, but you do not have to use
More informationSection 2-2 Frequency Distributions. Copyright 2010, 2007, 2004 Pearson Education, Inc
Section 2-2 Frequency Distributions Copyright 2010, 2007, 2004 Pearson Education, Inc. 2.1-1 Frequency Distribution Frequency Distribution (or Frequency Table) It shows how a data set is partitioned among
More informationPart I, Chapters 4 & 5. Data Tables and Data Analysis Statistics and Figures
Part I, Chapters 4 & 5 Data Tables and Data Analysis Statistics and Figures Descriptive Statistics 1 Are data points clumped? (order variable / exp. variable) Concentrated around one value? Concentrated
More informationStatistics Lecture 6. Looking at data one variable
Statistics 111 - Lecture 6 Looking at data one variable Chapter 1.1 Moore, McCabe and Craig Probability vs. Statistics Probability 1. We know the distribution of the random variable (Normal, Binomial)
More informationChapter 2 Assignment (due Thursday, April 19)
(due Thursday, April 19) Introduction: The purpose of this assignment is to analyze data sets by creating histograms and scatterplots. You will use the STATDISK program for both. Therefore, you should
More informationStatistical Methods. Instructor: Lingsong Zhang. Any questions, ask me during the office hour, or me, I will answer promptly.
Statistical Methods Instructor: Lingsong Zhang 1 Issues before Class Statistical Methods Lingsong Zhang Office: Math 544 Email: lingsong@purdue.edu Phone: 765-494-7913 Office Hour: Monday 1:00 pm - 2:00
More informationRelease notes for StatCrunch mid-march 2015 update
Release notes for StatCrunch mid-march 2015 update A major StatCrunch update was made on March 18, 2015. This document describes the content of the update including major additions to StatCrunch that were
More informationAt the end of the chapter, you will learn to: Present data in textual form. Construct different types of table and graphs
DATA PRESENTATION At the end of the chapter, you will learn to: Present data in textual form Construct different types of table and graphs Identify the characteristics of a good table and graph Identify
More informationCreating a Basic Chart in Excel 2007
Creating a Basic Chart in Excel 2007 A chart is a pictorial representation of the data you enter in a worksheet. Often, a chart can be a more descriptive way of representing your data. As a result, those
More informationSPSS. (Statistical Packages for the Social Sciences)
Inger Persson SPSS (Statistical Packages for the Social Sciences) SHORT INSTRUCTIONS This presentation contains only relatively short instructions on how to perform basic statistical calculations in SPSS.
More informationIAT 355 Visual Analytics. Data and Statistical Models. Lyn Bartram
IAT 355 Visual Analytics Data and Statistical Models Lyn Bartram Exploring data Example: US Census People # of people in group Year # 1850 2000 (every decade) Age # 0 90+ Sex (Gender) # Male, female Marital
More informationOrganizing and Summarizing Data
1 Organizing and Summarizing Data Key Definitions Frequency Distribution: This lists each category of data and how often they occur. : The percent of observations within the one of the categories. This
More informationDAY 52 BOX-AND-WHISKER
DAY 52 BOX-AND-WHISKER VOCABULARY The Median is the middle number of a set of data when the numbers are arranged in numerical order. The Range of a set of data is the difference between the highest and
More information2.1: Frequency Distributions and Their Graphs
2.1: Frequency Distributions and Their Graphs Frequency Distribution - way to display data that has many entries - table that shows classes or intervals of data entries and the number of entries in each
More informationAP Statistics Summer Assignment:
AP Statistics Summer Assignment: Read the following and use the information to help answer your summer assignment questions. You will be responsible for knowing all of the information contained in this
More informationPractical 2: Using Minitab (not assessed, for practice only!)
Practical 2: Using Minitab (not assessed, for practice only!) Instructions 1. Read through the instructions below for Accessing Minitab. 2. Work through all of the exercises on this handout. If you need
More informationPlotting Graphs. Error Bars
E Plotting Graphs Construct your graphs in Excel using the method outlined in the Graphing and Error Analysis lab (in the Phys 124/144/130 laboratory manual). Always choose the x-y scatter plot. Number
More informationCHAPTER 2: SAMPLING AND DATA
CHAPTER 2: SAMPLING AND DATA This presentation is based on material and graphs from Open Stax and is copyrighted by Open Stax and Georgia Highlands College. OUTLINE 2.1 Stem-and-Leaf Graphs (Stemplots),
More informationChapter 2 Organizing and Graphing Data. 2.1 Organizing and Graphing Qualitative Data
Chapter 2 Organizing and Graphing Data 2.1 Organizing and Graphing Qualitative Data 2.2 Organizing and Graphing Quantitative Data 2.3 Stem-and-leaf Displays 2.4 Dotplots 2.1 Organizing and Graphing Qualitative
More informationICT & MATHS. Excel 2003 in Mathematics Teaching
ICT & MATHS Excel 2003 in Mathematics Teaching Published by The National Centre for Technology in Education in association with the Project Maths Development Team. Permission granted to reproduce for educational
More informationNCSS Statistical Software
Chapter 152 Introduction When analyzing data, you often need to study the characteristics of a single group of numbers, observations, or measurements. You might want to know the center and the spread about
More informationMATH& 146 Lesson 10. Section 1.6 Graphing Numerical Data
MATH& 146 Lesson 10 Section 1.6 Graphing Numerical Data 1 Graphs of Numerical Data One major reason for constructing a graph of numerical data is to display its distribution, or the pattern of variability
More informationChapter 3 Analyzing Normal Quantitative Data
Chapter 3 Analyzing Normal Quantitative Data Introduction: In chapters 1 and 2, we focused on analyzing categorical data and exploring relationships between categorical data sets. We will now be doing
More informationTest Bank for Privitera, Statistics for the Behavioral Sciences
1. A simple frequency distribution A) can be used to summarize grouped data B) can be used to summarize ungrouped data C) summarizes the frequency of scores in a given category or range 2. To determine
More informationGraphical Presentation for Statistical Data (Relevant to AAT Examination Paper 4: Business Economics and Financial Mathematics) Introduction
Graphical Presentation for Statistical Data (Relevant to AAT Examination Paper 4: Business Economics and Financial Mathematics) Y O Lam, SCOPE, City University of Hong Kong Introduction The most convenient
More informationChapter 5. Understanding and Comparing Distributions. Copyright 2010, 2007, 2004 Pearson Education, Inc.
Chapter 5 Understanding and Comparing Distributions The Big Picture We can answer much more interesting questions about variables when we compare distributions for different groups. Below is a histogram
More informationMath 121 Project 4: Graphs
Math 121 Project 4: Graphs Purpose: To review the types of graphs, and use MS Excel to create them from a dataset. Outline: You will be provided with several datasets and will use MS Excel to create graphs.
More informationChapter 2. Descriptive Statistics: Organizing, Displaying and Summarizing Data
Chapter 2 Descriptive Statistics: Organizing, Displaying and Summarizing Data Objectives Student should be able to Organize data Tabulate data into frequency/relative frequency tables Display data graphically
More informationUnivariate Statistics Summary
Further Maths Univariate Statistics Summary Types of Data Data can be classified as categorical or numerical. Categorical data are observations or records that are arranged according to category. For example:
More informationPrepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order.
Chapter 2 2.1 Descriptive Statistics A stem-and-leaf graph, also called a stemplot, allows for a nice overview of quantitative data without losing information on individual observations. It can be a good
More informationThe basic arrangement of numeric data is called an ARRAY. Array is the derived data from fundamental data Example :- To store marks of 50 student
Organizing data Learning Outcome 1. make an array 2. divide the array into class intervals 3. describe the characteristics of a table 4. construct a frequency distribution table 5. constructing a composite
More informationGraphical Analysis of Data using Microsoft Excel [2016 Version]
Graphical Analysis of Data using Microsoft Excel [2016 Version] Introduction In several upcoming labs, a primary goal will be to determine the mathematical relationship between two variable physical parameters.
More informationThings you ll know (or know better to watch out for!) when you leave in December: 1. What you can and cannot infer from graphs.
1 2 Things you ll know (or know better to watch out for!) when you leave in December: 1. What you can and cannot infer from graphs. 2. How to construct (in your head!) and interpret confidence intervals.
More informationThis chapter will show how to organize data and then construct appropriate graphs to represent the data in a concise, easy-to-understand form.
CHAPTER 2 Frequency Distributions and Graphs Objectives Organize data using frequency distributions. Represent data in frequency distributions graphically using histograms, frequency polygons, and ogives.
More informationCreate a bar graph that displays the data from the frequency table in Example 1. See the examples on p Does our graph look different?
A frequency table is a table with two columns, one for the categories and another for the number of times each category occurs. See Example 1 on p. 247. Create a bar graph that displays the data from the
More informationExcel Tips and FAQs - MS 2010
BIOL 211D Excel Tips and FAQs - MS 2010 Remember to save frequently! Part I. Managing and Summarizing Data NOTE IN EXCEL 2010, THERE ARE A NUMBER OF WAYS TO DO THE CORRECT THING! FAQ1: How do I sort my
More informationLearner Expectations UNIT 1: GRAPICAL AND NUMERIC REPRESENTATIONS OF DATA. Sept. Fathom Lab: Distributions and Best Methods of Display
CURRICULUM MAP TEMPLATE Priority Standards = Approximately 70% Supporting Standards = Approximately 20% Additional Standards = Approximately 10% HONORS PROBABILITY AND STATISTICS Essential Questions &
More information1.3 Graphical Summaries of Data
Arkansas Tech University MATH 3513: Applied Statistics I Dr. Marcel B. Finan 1.3 Graphical Summaries of Data In the previous section we discussed numerical summaries of either a sample or a data. In this
More informationChapter 5: The beast of bias
Chapter 5: The beast of bias Self-test answers SELF-TEST Compute the mean and sum of squared error for the new data set. First we need to compute the mean: + 3 + + 3 + 2 5 9 5 3. Then the sum of squared
More informationQuick introduction to descriptive statistics and graphs in. R Commander. Written by: Robin Beaumont
Quick introduction to descriptive statistics and graphs in R Commander Written by: Robin Beaumont e-mail: robin@organplayers.co.uk http://www.robin-beaumont.co.uk/virtualclassroom/stats/course1.html Date
More informationAssignment 3 due Thursday Oct. 11
Instructor Linda C. Stephenson due Thursday Oct. 11 GENERAL NOTE: These assignments often build on each other what you learn in one assignment may be carried over to subsequent assignments. If I have already
More informationUnderstanding and Comparing Distributions. Chapter 4
Understanding and Comparing Distributions Chapter 4 Objectives: Boxplot Calculate Outliers Comparing Distributions Timeplot The Big Picture We can answer much more interesting questions about variables
More informationCombo Charts. Chapter 145. Introduction. Data Structure. Procedure Options
Chapter 145 Introduction When analyzing data, you often need to study the characteristics of a single group of numbers, observations, or measurements. You might want to know the center and the spread about
More informationStatCrunch. Background Material and Guided Tours. Tom Trollen. Division of Business and Computer Information Systems Scottsdale Community College
StatCrunch Background Material and Guided Tours Tom Trollen Division of Business and Computer Information Systems Scottsdale Community College Table of Contents INTRODUCTION... 1 STATCRUNCH... 1 OVERVIEW...
More informationChapter 2: Descriptive Statistics
Chapter 2: Descriptive Statistics Student Learning Outcomes By the end of this chapter, you should be able to: Display data graphically and interpret graphs: stemplots, histograms and boxplots. Recognize,
More informationError-Bar Charts from Summary Data
Chapter 156 Error-Bar Charts from Summary Data Introduction Error-Bar Charts graphically display tables of means (or medians) and variability. Following are examples of the types of charts produced by
More informationChapter 2 - Graphical Summaries of Data
Chapter 2 - Graphical Summaries of Data Data recorded in the sequence in which they are collected and before they are processed or ranked are called raw data. Raw data is often difficult to make sense
More informationUsing Excel for Graphical Analysis of Data
Using Excel for Graphical Analysis of Data Introduction In several upcoming labs, a primary goal will be to determine the mathematical relationship between two variable physical parameters. Graphs are
More informationFrequency Tables. Chapter 500. Introduction. Frequency Tables. Types of Categorical Variables. Data Structure. Missing Values
Chapter 500 Introduction This procedure produces tables of frequency counts and percentages for categorical and continuous variables. This procedure serves as a summary reporting tool and is often used
More informationChapter 5snow year.notebook March 15, 2018
Chapter 5: Statistical Reasoning Section 5.1 Exploring Data Measures of central tendency (Mean, Median and Mode) attempt to describe a set of data by identifying the central position within a set of data
More informationECLT 5810 Data Preprocessing. Prof. Wai Lam
ECLT 5810 Data Preprocessing Prof. Wai Lam Why Data Preprocessing? Data in the real world is imperfect incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate
More informationMinitab Notes for Activity 1
Minitab Notes for Activity 1 Creating the Worksheet 1. Label the columns as team, heat, and time. 2. Have Minitab automatically enter the team data for you. a. Choose Calc / Make Patterned Data / Simple
More informationFathom Dynamic Data TM Version 2 Specifications
Data Sources Fathom Dynamic Data TM Version 2 Specifications Use data from one of the many sample documents that come with Fathom. Enter your own data by typing into a case table. Paste data from other
More informationAcquisition Description Exploration Examination Understanding what data is collected. Characterizing properties of data.
Summary Statistics Acquisition Description Exploration Examination what data is collected Characterizing properties of data. Exploring the data distribution(s). Identifying data quality problems. Selecting
More informationApplied Regression Modeling: A Business Approach
i Applied Regression Modeling: A Business Approach Computer software help: SAS SAS (originally Statistical Analysis Software ) is a commercial statistical software package based on a powerful programming
More information