Excel 2010 with XLSTAT

Jennifer Lewis Priestley, Ph.D.
ISBN-13: 978-0-321-74775-4 ISBN-10: 0-321-74775-5

Introduction to Excel 2010 with XLSTAT

The layout for Excel 2010 is slightly different from the layout for Excel 2007. However, with a little practice, students should find that the new layout is logical and easy to use. The tabs across the top can be thought of as tool boxes, where the options within each tab represent the individual tools available in Excel. Once XLSTAT has been installed, its functionality can be found under the XLSTAT tab. When the XLSTAT tab is selected, the options for analysis are grouped into logical categories such as "Discover, explain and predict" and "Test a hypothesis".

Descriptive Statistics and Confidence Intervals for Means

Use the following procedure to find the complete description of a variable, including the mean, median, and standard deviation.

1. Select XLSTAT>Describing Data>Descriptive Statistics.
2. The screen should default to the General tab. Place your cursor inside the first box, Quantitative data:. Click on the first cell of the column of the variable to be analyzed; this cell should contain the variable name. Highlight the entire variable column.
3. The Options and Charts tabs will default to the required statistics; explore these tabs to determine the full range of options.
4. To generate the 95% confidence interval around the mean, select the Outputs tab. Scroll through the list of output options available for Quantitative Data:.
5. Check the two boxes corresponding to the lower and upper bounds of the confidence interval. Select OK.

The output lists the descriptive statistics. Notice that the variable name from the first row appears in the output. This is particularly useful when analyzing multiple variables simultaneously.

To generate descriptive statistics for multiple variables simultaneously, select all of the variables of interest, following the instructions from step 2. Ensure that the variable names from the first row are captured. Select OK.

Scatterplot

1. Select XLSTAT>Visualizing Data>Scatter plots.
2. Place your cursor inside the X: box. Click on the first row of the variable to be assigned to the (horizontal) x-axis and highlight the full range of data for the variable. Do the same for the Y: box. This variable will be assigned to the (vertical) y-axis. Note that both variables must be quantitative and have the same number of rows. The Options tab will default to the required statistics. Click OK.
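As a cross-check outside Excel, the descriptive statistics and confidence interval from the Descriptive Statistics procedure above can be sketched in Python. This is a minimal illustration and not XLSTAT's implementation: the function name describe and the heights data are hypothetical, and the interval uses the normal (z) approximation because the Python standard library has no Student's t distribution. For small samples a t-based interval would be somewhat wider.

```python
import math
import statistics
from statistics import NormalDist


def describe(values, confidence=0.95):
    """Return basic descriptive statistics and a z-based interval for the mean."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    # Two-sided critical value from the standard normal distribution.
    # Assumption: large-sample approximation, not a t critical value.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sd / math.sqrt(n)
    return {
        "n": n,
        "mean": mean,
        "median": statistics.median(values),
        "stdev": sd,
        "ci_lower": mean - half_width,
        "ci_upper": mean + half_width,
    }


# Hypothetical data, for illustration only.
heights = [61.0, 64.5, 66.0, 67.5, 68.0, 69.5, 70.0, 71.5, 72.0, 74.0]
summary = describe(heights)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

The dictionary keys mirror the items checked on the Outputs tab, so the printed bounds can be compared with the lower and upper bounds reported in the worksheet.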

Correlation

1. Select XLSTAT>Test a Hypothesis>Correlation/Association tests>Correlation tests.
2. Place your cursor inside the Observations/variables table: box. Click on the first row of the first variable to be analyzed and highlight the range of variables to be included, then highlight the columns of data to be included. Note: The quantitative data to be included in the correlation analysis should be in columns next to each other. If this is not the case, reposition the variables in the data set.
3. On the General tab, the type of correlation analysis will default to Pearson, which is typically used with quantitative data of sample sizes greater than 30. The other options are Spearman and Kendall correlation values. These options are typically used with ordinal data or data with fewer than 30 observations. The significance level will default to 5% (95% confidence), but can be easily changed.
4. Select the Missing data tab. The first option, Do not accept missing data, will stop the analysis from running if there are any missing values in the data set. The second option, Remove the observations, will drop any observation that has a missing value from all correlations, even if the missing value was not required for a given correlation. The third option, Pairwise deletion, will ignore observations only when the missing values were required for the correlation being computed. The fourth option, Estimate missing data, will replace any missing values using the imputation option selected (mean or mode, or nearest neighbor). Be certain to select the option most appropriate for your data.
5. The Outputs and Charts tabs will default to the required statistics and plots. Select OK.

Regression Modeling, Finding the Equation of Regression Line, and Residual Plots

Use the following procedure to generate a linear regression model.

1. Select XLSTAT>Modeling Data>Linear regression.
2. Place your cursor inside the Y/Dependent variables: box. Click on the first row of the variable to be assigned as the dependent variable (the variable to be predicted or explained) and highlight the entire data range for the variable. Do the same with the X/Explanatory variables: box. Note that several variables can be included here. All of the variables selected for these two roles must be quantitative; there is a third box reserved for qualitative variables, should one be required for a model.
3. The Options and Validation tabs will default to the required statistics and plots. Select OK.

At the bottom of the regression output, several plots of residuals can be found to help assess the stability and generalizability of the model.

Displaying Categorical Data Using Frequency Counts, Bar Charts, and Pie Charts

Use the following procedure to generate a frequency table, a bar chart, and a pie chart for a categorical variable.

1. Select XLSTAT>Describing data>Descriptive statistics.
2. Place your cursor inside the Qualitative data: box. Click on the first row of the variable to be analyzed and highlight the entire column of data. The Options and Outputs tabs will default to the required statistics and plots. Select the Charts (2) tab and check both the Bar Chart and the Pie Chart boxes. Select OK.
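The Pearson correlation and the regression line from the Correlation and Regression Modeling procedures above can be reproduced by hand, which helps when checking the output. The sketch below is a minimal illustration under stated assumptions: the function names pearson_r and fit_line and the x and y data are hypothetical, only one explanatory variable is handled, and no significance test is computed.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
intercept, slope = fit_line(x, y)
# Residual = observed value minus fitted value; these are what the residual plots show.
residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]

print(f"r = {r:.4f}")
print(f"y = {intercept:.2f} + {slope:.2f} x")
print("residuals:", [round(e, 3) for e in residuals])
```

For a least-squares fit with an intercept the residuals sum to zero, which is a quick sanity check on any regression output.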

Histogram

1. Select XLSTAT>Visualizing Data>Histogram.
2. Place your cursor inside the Data: box. Click on the first row of the variable to be analyzed and highlight the entire column of data. The Options, Missing data, and remaining tabs will default to the required statistics and plots. Select OK.

Boxplot

1. Select XLSTAT>Visualizing Data>Univariate plots.
2. Place your cursor inside the Quantitative data: box. Click on the first row of the variable to be analyzed and highlight the entire column of data. The Options and Outputs tabs will default to the required statistics and plots. Select OK.
3. To create side-by-side boxplots of a quantitative variable by different values of a qualitative variable (such as gender), after selecting the quantitative variable in step 2, check the Subsamples: box. Place your cursor in the larger box and select the qualitative variable to be included. Select the Outputs tab and check the Group plots box. Select OK.

Assessing Normality and Goodness of Fit

Use the following procedure to assess the normality of a quantitative variable or its goodness of fit to a particular distribution.

1. Select XLSTAT>Modeling data>Distribution fitting.
2. Place your cursor inside the Data: box. Click on the first row of the variable to be analyzed and highlight the entire range of the data. To test whether the data follow a normal distribution, ensure that Normal appears in the Distribution: box.
3. Select the Options tab. XLSTAT provides two tests to assess the fit of the data to the theoretical distribution selected on the General tab. The Chi-square goodness of fit test is a parametric test using the distance between the histogram of the theoretical distribution and the histogram of the empirical distribution of the sample. The histograms are calculated using k intervals, selected in the Number: box. This test is better for discrete data. The Kolmogorov-Smirnov goodness of fit test is an exact nonparametric test based on the maximum distance between a theoretical distribution function and the empirical distribution function of the sample. This test can be used only for continuous distributions. Select the test most appropriate for the data.
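To make the Kolmogorov-Smirnov idea concrete, the maximum distance between the empirical distribution function and a fitted normal distribution function can be computed directly. The sketch below is an illustration only: the function name ks_statistic_normal and both data sets are hypothetical, the normal parameters are estimated from the sample (an assumption about how the test is set up), and no p-value is computed. When the parameters are estimated from the same data, standard Kolmogorov-Smirnov tables do not apply without a correction.

```python
from statistics import NormalDist, mean, stdev


def ks_statistic_normal(values):
    """Maximum distance between the empirical CDF and a fitted normal CDF."""
    data = sorted(values)
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))
    d = 0.0
    for i, x in enumerate(data):
        f = fitted.cdf(x)
        # The empirical CDF jumps from i/n to (i + 1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Hypothetical samples: one built from evenly spaced normal quantiles,
# one strongly right-skewed.
near_normal = [NormalDist().inv_cdf((i + 0.5) / 50) for i in range(50)]
skewed = [2.0 ** i for i in range(10)]

d_normal = ks_statistic_normal(near_normal)
d_skewed = ks_statistic_normal(skewed)
print(f"near-normal sample: D = {d_normal:.3f}")
print(f"skewed sample:      D = {d_skewed:.3f}")
```

A larger D means the sample departs further from the fitted normal curve, which is what drives the p-value reported by the test.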

4. The Options, Missing data, Outputs, and Charts tabs will default to the required statistics and charts. Select OK.
5. The output reports the results of the Kolmogorov-Smirnov test and the Chi-square test.

Note: The Kolmogorov-Smirnov and Chi-square test results indicate whether the data are consistent with a normal distribution. For these tests, a low p-value (less than the alpha value) is evidence that the distribution is not normal.

Sampling

1. Select XLSTAT>Preparing data>Data sampling. Place your cursor inside the Data: box. Highlight the entire range of the data (all variables and all observations).
2. The Sampling: box includes several options. If the data have been sorted in any way, the first two options, N first rows and N last rows, may not be appropriate. For a simple random sample, the third option, Random without replacement, may be most appropriate. Review all options prior to making a selection.
3. Enter the required number of observations for the sample in the Sample size: box. Select OK.

The resulting output will be a random subset of the original dataset.

Hypothesis Test and Confidence Interval for a Single Proportion

1. Select XLSTAT>Test a hypothesis>Parametric tests>Tests for one proportion.
2. In the Frequency: box, enter the frequency of the condition of interest. For example, if the sample includes 119 people, 62 of whom are women, and the test and confidence interval will be executed on the proportion of women, enter a value of 62. In the Sample size: box, enter the total number of individuals in the sample. Note that this information can be obtained by generating the descriptive statistics for the qualitative variable of interest (Describing data>Descriptive statistics). In the Test proportion: box, enter the theoretical proportion against which the sample proportion is being tested. If there is no test being conducted, enter a value of 0.50.
3. Select the Options tab. The Alternative hypothesis: box provides three options for hypothesis testing: a two-tailed test (the default) and a one-tailed test in each direction (less than and greater than the hypothesized difference between the population proportion and the Test proportion). Select the most appropriate option. In the Hypothesized difference: box, enter the value of the hypothesized difference (typically, but not always, 0). In the Significance level (%): box, enter the alpha value for the test (typically, but not always, 5, meaning 5%). Note that this also corresponds to a 95% confidence interval. The confidence interval options represent slightly different calculations of intervals. Review the differences and select the option most appropriate for your data. Select OK.
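Using the example from step 2 of the single-proportion procedure (62 women in a sample of 119, tested against 0.50), the calculation can be sketched as follows. This is a minimal illustration, not XLSTAT's implementation: the function name one_proportion_test is hypothetical, the test is the two-tailed large-sample z-test, and the interval shown is the simple Wald interval, which is only one of the several interval calculations the dialog offers.

```python
import math
from statistics import NormalDist


def one_proportion_test(successes, n, p0=0.5, alpha=0.05):
    """Two-tailed z-test for one proportion plus a Wald confidence interval."""
    p_hat = successes / n
    # Standard error under the null hypothesis, used for the test statistic.
    se_null = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se_null
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Standard error at the observed proportion, used for the interval.
    se_hat = math.sqrt(p_hat * (1 - p_hat) / n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    interval = (p_hat - z_crit * se_hat, p_hat + z_crit * se_hat)
    return p_hat, z, p_value, interval


p_hat, z, p_value, (lower, upper) = one_proportion_test(62, 119, p0=0.5)
print(f"sample proportion = {p_hat:.4f}")
print(f"z = {z:.4f}, two-tailed p-value = {p_value:.4f}")
print(f"95% interval: ({lower:.4f}, {upper:.4f})")
```

Because the interval contains 0.50 and the p-value is well above 0.05, this sample gives no evidence that the proportion of women differs from one half.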

Hypothesis Test and Confidence Interval for the Difference between Proportions

1. Select XLSTAT>Test a hypothesis>Parametric tests>Tests for two proportions.
2. Follow step 2 of Hypothesis Test and Confidence Interval for a Single Proportion.
3. Select the Options tab. Follow step 3 of Hypothesis Test and Confidence Interval for a Single Proportion. The two variance options represent an unpooled (unequal variance) approach and a pooled (equal variance) approach, respectively. Review the differences and select the option most appropriate for your data. If you are unsure, select the more conservative unpooled approach, which is the default. Select OK.

Hypothesis Test and Confidence Interval for One Sample Mean

1. Select XLSTAT>Test a hypothesis>Parametric tests>One sample t-test and z-test.
2. Place your cursor inside the Data: box. Click inside the first row of the variable to be analyzed and highlight the full range of the variable.
3. Select the Options tab. Follow step 2 of Hypothesis Test and Confidence Interval for a Single Proportion.
5. The Missing data tab provides options for what to do with missing data. If there are any missing values, select the second option, Remove the observations. The Outputs tab will default to the required statistics. Select OK.

Hypothesis Test and Confidence Interval for Mean of Paired Differences

1. Select XLSTAT>Test a hypothesis>Parametric tests>Two sample t-test and z-test.
2. Place your cursor in the Sample 1: box. Click inside the first row of the first of the paired variables to be analyzed and highlight the full range of the variable. Place your cursor in the Sample 2: box and highlight the full range of the second variable.
3. Under the Data format: options, indicate that the data are Paired samples.
4. Select the Options tab.
5. Follow step 3 and step 5 of Hypothesis Test and Confidence Interval for One Sample Mean.
6. The Outputs and Charts tabs will default to the required statistics (the defaults will not produce charts). Select OK.
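A paired analysis is a one-sample analysis of the within-pair differences, which is why the paired procedure above reuses the one-sample steps. The sketch below shows that reduction under stated assumptions: the function name paired_t_statistic and the before and after data are hypothetical, and only the t statistic and degrees of freedom are computed, because the Python standard library has no Student's t distribution from which to obtain a p-value.

```python
import math
import statistics


def paired_t_statistic(sample_1, sample_2, hypothesized_difference=0.0):
    """t statistic and degrees of freedom for the mean of paired differences."""
    if len(sample_1) != len(sample_2):
        raise ValueError("paired samples must have the same number of rows")
    # Each difference pairs the values found in the same row of the two columns.
    differences = [b - a for a, b in zip(sample_1, sample_2)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    se = statistics.stdev(differences) / math.sqrt(n)
    t = (mean_diff - hypothesized_difference) / se
    return t, n - 1


# Hypothetical data, for illustration only.
before = [10.0, 12.0, 9.0, 11.0, 13.0]
after = [12.0, 14.0, 10.0, 13.0, 14.0]

t, df = paired_t_statistic(before, after)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The resulting t and degrees of freedom can be compared with the values in the XLSTAT output for the same two columns.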

Hypothesis Tests and Confidence Interval for Difference of Means in Two Independent Samples

1. Select XLSTAT>Test a hypothesis>Parametric tests>Two sample t-test and z-test.
2. Prior to selecting the data for analysis, you must identify its Data format. For a hypothesis test of two independent samples, the data could exist in one of two formats. Data would reflect the first format, one column per sample, if the two samples were in different columns (e.g., Female Heights and Male Heights). Data would reflect the second format, one column per variable, if all of the quantitative data for both samples are in a single column (e.g., Height) and the sample identifiers or categories exist in a separate column (e.g., Gender).
3. Place your cursor in the Sample 1: box. Click inside the first row of the first variable to be analyzed and highlight the full range of the variable. Place your cursor in the Sample 2: box and highlight the full range of the second variable.
4. Follow step 3 of Hypothesis Test and Confidence Interval for a Single Proportion.
5. The Missing data tab provides options for what to do with missing data. Review and select the most appropriate option. The Outputs and Charts tabs will default to the required statistics (the defaults will not produce charts). Select OK.

Finding the Area Under the Normal Curve and Inverse Normality

1. Select the quantitative variable to be analyzed. At the bottom of that variable column, generate the mean and the standard deviation. To do so, enter the formulas =AVERAGE(A2:A41) and =STDEV(A2:A41), where A2 through A41 is the range of the data.
2. Insert a new blank column next to the variable of interest. To do this, click on the top of the column where you want to insert a new column. Select Home>Insert.
3. To find the cumulative area under the normal curve associated with each value of a variable, place your cursor in the second row of the new column. Note that the first cell should be used to name the column. Click the fx button. In the Search for a Function box, type Normal Distribution. From the Select a Function list, click on NORM.DIST.
4. In the X box, click on the first value in your variable of interest. In the Mean box, click on the cell where you calculated the AVERAGE. Type a $ in front of the letter and in front of the number of the cell reference, so that the reference stays fixed on the cell where the AVERAGE was calculated. In the Standard_dev box, click on the cell where you calculated the STDEV. Again, type a $ in front of the letter and the number of the cell reference. In the Cumulative box, type TRUE and click OK.
5. The resulting value will be the cumulative probability (from negative infinity) associated with the value in that row of the variable of interest, assuming a normal distribution. Copy this function to the bottom of the column.
6. To determine the inverse (the value of interest based upon a normal probability of occurrence), create a new column to the right of the data of interest. Click the fx button. In the Search for a Function box, type Normal Distribution. From the Select a Function list, click on NORM.INV.
7. In the Probability box, enter the cumulative probability from the normal curve in which you are interested. Repeat step 4 to enter the necessary values into the Mean and Standard_dev boxes. Click OK.
8. The resulting value will be the observation associated with the cumulative probability indicated, assuming a normal distribution.

Generating Random Numbers

1. Create a new column on the right of your dataset titled RANDOM.
2. Inside the first open cell (row 2), type the following function: =RAND().
3. After you press Enter, a random number following a uniform distribution between 0 and 1 will be generated. Once your random number has been generated, click Copy on the Home tab. Highlight the remainder of the column to the end of the data and click Paste.

You may have noticed that the first value in row 2 changed. RAND is a volatile function in Excel, meaning that the result will change whenever a change is made in the spreadsheet. To resolve this, simply highlight the entire RANDOM column, select Copy, and then under the Paste options, select Paste Values.
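The worksheet calculations in the last two procedures have close counterparts in the Python standard library, which can be used to double-check a few cells. In the sketch below the data values are hypothetical; NormalDist.cdf plays the role of NORM.DIST with Cumulative set to TRUE, NormalDist.inv_cdf plays the role of NORM.INV, and a seeded generator stands in for the Paste Values step by making the random column reproducible.

```python
import random
from statistics import NormalDist, mean, stdev

# Hypothetical data standing in for the worksheet range A2:A41.
data = [52.0, 55.5, 58.0, 60.5, 61.0, 63.5, 65.0, 67.5, 70.0, 74.0]

# Same quantities as =AVERAGE(...) and =STDEV(...); STDEV is the sample standard deviation.
fitted = NormalDist(mean(data), stdev(data))

# Cumulative area to the left of each value, one result per row.
cumulative = [fitted.cdf(x) for x in data]

# Inverse lookup: the value below which 90% of the fitted normal curve lies.
value_at_90 = fitted.inv_cdf(0.90)

# Uniform random numbers on [0, 1); the fixed seed keeps them from changing between runs.
rng = random.Random(42)
random_column = [rng.random() for _ in data]

for x, c, u in zip(data, cumulative, random_column):
    print(f"{x:6.1f}  cumulative = {c:.4f}  random = {u:.4f}")
print(f"90th percentile of the fitted normal: {value_at_90:.2f}")
```

Fixing the seed is a design choice for reproducibility; dropping the seed argument gives a fresh set of numbers on each run, much like an unfrozen RAND column.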