Nuts and Bolts Research Methods Symposium

Organizing Your Data
Jenny Holcombe, PhD
UT College of Medicine
Nuts & Bolts Conference, August 16, 2013

Topics to Discuss:
- Types of Variables
- Constructing a Variable Code Book
- Developing Excel Spreadsheets
- Data Entry
- Descriptive vs. Inferential Statistics
- Parametric vs. Nonparametric Statistics

Variables
- A characteristic or condition that changes or has different values for different individuals
- Anything that can be measured

TYPES OF VARIABLES

Qualitative Variables
- Differ in kind rather than amount
- Differ in quality, not quantity or magnitude
- Also referred to as categorical or nominal
- Examples: favorite color, treatment group, gender, race

Quantitative Variables
- Assigned number values that represent differing quantities of the characteristic
- Examples: medication dosage, # of doctor visits, annual income
- Quantitative data can be either:
  - Discrete: a finite or countable set of distinct values (e.g., # of doctor visits last year)
  - Continuous: an infinite continuum of possible real number values (e.g., # of minutes it takes to finish a book)

Quantitative Variables (continued)
Three types of quantitative variables:
- Ordinal: categorical scales that have a natural ordering of values (e.g., SES class: low, middle, high)
- Interval: distances between adjacent scores are equal and consistent throughout the scale, with no absolute zero point (e.g., IQ scores, temperature in °C or °F)
- Ratio: same as interval, but with a true zero point (e.g., length, distance, time)

Variables: Final Points
- It is possible to measure data on more than one scale
- Variables should always be measured on the highest scale possible: Ratio > Interval > Ordinal > Nominal

NAMING VARIABLES

- The first row should contain the variable names; this makes transfer to other programs (e.g., SPSS, SAS) easier
- Variable names can be up to 32 characters in length, but anything more than 8-12 becomes very cumbersome to manage
- Each variable name must be unique; duplication is not allowed, and names are not case sensitive
- Variable names should begin with a letter
- Avoid periods, #, @, and $; use underscores only within the variable name (not at the beginning or end)
- No spaces are allowed in variable names
- Use meaningful names for variables
  - Makes variables more self-explanatory
  - Some exceptions apply: balance length against meaning

Acceptable Names          Unacceptable Names
Q1; Q_1                   Q 1; 1Q; Q-1
Question1; Question_1     Question 1; Question-1
Q1_food                   Q1 food; Q1-food
Food                      _Food_
DRS1; DRS_1               DiabetesRiskScale1

The main thing is to be consistent when naming variables.
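The naming rules above can be checked mechanically before a spreadsheet is handed to an analyst. The following is a minimal sketch in Python, not part of the original slides: the function names `check_name` and `duplicate_names` are hypothetical, and the 12-character "soft" limit is an assumption taken from the upper end of the 8-12 guideline.

```python
import re
from collections import Counter

# A letter first, then only letters, digits, or underscores.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def check_name(name, hard_limit=32, soft_limit=12):
    """Return a list of problems with one variable name (empty list = acceptable)."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("must start with a letter and contain only letters, digits, underscores")
    if name.endswith("_"):
        problems.append("must not end with an underscore")
    if len(name) > hard_limit:
        problems.append(f"longer than {hard_limit} characters")
    elif len(name) > soft_limit:
        # Assumed threshold: the slides call anything beyond 8-12 cumbersome.
        problems.append(f"longer than {soft_limit} characters (cumbersome)")
    return problems


def duplicate_names(names):
    """Return names that occur more than once, ignoring case."""
    counts = Counter(n.lower() for n in names)
    return sorted(name for name, count in counts.items() if count > 1)


for candidate in ["Q1_food", "Q1 food", "_Food_", "DiabetesRiskScale1"]:
    print(candidate, check_name(candidate))
```

Names are compared in lower case because the slides state that names are not case sensitive, so `Age` and `age` would collide.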

http://www.ciser.cornell.edu/images/excel2sasa.gif
What is wrong with this file?

CONSTRUCTING A VARIABLE CODE BOOK

Variable Code Books
Purpose:
- To create a data entry system
- To assist with data entry
- For statistical analysis
- When archiving data files for follow-up

Code Book Construction
Elements to include:
- Variable Name
- Variable Label: describe the variable and/or include the question of interest
- Value Labels: give a label for each possible numeric value of the variable

Example:
  Variable Name:   Age
  Variable Label:  Age of participant at time of survey
  Value Labels:    1 = 20-29, 2 = 30-39, 3 = 40-49, 4 = 50-59

Code Book Construction (continued)
- Word or Excel format is acceptable
- A columned list or table is acceptable
- All variables should be included with appropriate labeling information
- Variable labels can be any length, but keeping them to 256 characters or fewer is recommended
- Variable labels can contain spaces and characters that are not allowed in variable names

Code Book Examples
- Polit Data Files
- Swedish Institute for Social Research
- ACHA NCHA II
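A code book can also be kept in machine-readable form, so that the same labels drive both data entry checks and analysis output. The sketch below is one possible layout in Python, using the Age example from the slide. The dictionary structure and the helper names `decode` and `undefined_codes` are assumptions for illustration, not a format the slides prescribe.

```python
# One entry per variable: a descriptive label plus a label for each numeric code.
CODE_BOOK = {
    "Age": {
        "label": "Age of participant at time of survey",
        "values": {1: "20-29", 2: "30-39", 3: "40-49", 4: "50-59"},
    },
}


def decode(variable, code, code_book=CODE_BOOK):
    """Translate a numeric code into its value label."""
    return code_book[variable]["values"].get(code, "undefined code")


def undefined_codes(variable, codes, code_book=CODE_BOOK):
    """Return entered codes that the code book does not define."""
    valid = code_book[variable]["values"]
    return sorted({c for c in codes if c not in valid})


entered = [1, 2, 5, 4, 5]  # made-up data entry; 5 is not a defined code
print([decode("Age", c) for c in entered])
print(undefined_codes("Age", entered))
```

Running `undefined_codes` after each data entry session is one way to catch codes that were typed in but never added to the code book.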

Code Book: Final Points
- Be consistent in your coding!
- Update the code book as you enter your data: if you make a change while entering data, make sure you update your code book as well
- Check and double-check your code book; it acts as a form of communication between you and your data analyst

DEVELOPING EXCEL SPREADSHEETS

Excel Basics
- Each individual row of data is known as a record, an observation, or a case
  - Do not leave any blank rows
  - Information about an item cannot appear in more than one row
- Each column is a field, labeled to identify the data it contains
  - All data in each column should be formatted the same way
  - Do not leave blank columns in the table

Excel Basics (continued)
- Once a database is created, you can use Excel tools to manage the data
  - Sorting Data
  - Filtering Data

DATA ENTRY

Missing Values
- Should be entered consistently: use 9 or 99 or 999
- The value should be something that cannot represent a real numeric value for the variable in question
- Excel will treat these missing-value codes as real values, so be careful if you are using Excel for analysis
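To see why missing-value codes need care, consider what happens when a code such as 99 is averaged as if it were a real observation. The following Python sketch uses made-up data, and the helper name `mean_ignoring_missing` is hypothetical. The code 99 is chosen here because 9 could be a genuine number of doctor visits.

```python
from statistics import mean


def mean_ignoring_missing(values, missing_codes):
    """Average only the values that are not missing-value codes."""
    observed = [v for v in values if v not in missing_codes]
    if not observed:
        raise ValueError("no observed values to average")
    return mean(observed)


doctor_visits = [3, 5, 99, 4, 99]  # made-up data; 99 marks a missing answer

naive_mean = mean(doctor_visits)  # wrongly treats 99 as a real count
clean_mean = mean_ignoring_missing(doctor_visits, missing_codes={99})

print(naive_mean)  # 42
print(clean_mean)  # 4
```

Two missing answers out of five are enough to move the average from 4 visits to 42, which is the kind of distortion the warning about Excel refers to.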

Additional Points
- Ensure rows below the data are not activated, so they are not mistaken for additional cases/observations during transfer
- Numeric values are always best for data entry, regardless of the type of variable (quantitative vs. qualitative)
- Values/labels can always be assigned in a code book or data analysis program

DESCRIPTIVE VS. INFERENTIAL STATISTICS

Descriptive Statistics
- Used to summarize, organize, and simplify data for better understanding
- Means, standard deviations, percents, frequencies, proportions, etc.

Inferential Statistics
- Statistical procedures that allow researchers to study samples and then make generalizations about the population from which they were selected
- Allow the researcher to draw conclusions

PARAMETRIC VS. NONPARAMETRIC STATISTICS

Parametric Statistics
- A class of inferential statistical tests that involves (a) assumptions about the distribution of the variables, (b) the estimation of a parameter, and usually (c) the use of interval or ratio measures
- Statistical tests designed to be used when data have certain characteristics: when they approximate a normal distribution and are measured with interval or ratio scales

Bivariate
- One-sample t test
- Two-sample t test
- Analysis of variance (ANOVA)
- Repeated measures ANOVA
- Pearson's product moment correlation (r)

Multivariate
- Multiple correlation/regression
- ANCOVA
- MANOVA
- MANCOVA
- Mixed design RM-ANOVA
- Canonical analysis
- Discriminant analysis
- Logistic regression
- Factor analysis
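The difference between describing a sample and making an inference from it can be shown with a few lines of Python. This is a minimal sketch with made-up scores. The function names `describe` and `one_sample_t` are hypothetical, and the one-sample t statistic is chosen here only as a simple parametric example. It assumes interval or ratio data that are roughly normal, and it does not compute a p-value.

```python
from math import sqrt
from statistics import mean, stdev


def describe(sample):
    """Descriptive step: summarize the sample itself."""
    return {"n": len(sample), "mean": mean(sample), "sd": stdev(sample)}


def one_sample_t(sample, hypothesized_mean):
    """Inferential step: t = (sample mean - hypothesized mean) / standard error."""
    summary = describe(sample)
    standard_error = summary["sd"] / sqrt(summary["n"])
    return (summary["mean"] - hypothesized_mean) / standard_error


scores = [4, 5, 6, 7, 8]  # made-up interval-scale scores
summary = describe(scores)
t_statistic = one_sample_t(scores, hypothesized_mean=5)

print(summary)
print(round(t_statistic, 3))  # 1.414
```

The summary describes only these five scores. The t statistic asks whether a population mean of 5 is plausible given them, and it would be compared against a t distribution with n - 1 degrees of freedom.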

Nonparametric Statistics
- A general class of inferential statistical tests that does not involve rigorous assumptions about the distribution of the variables; most often used with small samples, when data are measured on nominal or ordinal scales, or when a distribution is severely skewed
- Statistical tests designed to be used when the data being analyzed depart from the distributions that can be analyzed with parametric statistics

Nonparametric tests:
- Chi-square goodness-of-fit test
- Chi-square test of independence
- Fisher's exact test
- McNemar test
- Cochran's Q test
- Mann-Whitney U test
- Kruskal-Wallis test
- Wilcoxon signed ranks test
- Friedman test
- Spearman's rank order correlation
- Kendall's tau
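As an illustration of how a nonparametric test works with order rather than with means and standard deviations, the Mann-Whitney U statistic can be computed by comparing every value in one group with every value in the other. The Python sketch below uses made-up ratings and a hypothetical function name. It returns only the U statistics and does not compute a p-value or apply any correction.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b): pairwise wins for each group, with ties counted as half."""
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # Every pair is a win for one side or a shared tie, so the two sum to n_a * n_b.
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


treatment = [3, 5, 8]  # made-up ordinal ratings
control = [1, 4, 5]

u_treatment, u_control = mann_whitney_u(treatment, control)
print(u_treatment, u_control)  # 6.5 2.5
```

Only the ordering of the values matters: replacing the 8 with 800 leaves both U statistics unchanged, which is why the test suits ordinal or severely skewed data. The smaller of the two values is conventionally reported as U.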