TYPES OF VARIABLES, STRUCTURE OF DATASETS, AND BASIC STATA LAYOUT

PRIMER FOR ACS OUTCOMES RESEARCH COURSE: TYPES OF VARIABLES, STRUCTURE OF DATASETS, AND BASIC STATA LAYOUT

STEP 1: Install STATA statistical software.
STEP 2: Read through this primer and complete the exercises prior to the course.

We will be using STATA statistical software for the biostatistics laboratory component of the ACS outcomes research course. We would like you to be familiar with a few key concepts covered in this primer to maximize your learning during the course.

Objectives:

To understand the following:
1) Different types of variables
2) The basic structure of a dataset

To become familiar with the following:
1) The different windows in STATA
2) How to create a small dataset
3) Two basic STATA commands to analyze data

DIFFERENT TYPES OF VARIABLES:

The table below reviews the most common types of variables you will encounter. It is important to understand the types of variables in your dataset because this guides the choice of a statistical test.

Variable Type   Description                         Examples                        Possible Values
--------------  ----------------------------------  ------------------------------  ---------------------------
Dichotomous     Can take on only two values         Death after an operation        1 (death), 0 (alive)
                (usually "Yes" or "No")
Categorical     Can take on more than two values
                (but still confined to a limited
                range)
  Ordinal       The order of the categories has     Acuity of hospital admission    Elective, urgent, emergent
                some inherent meaning
  Nominal       The order of the categories has     The payer for a hospital stay   Medicare, Medicaid, Private
                no meaning
Continuous      Can take on any numeric value       Patient age                     18, 35, 66, 75
                (integers, or fractions if
                appropriate)

It is important to note two things about ordinal variables that distinguish them from continuous variables: 1) they are still confined to a limited number of categories; and 2) the distance between categories is not meaningful.

THE STRUCTURE OF A DATASET:

Most of you have used a dataset at some point. However, the structure can vary depending on the program or purpose of the dataset. We will introduce you to the basic vocabulary necessary to describe a dataset for the purposes of using STATA.

Picture a blank sheet of paper with horizontal and vertical lines intersecting each other and you will have the basic structure needed to store data.

The vertical lines divide the sheet into columns. Each column represents a different variable. For example, Patient ID number, age, gender, or race:

    Patient ID   Age   Gender   Race

And the horizontal lines divide the sheet into rows. Each row represents what is called an observation (for example, a patient who has surgery):

    Patient 1    <- first observation
    Patient 2
    Patient 3
    Patient 4

Most datasets have the variable names in the first row (or just over the first row as a label). Each additional row (observation) then includes the values of each variable for each patient.

    Patient ID   Age   Gender   Race               <- variable names
    Patient 1    55    Male     White              <- variable values
    Patient 2    44    Female   Black
    Patient 3    63    Male     Native American

As you will learn, STATA has this same data storage structure and it is maintained behind the scenes in the Data Editor.
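The same rows-and-columns structure can be seen directly in STATA. The following is a minimal sketch that types in the three example patients from the table above. The variable names (patient_id, age, gender, race) and the string widths are our own illustrative choices, not names used elsewhere in this primer.

```stata
* Start with an empty dataset
clear

* Declare four variables (columns); str# marks a text variable of that width
input str10 patient_id age str6 gender str15 race
"Patient 1" 55 "Male"   "White"
"Patient 2" 44 "Female" "Black"
"Patient 3" 63 "Male"   "Native American"
end

* Show each observation (row) with its values
list

* Show each variable (column) with its storage type
describe
```

Each line between input and end becomes one observation, and each name on the input line becomes one variable.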

GETTING FAMILIAR WITH STATA LAYOUT:

When you first open the STATA software, it should look like Figure 1. There are four basic windows that are described in the Figure. If your STATA does not look like the figure, go to the menu and click Prefs (for preferences) and scroll down to Default Windowing.

Behind these four windows, there are several other hidden areas of STATA. For instance, click on the Data Editor icon on the toolbar (See Figure 1). This will open a behind-the-scenes spreadsheet where the data is stored (See Figure 2). We will build a small dataset by typing directly into the Data Editor later in this primer.

It is worth noting that commands are usually typed directly into the Command Window, but STATA also has drop-down menus for most commands.

Figure 1. The basic layout of STATA.

    Data Editor icon:     Click this icon to open the behind-the-scenes spreadsheet.
    Do-file Editor icon:  Click this icon and the do-file window will open. Here you can write numerous commands, run them, and save them.
    Variables Window:     Variables are listed here. Tip: Click on a variable and it will appear in the Command Window.
    Review Window:        Previous commands appear here. Tip: Click on a command and it will reappear in the Command Window.
    Command Window:       Type commands here. Hit return and they will run.
    Results Window:       Results of your analyses will appear here.
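The two toolbar icons described in Figure 1 also have typed equivalents. The sketch below lists standard STATA commands that open the same windows from the Command Window. It is offered only as a convenience, and exactly how the windows appear may vary with your version and operating system.

```stata
* Open the Data Editor in edit mode (same as clicking the Data Editor icon)
edit

* Open the data in read-only browse mode, which avoids accidental changes
browse

* Open the Do-file Editor, where commands can be written, run, and saved
doedit

* Open the built-in help for a command, here summarize
help summarize
```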

CREATING A SMALL STATA DATASET:

To become more familiar with STATA, we will create a small dataset with information on five of your friends. You will have four variables: first name, last name, age, and whether they voted for Obama or Romney in the 2012 presidential election. To create this dataset, you will type the data directly into the STATA Data Editor.

Open the Data Editor and click in the upper-left box. Start by typing the first name of one of your friends. Hit return and the variable name, var1, will appear above the column. Now click on var1 and you can enter the variable name in the variable window in the bottom right-hand corner. Type first_name as the name of the first variable.

Continue by adding your friend's last name in the next column. (Remember to hit return after typing the name.) Then add the variable name, last_name, just as you did above. Next add the other two variables, age and vote (whether they voted for Obama or Romney in the election). Then you can start on the next observation.

Repeat the process until you've entered data for all 5 of your friends. Your finished dataset should look something like the following: [Screenshot: the completed dataset in the Data Editor]

Click on the X in the upper right-hand corner of the Data Editor window and this will close the Data Editor. (But be careful not to close the whole program by clicking the X outside the Data Editor.) Now your STATA should appear like Figure 1, except the variables listed should be the ones you just created.

In this dataset, each of your friends represents a row or observation and each variable has its own column. Also note that age is a continuous variable and the vote is a dichotomous variable. But what type of variables are first_name and last_name? These variables don't fall into our previous classification because they're not numbers. STATA refers to variables that hold text or words instead of numbers as string variables.
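The same kind of dataset can also be created by typing commands instead of using the Data Editor. The following is a minimal sketch under stated assumptions: the five names are placeholders, and the ages and votes are hypothetical values chosen to be consistent with the summary output shown later in this primer (5 observations, mean age 41.8, three obama and two romney). They are not the original author's data.

```stata
* Start with an empty dataset
clear

* Two string variables, one numeric variable, one string variable
* (all names and values below are hypothetical placeholders)
input str10 first_name str10 last_name age str6 vote
"Ann"   "Lee"    33 "obama"
"Ben"   "Ortiz"  35 "romney"
"Cara"  "Patel"  42 "obama"
"Dev"   "Quinn"  44 "romney"
"Elle"  "Reyes"  55 "obama"
end

* Check the result: one row per friend, one column per variable
list

* Confirm which variables are strings (str#) and which are numeric
describe
```

In the describe output, first_name, last_name, and vote should show a str storage type, which is how STATA marks string variables.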

STARTING TO ANALYZE YOUR DATASET:

Now we will try a few tricks using your new dataset. Type the following text in your Command Window. (In the following text, the box around the commands just lets you know this is STATA syntax; this convention will also be used during the laboratory sessions.)

Command: summarize age

The output should appear as follows:

        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
             age |         5        41.8    8.700575         33         55

There is a lot of interesting information here. It tells us there are 5 observations and the mean age is 41.8 years with a standard deviation of almost 9 years. The range is also given as 33 years to 55 years.

Now we will examine the 2012 vote variable. Because this is a dichotomous variable, we will create a table and see who your friends voted for.

Command: tab vote

STATA output:

           vote |      Freq.     Percent        Cum.
    ------------+-----------------------------------
          obama |          3       60.00       60.00
         romney |          2       40.00      100.00
    ------------+-----------------------------------
          Total |          5      100.00

Remember that you have entered this dichotomous variable as text (a string variable). It would be better to enter these as 0 and 1 and then label them, which you will learn to do in the lab.

This concludes the primer. We look forward to seeing you at the course!
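As a preview of the 0/1 coding mentioned above, here is a minimal sketch of one way to convert the text vote into a labeled numeric variable. The ages and votes are hypothetical values consistent with the output above, and the names vote_obama and votelbl are our own. The lab may teach a different approach.

```stata
* Recreate a small version of the dataset (hypothetical values)
clear
input age str6 vote
33 "obama"
35 "romney"
42 "obama"
44 "romney"
55 "obama"
end

* The two commands from this primer (tab is short for tabulate)
summarize age
tabulate vote

* New numeric variable: 1 if obama, 0 if romney, missing if vote is empty
generate byte vote_obama = (vote == "obama") if vote != ""

* Attach readable labels to the 0 and 1 codes
label define votelbl 0 "romney" 1 "obama"
label values vote_obama votelbl

* The table now shows the labels, but the stored values are 0 and 1
tabulate vote_obama
```

Storing the variable as 0 and 1 keeps it numeric, so it can be used in calculations, while the value labels keep the tables readable.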