International Graduate School of Genetic and Molecular Epidemiology (GAME) Computing Notes and Introduction to Stata

Paul Dickman
September

Based, in part, on notes written by David Clayton and Michael Hills.

1 A brief introduction to Stata

Starting the Stata program

We assume you have a computer in front of you and that you know how to open an application in Windows. To start Stata:

1. Click on Start
2. Click on Programs
3. Click on Stata
4. Select Intercooled Stata 8

It is also possible to start Stata by double-clicking on the Stata icon on your desktop, or by double-clicking on a Stata data set. If everything has gone smoothly so far, you should be able to see the Stata default windows:

1. Review
2. Variables
3. Results
4. Command
5. Stata Toolbar

Tutorials

The command tutorial starts a somewhat interactive series of tutorials for learning some of the Stata facilities in specific areas. Further details are given in the section The Stata tutorials at the end of this handout. To run the tutorial intro, type

. tutorial intro

Useful Stata links

Resources for learning Stata can be found at [URL not preserved in this transcription].

Getting help

Click on Help, or type help followed by a command name, for on-line help. Try also whelp.

Closing the program

Choose Exit from the File menu, click the Windows close box (the x in the top right corner), or type exit at the command line. You will have to type clear first if you have any data in memory (or simply type exit, clear).

Note that Stata is case sensitive. To interrupt a Stata command, click on Break or press Ctrl-Break.

Types of Stata files

Data files in Stata format are given the extension .dta. These are created using save filename and read in with use filename. There are five other types of file: .raw for raw data, .dct for data plus variable names, .do for batch files containing Stata commands, .ado for Stata programs, and .log for log files.

Syntax

command varnames if ... in ... using ..., options

The if part restricts the command to records satisfying certain logical conditions (e.g. sex==1), the in part restricts the command to certain line numbers, and the using part specifies any files which may be needed.

Abbreviations

Stata accepts unambiguous abbreviations for commands and variable names.
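To make the general syntax concrete, here is a minimal sketch that applies it to the auto dataset that ships with Stata. The dataset and its variables (price, mpg) are not part of this handout; they are used only because they are available without downloading anything. The lines are meant to be run from a do-file or typed one at a time without the comments.

```stata
* Illustration only: auto.dta ships with Stata and is not the IVF data
sysuse auto, clear

* General form: command varnames if ... in ..., options
* Here: summarize two variables, only for records with mpg above 20,
* only among the first 50 records, with the detail option
summarize price mpg if mpg > 20 in 1/50, detail

* su is an unambiguous abbreviation of summarize
su price mpg
```

The if and in parts are both optional and can be combined, and everything after the comma is an option of the command.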

2 A hands-on introduction to Stata

To introduce you to Stata we use the IVF data, which consist of 641 records on mothers who had singleton births following in-vitro fertilisation. The variables in the dataset are shown in Table 1.

Variable          Units or coding              Type         Name
Subject number                                 categorical  id
Maternal age      years                        metric       matage
Hypertension      1=hypertensive, 0=normal     binary       hyp
Gestational age   weeks                        metric       gestwks
Sex of infant     1=male, 2=female             binary       sex
Birthweight       grams                        metric       bweight

Table 1: Variables in the IVF dataset

Type in the commands which start with the Stata prompt (. ). Do not type the . prompt itself; it is used only to indicate a Stata command. Stata distinguishes between upper and lower case letters, and accepts abbreviations for both commands and variable names. Think carefully about what is happening after each command.

The file ivf.dta contains the variable names and values for the 641 records and can be accessed over the world wide web from within Stata. To read the data, type

. use [URL of ivf.dta not preserved in this transcription]
. describe

Now type the following.

. Describe

Stata will return an error message (unrecognised command: Describe). Stata is case sensitive; describe is a valid Stata command, whereas Describe is not.

A good way to start the analysis is to ask for a summary of the data by typing

. summarize

This will produce the mean, standard deviation, and range for each variable in turn. In most datasets there will be some missing values. These are coded using the symbol . in place of the value which is missing. Stata can recognize other codes for missing values, but this is the one which is recommended. The summarize command is useful for seeing whether there are missing values (the column labelled Obs gives the number of non-missing observations).

For a more detailed summary of the variable gestwks try

. codebook gestwks

or

. summarize gestwks, detail

Many Stata commands can be accessed using menus. For example, from the Summaries menu, select Median/Percentiles. You will notice that the result is identical to that obtained from the command typed previously (summarize gestwks, detail) and that Stata even shows the command which was used.

The list command is used to list the values in the data file. Try out the following and see their consequences:

. list in 1/5
. list matage in 1/10
. list matage
. list matage bweight in 1/20

Stata stops after each screenful of output. Click on more (or hit the spacebar) to get another screenful, or press enter to continue line by line. The command list on its own would list all of the data. You can cancel this command (and any other Stata command) by clicking on Break (the icon in the toolbar which looks like a red circle with a white cross through it).

Stata also contains a spreadsheet-style editor which can be brought to the front by typing

. edit

Close this window by clicking in the close box (in the top right corner of the window). The browse command will bring up a similar window, except that changes cannot be made to the data. The data window can also be opened using icons on the toolbar (the two icons look like spreadsheets, with a magnifying glass over the data browser icon) or from the Data menu.

When starting to look at any new data the first step is to check that the values of the variables make sense and correspond to the codes defined in the coding schedule. For categorical variables this can be done by looking at one-way frequency tables and checking that only the specified codes occur. For metric variables we need to look at ranges. This first look at the data will also indicate whether all values are present or whether there are missing values on some variables.

Let us begin by looking at the categorical variables. The distribution of the categorical variables hyp and sex can be viewed by typing

. tabulate hyp
. tab sex

To treat missing values as a separate category, the missing option can be used.

. tabulate hyp, missing

Note that tab is an abbreviation for tabulate. The cross-tabulation of hyp and sex is obtained by typing

. tab hyp sex

Cross tabulations are useful when checking for consistency. The basic output from a cross tabulation reports frequencies only; to include row and/or column percentages add the options row, col, cell, or any combination, as in

. tab hyp sex, col missing

The command table is used for preparing tables of summary statistics by one, two, or even more categorical variables. For example, to obtain the means and standard deviations of bweight separately by sex, type

. table sex, contents(freq mean bweight sd bweight)

To make a table of the median and interquartile range for birthweight, by sex, try

. table sex, contents(freq med bweight iqr bweight)

Note that tab is an abbreviation for tabulate, NOT for table, which must be typed in full. You can type whelp tabulate and whelp table to see how, if at all, each command can be abbreviated.

3 Restricting commands

Stata commands can be restricted to records 1, 2, ..., 10 (for example) by adding in 1/10 to the command. The letters f and l can be used as abbreviations for first and last, so 20/l refers to the records from 20 onwards.

Commands can also be restricted to operate only on records which satisfy given conditions. The conditions are added to the command using if followed by a logical expression which takes the values true or false. For example, to restrict the command list to records with birthweight less than or equal to 2000g, type

. list id bweight if bweight <= 2000

The record is listed only if the logical expression bweight <= 2000 is true. A useful command when exploring data is count, which counts the number of records which satisfy some logical expression. For example

. count if bweight <= 2000
. count if bweight <= 2000 & sex==1
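The in and if restrictions can be tried out on a few made-up records before applying them to real data. The sketch below enters five invented records with the same variable names as the IVF data (the values are hypothetical, not taken from ivf.dta) and is meant to be run as a do-file.

```stata
* Toy data with invented values, NOT the IVF dataset
clear
input id sex bweight
1 1 1800
2 2 3100
3 2 2000
4 1 2500
5 1 3400
end

* Records 1 to 3
list in 1/3

* Record 4 to the last record (l stands for last)
list id bweight in 4/l

* Only records satisfying the condition: ids 1 and 3
list id bweight if bweight <= 2000

* Two records weigh 2000g or less
count if bweight <= 2000

* Only one of them is male (sex==1)
count if bweight <= 2000 & sex == 1
```

Because the toy data are small enough to check by eye, it is easy to confirm that each restriction selects the records you expect.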

Note the use of & to link two conditions, both of which must be satisfied, and that a double equal sign (==) is used for equality testing. A common error is to use = in a logical expression instead of ==. The following arithmetic, logical and comparison operators are available:

Arithmetic            Logical      Comparison
+  addition           ~  not       >   greater than
-  subtraction        |  or        <   less than
*  multiplication     &  and       >=  greater than or equal
/  division                        <=  less than or equal
^  power                           ==  equal
                                   ~=  not equal

4 Generating and recoding variables

New variables are generated using the command generate, and variables can be recoded using recode. For example, to create a new variable sex2 which is the same as sex but coded 1 for male and 0 for female, try

. gen sex2=sex
. recode sex2 2=0
. tab sex2

5 Sorting

The records in a dataset can be sorted according to the values of one or more variables. The IVF dataset is currently sorted by id but for some purposes it might be better to have it sorted by bweight. Try

. list id bweight in 1/10
. sort bweight
. list id bweight in 1/10

The records are now in order of bweight, and the id numbers and all other variables have also been sorted in this order. Stata commands which use the option by() usually require the data to be first sorted by the variable in the by() option. The sort is not done automatically because you should always be aware of how your data are sorted.
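As a sketch of an alternative to the gen and recode steps above, a 0/1 variable can also be created in one line from a logical expression, since a true expression evaluates to 1 and a false one to 0. The example also shows a sort followed by the by prefix. The four records are invented for illustration and are not taken from the IVF data.

```stata
* Toy data with invented values, NOT the IVF dataset
clear
input id sex bweight
1 1 3400
2 2 1800
3 1 2500
4 2 3100
end

* (sex == 1) is 1 when true and 0 when false; the if qualifier
* leaves sex2 missing wherever sex itself is missing
generate sex2 = (sex == 1) if sex < .

* Check the new coding against the original variable
tabulate sex sex2

* The by prefix requires the data to be sorted by that variable
sort sex
by sex: summarize bweight
```

Cross-tabulating the old and new variables is a quick way to confirm that a recode did what was intended.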

6 Editing commands

The PageUp and PageDown keys (represented as arrows on the top right of the keypad) can be used to cycle through previous commands, which can then be edited. For example, if you decide that you would also like to list the values of the variable matage you could use the PageUp key to recall the previous command and then edit it in the command line to be:

. list id bweight matage in 1/10

This capability is especially useful if you make a small mistake while typing a command. The command can be recalled, edited, and resubmitted. It also makes it easy to resubmit the same command with additional options.

7 Using Stata as a calculator

The display command can be used to carry out simple calculations. For example, the command

. display 2+2

will display the answer 4, while

. display log(10)

will display the answer 2.3025851. Note that log means natural log in Stata. To obtain base 10 logarithms use the log10 function. For example,

. display log10(1000)

will return the value 3. Standard probability functions can also be displayed, as in

. display normprob(1.96)

which will return the probability that a random variable with a standard normal distribution (i.e. mean 0 and variance 1) is less than 1.96.

8 Graphical displays

The Stata graphics procedures were completely rewritten for version 8 and are now quite powerful. Following are just a few simple examples. To obtain a histogram of bweight, type the following. It may take a few seconds for the graph to be displayed.

. hist bweight, freq

You can vary the number of rectangles in the histogram (called bins) by adding bin(20), etc. To superimpose on the histogram a normal curve which has the same mean and standard deviation as the data, add the option normal. Try, for example,

. hist bweight, freq bin(20) normal

You can also produce this plot via the Graphics / Easy graphs / Histogram menu. This provides a useful way of exploring the various options for the hist command. Note that you can save time by using PageUp to recall the previous command, to which you can then add the additional options.

We can also produce separate graphs for each level of a categorical variable by using a by() command. Note that we must first sort the data when using a by() command.

. sort hyp
. hist gestwks, by(hyp)

Scatter plots can be used to evaluate the association between, for example, the metric variables bweight and matage by typing

. scatter bweight matage

To plot bweight against gestwks, try

. scatter bweight gestwks

9 Missing values

The missing value symbol in Stata is . and is treated as plus infinity in logical comparisons. Stata commands automatically exclude missing values when they are coded in this way.
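Treating the missing value as plus infinity has a practical consequence for if conditions: a test such as bweight > 3000 is true for records where bweight is missing. The following sketch, using five invented records (not the IVF data), shows the effect and one common way to guard against it.

```stata
* Toy data with invented values, NOT the IVF dataset
clear
input id bweight
1 1800
2 3100
3 .
4 2500
5 3400
end

* Missing counts as larger than any number, so this counts 3 records
* (3100, 3400 and the missing value)
count if bweight > 3000

* Excluding missing values explicitly gives 2
count if bweight > 3000 & bweight < .

* summarize drops the missing value automatically (Obs is 4)
summarize bweight
```

The automatic exclusion applies to what a command analyses; it does not stop a missing value from satisfying a greater-than condition that you write yourself.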

10 Icons on the Stata toolbar

Some file operations are easier to perform by clicking on the appropriate icon on the toolbar rather than typing commands at the command line. The most commonly used icons are the first four from the left:

1. Open a Stata data file
2. Save a Stata data file
3. Print (a graph, or a log file if one is open)
4. Log file operations (Open, Close, Suspend, Resume)

11 Saving data files

The Stata data currently in memory can be saved in a file by clicking on the Save icon (the floppy disk) on the toolbar. You will need to type in a name for your file which, by default, will be saved in the default directory with the extension .dta.

12 Logging and printing results

Graphs can be printed directly by selecting Print graph from the File menu, or you can copy a graph and paste it into a word processor (for instance MS Word). Other output must first be written to a log file before it can be printed. A log file can be opened by clicking on the log icon on the toolbar (the fourth icon from the left). You will need to type in a name for your file which, by default, will be saved in your personal directory with the extension .log.

13 Using the menus

Most Stata commands can be accessed from the menus. Experiment with some of the commands in the Data, Graphics and Statistics menus. For example, select Graphics / Easy Graphs / Scatterplot, then select bweight as the Y axis variable and gestwks as the X axis variable, and click OK. The resulting graph is the same as if you typed the command

. scatter bweight gestwks
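The saving and logging steps described above can also be typed as commands, which is convenient inside a do-file. The sketch below is one possible sequence; the file names mysession and mycopy are hypothetical, and the auto dataset shipped with Stata stands in for your own data.

```stata
* Hypothetical file names; files are written to the current directory
* The text option writes a plain-text log, mysession.log
log using mysession, text replace

* Example data shipped with Stata (not the IVF data)
sysuse auto, clear
summarize price

* Save the data in memory as mycopy.dta, overwriting any earlier copy
save mycopy, replace

* Stop logging and close the log file
log close
```

Everything displayed in the Results window between log using and log close is written to the log file, which can then be printed or opened in a word processor.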

14 Some practice with basic commands

Remember to make use of the help command during these exercises. You are encouraged to explore and use the menus.

1. List the variables bweight and hyp for records [range not preserved in this transcription] inclusive.

2. Obtain the frequency distribution of matage together with its histogram.

3. Obtain the two-way table of frequencies of sex and hyp, first with row, then column, then cell percentages. Is there evidence of an association between the two variables? Do you think it's statistically significant? [Note that you are not expected to perform a formal statistical significance test, just give your impression.]

4. Calculate the mean birthweight for hypertensive and non-hypertensive mothers. Is there evidence of an association? Do you think it's statistically significant? [Note that you are not expected to perform a formal statistical significance test, just give your impression.]

5. The mean birthweight of babies born to hypertensive mothers is considerably lower than the mean birthweight of babies born to non-hypertensive mothers. It turns out that this difference is highly statistically significant (based on a t-test, which you will learn about later in the course). Do you believe that the association is causal (i.e. that hypertension causes babies to be smaller)?

6. It is possible that the association between hypertension and birthweight is confounded by gestational age (gestwks). If so, gestational age should be associated with both the exposure (hypertension) and the outcome (birthweight). Study appropriate tables or graphs to determine whether such associations exist.

7. Imagine we wish to classify babies weighing less than 2500 g as being low birth weight. Create a dichotomous variable, lbw, which takes the value 1 for babies of low birth weight and 0 otherwise.

8. Produce a table showing the proportion of low birth weight babies of each sex.

9. Produce a histogram of birthweights (use at least 20 bins). Does the distribution appear to be symmetric?

10. Now produce histograms of birthweights for each level of hyp. Do the distributions appear to be symmetric?

11. Produce a scatterplot of maternal age against patient ID. Is there evidence of an association between these variables?

12. Formal statistical tests suggest that there is a statistically significant inverse (or negative) association between maternal age and patient ID. How might such an association arise and what are the possible consequences for the analysis of these data?

Some useful commands

A, B are categorical variables. X, Y are metric variables.

Data Management

use                  Read in a data set already in Stata format
infile using         Read in data in a txt file with names
describe (or F3)     Describe contents of data in memory
list                 List values of variables
drop A               Drops the variable called A
drop if ...          Drops all records satisfying ...
generate A =         Creates a new variable called A
replace A =          Replaces contents of A
recode A             Recodes the variable called A
save filename        Save data set in Stata format
sort A               Sort records according to the variable A
count if ...         Count number of observations satisfying ...

Statistics and Graphics

summarize Y          Display summary statistics for Y
tabulate A           One-way table of frequencies for A (categorical)
tabulate A B         Two-way table of frequencies for A and B
table A, c(mean X)   Table of mean X by levels of A
graph Y, hist        Displays histogram of Y
graph Y X, scatter   Displays scatter plot of Y vs X
hist A               Histogram of the categorical variable A
regress Y X          Linear regression of Y on X
predict P            Obtain prediction after regress and put in P

Utilities

clear                Clear data from memory
display 2+2          Display the result of 2+2
do filename          Execute commands from filename.do
exit                 Exit Stata
exit, clear          Clear and exit Stata
help                 Obtain on-line help for both data and commands
log using filename   Write output to filename.log

The Stata tutorials

The official Stata package provides the following tutorials:

Tutorial     Description
intro        An introduction to Stata
graphics     How to make graphs
tables       How to make tables
regress      Estimating regression models, including 2SLS
anova        Estimating one-, two- and N-way ANOVA and ANOCOVA models
logit        Estimating maximum-likelihood logit and probit models
survival     Estimating maximum-likelihood survival models
factor       Estimating factor and principal component models
ourdata      Description of the data we provide
yourdata     How to input your own data into Stata

Useful ones to try are intro, graphics, and yourdata. To run the tutorial intro, type

. tutorial intro


Your Name: Section: INTRODUCTION TO STATISTICAL REASONING Computer Lab #4 Scatterplots and Regression Your Name: Section: 36-201 INTRODUCTION TO STATISTICAL REASONING Computer Lab #4 Scatterplots and Regression Objectives: 1. To learn how to interpret scatterplots. Specifically you will investigate, using

More information

Appendix II: STATA Preliminary

Appendix II: STATA Preliminary Appendix II: STATA Preliminary STATA is a statistical software package that offers a large number of statistical and econometric estimation procedures. With STATA we can easily manage data and apply standard

More information

ECONOMICS 351* -- Stata 10 Tutorial 1. Stata 10 Tutorial 1

ECONOMICS 351* -- Stata 10 Tutorial 1. Stata 10 Tutorial 1 TOPIC: Getting Started with Stata Stata 10 Tutorial 1 DATA: auto1.raw and auto1.txt (two text-format data files) TASKS: Stata 10 Tutorial 1 is intended to introduce (or re-introduce) you to some of the

More information

STAT:5400 Computing in Statistics

STAT:5400 Computing in Statistics STAT:5400 Computing in Statistics Introduction to SAS Lecture 18 Oct 12, 2015 Kate Cowles 374 SH, 335-0727 kate-cowles@uiowaedu SAS SAS is the statistical software package most commonly used in business,

More information

OVERVIEW OF WINDOWS IN STATA

OVERVIEW OF WINDOWS IN STATA OBJECTIVES OF STATA This course is the series of statistical analysis using Stata. It is designed to acquire basic skill on Stata and produce a technical reports in the statistical views. After completion

More information

Applied Regression Modeling: A Business Approach

Applied Regression Modeling: A Business Approach i Applied Regression Modeling: A Business Approach Computer software help: SPSS SPSS (originally Statistical Package for the Social Sciences ) is a commercial statistical software package with an easy-to-use

More information

Microsoft Excel 2007

Microsoft Excel 2007 Microsoft Excel 2007 1 Excel is Microsoft s Spreadsheet program. Spreadsheets are often used as a method of displaying and manipulating groups of data in an effective manner. It was originally created

More information

Using Microsoft Excel

Using Microsoft Excel About Excel Using Microsoft Excel What is a Spreadsheet? Microsoft Excel is a program that s used for creating spreadsheets. So what is a spreadsheet? Before personal computers were common, spreadsheet

More information

Also, for all analyses, two other files are produced upon program completion.

Also, for all analyses, two other files are produced upon program completion. MIXOR for Windows Overview MIXOR is a program that provides estimates for mixed-effects ordinal (and binary) regression models. This model can be used for analysis of clustered or longitudinal (i.e., 2-level)

More information

Chapter 3: Data Description Calculate Mean, Median, Mode, Range, Variation, Standard Deviation, Quartiles, standard scores; construct Boxplots.

Chapter 3: Data Description Calculate Mean, Median, Mode, Range, Variation, Standard Deviation, Quartiles, standard scores; construct Boxplots. MINITAB Guide PREFACE Preface This guide is used as part of the Elementary Statistics class (Course Number 227) offered at Los Angeles Mission College. It is structured to follow the contents of the textbook

More information

Introduction to gretl

Introduction to gretl Introduction to gretl Applied Economics Department of Economics Universidad Carlos III de Madrid Outline 1 What is gretl? 2 gretl Basics 3 Importing Data 4 Saving as gretl File 5 Running a Script 6 First

More information

Statistics with a Hemacytometer

Statistics with a Hemacytometer Statistics with a Hemacytometer Overview This exercise incorporates several different statistical analyses. Data gathered from cell counts with a hemacytometer is used to explore frequency distributions

More information

BIOSTATS 640 Spring 2018 Introduction to R Data Description. 1. Start of Session. a. Preliminaries... b. Install Packages c. Attach Packages...

BIOSTATS 640 Spring 2018 Introduction to R Data Description. 1. Start of Session. a. Preliminaries... b. Install Packages c. Attach Packages... BIOSTATS 640 Spring 2018 Introduction to R and R-Studio Data Description Page 1. Start of Session. a. Preliminaries... b. Install Packages c. Attach Packages... 2. Load R Data.. a. Load R data frames...

More information

Chapter One: Getting Started With IBM SPSS for Windows

Chapter One: Getting Started With IBM SPSS for Windows Chapter One: Getting Started With IBM SPSS for Windows Using Windows The Windows start-up screen should look something like Figure 1-1. Several standard desktop icons will always appear on start up. Note

More information

BIOL 417: Biostatistics Laboratory #3 Tuesday, February 8, 2011 (snow day February 1) INTRODUCTION TO MYSTAT

BIOL 417: Biostatistics Laboratory #3 Tuesday, February 8, 2011 (snow day February 1) INTRODUCTION TO MYSTAT BIOL 417: Biostatistics Laboratory #3 Tuesday, February 8, 2011 (snow day February 1) INTRODUCTION TO MYSTAT Go to the course Blackboard site and download Laboratory 3 MYSTAT Intro.xls open this file in

More information

An Introduction to Stata Part II: Data Analysis

An Introduction to Stata Part II: Data Analysis An Introduction to Stata Part II: Data Analysis Kerry L. Papps 1. Overview Do-files Sorting a dataset Combining datasets Creating a dataset of means or medians etc. Weights Panel data capabilities Dummy

More information

Lab 1: Introduction, Plotting, Data manipulation

Lab 1: Introduction, Plotting, Data manipulation Linear Statistical Models, R-tutorial Fall 2009 Lab 1: Introduction, Plotting, Data manipulation If you have never used Splus or R before, check out these texts and help pages; http://cran.r-project.org/doc/manuals/r-intro.html,

More information

Advanced Regression Analysis Autumn Stata 6.0 For Dummies

Advanced Regression Analysis Autumn Stata 6.0 For Dummies Advanced Regression Analysis Autumn 2000 Stata 6.0 For Dummies Stata 6.0 is the statistical software package we ll be using for much of this course. Stata has a number of advantages over other currently

More information

QUEEN MARY, UNIVERSITY OF LONDON. Introduction to Statistics

QUEEN MARY, UNIVERSITY OF LONDON. Introduction to Statistics QUEEN MARY, UNIVERSITY OF LONDON MTH 4106 Introduction to Statistics Practical 1 10 January 2012 In this practical you will be introduced to the statistical computing package called Minitab. You will use

More information

Appendix II: STATA Preliminary

Appendix II: STATA Preliminary Appendix II: STATA Preliminary STATA is a statistical software package that offers a large number of statistical and econometric estimation procedures. With STATA we can easily manage data and apply standard

More information

SPSS. (Statistical Packages for the Social Sciences)

SPSS. (Statistical Packages for the Social Sciences) Inger Persson SPSS (Statistical Packages for the Social Sciences) SHORT INSTRUCTIONS This presentation contains only relatively short instructions on how to perform basic statistical calculations in SPSS.

More information

ICSSR Data Service. Stata: User Guide. Indian Council of Social Science Research. Indian Social Science Data Repository

ICSSR Data Service. Stata: User Guide. Indian Council of Social Science Research. Indian Social Science Data Repository http://www.icssrdataservice.in/ ICSSR Data Service Indian Social Science Data Repository Stata: User Guide Indian Council of Social Science Research ICSSR Data Service Contents: 1. Introduction 1 2. Opening

More information

Introduction to StatsDirect, 15/03/2017 1

Introduction to StatsDirect, 15/03/2017 1 INTRODUCTION TO STATSDIRECT PART 1... 2 INTRODUCTION... 2 Why Use StatsDirect... 2 ACCESSING STATSDIRECT FOR WINDOWS XP... 4 DATA ENTRY... 5 Missing Data... 6 Opening an Excel Workbook... 6 Moving around

More information

STATA Tutorial. Introduction to Econometrics. by James H. Stock and Mark W. Watson. to Accompany

STATA Tutorial. Introduction to Econometrics. by James H. Stock and Mark W. Watson. to Accompany STATA Tutorial to Accompany Introduction to Econometrics by James H. Stock and Mark W. Watson STATA Tutorial to accompany Stock/Watson Introduction to Econometrics Copyright 2003 Pearson Education Inc.

More information

Getting Our Feet Wet with Stata SESSION TWO Fall, 2018

Getting Our Feet Wet with Stata SESSION TWO Fall, 2018 Getting Our Feet Wet with Stata SESSION TWO Fall, 2018 Instructor: Cathy Zimmer 962-0516, cathy_zimmer@unc.edu 1) REMINDER BRING FLASH DRIVES! 2) QUESTIONS ON EXERCISES? 3) WHAT IS Stata SYNTAX? a) A set

More information

How to use Excel Spreadsheets for Graphing

How to use Excel Spreadsheets for Graphing How to use Excel Spreadsheets for Graphing 1. Click on the Excel Program on the Desktop 2. You will notice that a screen similar to the above screen comes up. A spreadsheet is divided into Columns (A,

More information

Lab #1: Introduction to Basic SAS Operations

Lab #1: Introduction to Basic SAS Operations Lab #1: Introduction to Basic SAS Operations Getting Started: OVERVIEW OF SAS (access lab pages at http://www.stat.lsu.edu/exstlab/) There are several ways to open the SAS program. You may have a SAS icon

More information

A quick introduction to STATA:

A quick introduction to STATA: 1 HG Revised September 2011 A quick introduction to STATA: (by E. Bernhardsen, with additions by H. Goldstein) 1. How to access STATA from the pc s at the computer lab and elsewhere at UiO. At the computer

More information

Introduction (SPSS) Opening SPSS Start All Programs SPSS Inc SPSS 21. SPSS Menus

Introduction (SPSS) Opening SPSS Start All Programs SPSS Inc SPSS 21. SPSS Menus Introduction (SPSS) SPSS is the acronym of Statistical Package for the Social Sciences. SPSS is one of the most popular statistical packages which can perform highly complex data manipulation and analysis

More information

Depending on the computer you find yourself in front of, here s what you ll need to do to open SPSS.

Depending on the computer you find yourself in front of, here s what you ll need to do to open SPSS. 1 SPSS 11.5 for Windows Introductory Assignment Material covered: Opening an existing SPSS data file, creating new data files, generating frequency distributions and descriptive statistics, obtaining printouts

More information

1 Introduction to Using Excel Spreadsheets

1 Introduction to Using Excel Spreadsheets Survey of Math: Excel Spreadsheet Guide (for Excel 2007) Page 1 of 6 1 Introduction to Using Excel Spreadsheets This section of the guide is based on the file (a faux grade sheet created for messing with)

More information

2 The Stata user interface

2 The Stata user interface 2 The Stata user interface The windows This chapter introduces the core of Stata s interface: its main windows, its toolbar, its menus, and its dialogs. The five main windows are the Review, Results, Command,

More information

Intro To Excel Spreadsheet for use in Introductory Sciences

Intro To Excel Spreadsheet for use in Introductory Sciences INTRO TO EXCEL SPREADSHEET (World Population) Objectives: Become familiar with the Excel spreadsheet environment. (Parts 1-5) Learn to create and save a worksheet. (Part 1) Perform simple calculations,

More information

If you use Stata for Windows, starting Stata is straightforward. You just have to double-click on the wstata (or stata) icon.

If you use Stata for Windows, starting Stata is straightforward. You just have to double-click on the wstata (or stata) icon. Stata Handout 1. Starting Stata If you use Stata for Windows, starting Stata is straightforward. You just have to double-click on the wstata (or stata) icon. If you use Stata for Unix, type at the athena

More information

Excel tutorial Introduction

Excel tutorial Introduction Office button Excel tutorial Introduction Microsoft Excel is an electronic spreadsheet. You can use it to organize your data into rows and columns. You can also use it to perform mathematical calculations

More information

MINITAB 17 BASICS REFERENCE GUIDE

MINITAB 17 BASICS REFERENCE GUIDE MINITAB 17 BASICS REFERENCE GUIDE Dr. Nancy Pfenning September 2013 After starting MINITAB, you'll see a Session window above and a worksheet below. The Session window displays non-graphical output such

More information

Department of Economics Spring 2018 University of California Economics 154 Professor Martha Olney Stata Lesson Thursday February 15, 2018

Department of Economics Spring 2018 University of California Economics 154 Professor Martha Olney Stata Lesson Thursday February 15, 2018 University of California Economics 154 Berkeley Professor Martha Olney Stata Lesson Thursday February 15, 2018 [1] Where to find the data sets http://www.econ.berkeley.edu/~olney/spring18/econ154 There

More information

StatLab Workshops 2008

StatLab Workshops 2008 Stata Workshop Fall 2008 Adrian de la Garza and Nancy Hite Using STATA at the Statlab 1. The Different Windows in STATA Automatically displayed windows o Command Window: executes STATA commands; type in

More information

AcaStat User Manual. Version 8.3 for Mac and Windows. Copyright 2014, AcaStat Software. All rights Reserved.

AcaStat User Manual. Version 8.3 for Mac and Windows. Copyright 2014, AcaStat Software. All rights Reserved. AcaStat User Manual Version 8.3 for Mac and Windows Copyright 2014, AcaStat Software. All rights Reserved. http://www.acastat.com Table of Contents INTRODUCTION... 5 GETTING HELP... 5 INSTALLATION... 5

More information

EXCEL BASICS: MICROSOFT OFFICE 2007

EXCEL BASICS: MICROSOFT OFFICE 2007 EXCEL BASICS: MICROSOFT OFFICE 2007 GETTING STARTED PAGE 02 Prerequisites What You Will Learn USING MICROSOFT EXCEL PAGE 03 Opening Microsoft Excel Microsoft Excel Features Keyboard Review Pointer Shapes

More information

Chapter 2 Assignment (due Thursday, April 19)

Chapter 2 Assignment (due Thursday, April 19) (due Thursday, April 19) Introduction: The purpose of this assignment is to analyze data sets by creating histograms and scatterplots. You will use the STATDISK program for both. Therefore, you should

More information

WHO STEPS Surveillance Support Materials. STEPS Epi Info Training Guide

WHO STEPS Surveillance Support Materials. STEPS Epi Info Training Guide STEPS Epi Info Training Guide Department of Chronic Diseases and Health Promotion World Health Organization 20 Avenue Appia, 1211 Geneva 27, Switzerland For further information: www.who.int/chp/steps WHO

More information

GETTING STARTED WITH MINITAB INTRODUCTION TO MINITAB STATISTICAL SOFTWARE

GETTING STARTED WITH MINITAB INTRODUCTION TO MINITAB STATISTICAL SOFTWARE Six Sigma Quality Concepts & Cases Volume I STATISTICAL TOOLS IN SIX SIGMA DMAIC PROCESS WITH MINITAB APPLICATIONS CHAPTER 2 GETTING STARTED WITH MINITAB INTRODUCTION TO MINITAB STATISTICAL SOFTWARE Amar

More information

2. Getting started with MLwiN

2. Getting started with MLwiN 2. Getting started with MLwiN Introduction This chapter aims to provide you with some practice with MLwiN commands before you begin to fit multilevel models. It is may be helpful if you have already read

More information

Introduction to Microsoft Excel

Introduction to Microsoft Excel Intro to Excel Introduction to Microsoft Excel OVERVIEW In this lab, you will become familiar with the general layout and features of Microsoft Excel spreadsheet computer application. Excel has many features,

More information

Data Management Project Using Software to Carry Out Data Analysis Tasks

Data Management Project Using Software to Carry Out Data Analysis Tasks Data Management Project Using Software to Carry Out Data Analysis Tasks This activity involves two parts: Part A deals with finding values for: Mean, Median, Mode, Range, Standard Deviation, Max and Min

More information

Minitab 17 commands Prepared by Jeffrey S. Simonoff

Minitab 17 commands Prepared by Jeffrey S. Simonoff Minitab 17 commands Prepared by Jeffrey S. Simonoff Data entry and manipulation To enter data by hand, click on the Worksheet window, and enter the values in as you would in any spreadsheet. To then save

More information

Preparing for Data Analysis

Preparing for Data Analysis Preparing for Data Analysis Prof. Andrew Stokes March 27, 2018 Managing your data Entering the data into a database Reading the data into a statistical computing package Checking the data for errors and

More information

LAB 1 INSTRUCTIONS DESCRIBING AND DISPLAYING DATA

LAB 1 INSTRUCTIONS DESCRIBING AND DISPLAYING DATA LAB 1 INSTRUCTIONS DESCRIBING AND DISPLAYING DATA This lab will assist you in learning how to summarize and display categorical and quantitative data in StatCrunch. In particular, you will learn how to

More information

Excel R Tips. is used for multiplication. + is used for addition. is used for subtraction. / is used for division

Excel R Tips. is used for multiplication. + is used for addition. is used for subtraction. / is used for division Excel R Tips EXCEL TIP 1: INPUTTING FORMULAS To input a formula in Excel, click on the cell you want to place your formula in, and begin your formula with an equals sign (=). There are several functions

More information

Survey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9

Survey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9 Survey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9 Contents 1 Introduction to Using Excel Spreadsheets 2 1.1 A Serious Note About Data Security.................................... 2 1.2

More information

SAS Training Spring 2006

SAS Training Spring 2006 SAS Training Spring 2006 Coxe/Maner/Aiken Introduction to SAS: This is what SAS looks like when you first open it: There is a Log window on top; this will let you know what SAS is doing and if SAS encountered

More information

Statistical Analysis Using SPSS for Windows Getting Started (Ver. 2018/10/30) The numbers of figures in the SPSS_screenshot.pptx are shown in red.

Statistical Analysis Using SPSS for Windows Getting Started (Ver. 2018/10/30) The numbers of figures in the SPSS_screenshot.pptx are shown in red. Statistical Analysis Using SPSS for Windows Getting Started (Ver. 2018/10/30) The numbers of figures in the SPSS_screenshot.pptx are shown in red. 1. How to display English messages from IBM SPSS Statistics

More information

User Services Spring 2008 OBJECTIVES Introduction Getting Help Instructors

User Services Spring 2008 OBJECTIVES  Introduction Getting Help  Instructors User Services Spring 2008 OBJECTIVES Use the Data Editor of SPSS 15.0 to to import data. Recode existing variables and compute new variables Use SPSS utilities and options Conduct basic statistical tests.

More information

SPSS QM II. SPSS Manual Quantitative methods II (7.5hp) SHORT INSTRUCTIONS BE CAREFUL

SPSS QM II. SPSS Manual Quantitative methods II (7.5hp) SHORT INSTRUCTIONS BE CAREFUL SPSS QM II SHORT INSTRUCTIONS This presentation contains only relatively short instructions on how to perform some statistical analyses in SPSS. Details around a certain function/analysis method not covered

More information

Quick Start Guide Jacob Stolk PhD Simone Stolk MPH November 2018

Quick Start Guide Jacob Stolk PhD Simone Stolk MPH November 2018 Quick Start Guide Jacob Stolk PhD Simone Stolk MPH November 2018 Contents Introduction... 1 Start DIONE... 2 Load Data... 3 Missing Values... 5 Explore Data... 6 One Variable... 6 Two Variables... 7 All

More information