Preparing Data for Analysis in Stata

Before you can analyse your data, you need to get it into an appropriate format so that Stata can work for you. To avoid rubbish results, you also need to check that your data are sensible and free of nonsense values.

Recap on preceding workshop, SIDM1

It is essential to create and save a do file of commands, otherwise you will lose your work. You learnt to run commands from this do file as you go along. You also learnt how to open and save datasets, log files and graphs, and to distinguish between different types of data. You learnt to use variable labels to give a fuller description of each variable, and value labels to define categories (when coded numerically, so that we know what the numbers represent). You learnt basic graph and table commands, how to create new variables and amend their values, and how to use the if qualifier. Attention needs to be paid to the presence of missing data, which Stata treats as infinitely large positive values.

Learning objectives of this session, SIDM2

This session describes how to read data into Stata in the first place. There is a recap on the different types of data, and on why some variables often need to be changed between formats before analysis can begin. Example code and exercises (with solutions available) show how these commands are used in practice.

Learning objectives of further workshops, SIDM3 and SIDM4

Merging datasets in many different ways, reshaping datasets, looping in Stata, and extracting saved results into files. Efficient production of publication-quality tables.

Further resources complementary to this series

This series teaches most of the material contained in Stata Data Management.doc, referred to as SDM. The accompanying Stata commands crib sheet.xls, referred to as SCCS, acts as a quick reference guide (and also summarises some data analysis commands). The Stata manuals (accessed online and via help) and Stata help itself are both excellent resources. The manuals teach statistics as well as Stata, and provide statistics references.

Contents

1. Reading data into Stata from other files
2. Recap on types of data
3. Converting strings to numeric and categorical data as necessary
4. Dealing with Dates
5. Checking for errors and missing data
6. When your dataset erroneously has 2 or more lines of data for a few patients
7. Recoding numeric data into groups
8. Extracting information from string variables
9. Further sources of help

SCCS=Stata Commands Crib Sheet.xls

1. Reading data into Stata from other files

Now suppose that the census data we looked at last week was received as an Excel file that we need to read into Stata before we can analyse it. See SDM 2.3, Opening data from an excel file. See SDM chapter 2 for reading in data from other sources.

    cd "H:\_MPHTeaching\stata\dataman\wk2"
    clear
    import excel census, firstrow   /* read in data from the Excel file census.xls; firstrow indicates that the first row is treated as variable names or labels */
    descr    // look for string variables, storage type = str##
    browse   // look for string variables, which appear in red

Read nlsw88dates2.xls into Stata. View the data, looking at the types of variables and seeing what it contains. Summarise the data. Here is some further information on what is contained in this data set, with commands for labelling appropriately:

    label var grade "Current grade completed"
    label var c_city "Lives in central city"
    label var wage "Hourly Wage"
    label var south "Lives in South"
    label var union "Union Worker"
    label var hours "Usual hours worked"
    label var ttl_exp "Total Work Experience"
    label var tenure "Job Tenure (years)"
    label var quesday "Day of month that questionnaire was filled in (started to be filled in)"
    label var quesmon "Month that questionnaire was filled in (started to be filled in)"
    label var quesyr "Year that questionnaire was filled in (started to be filled in)"
    label var quesfinish "Date that questionnaire was completed"

2. Recap on types of data

There are 4 main types of data in Stata:
i) numeric (numerical, with storage types such as int, byte, float, double; black in the data browser)
ii) string (e.g. str2, str24; red in the data browser)
iii) categorical (i.e. numeric with value labels; blue in the data browser)
iv) dates & times (numeric with format %d or %td or similar; black in the data browser).

The describe command will detail data types, formats, and the presence of value labels and variable labels.

It is usually necessary to have data in numeric format in order to use it in Stata data analysis and most graph commands. This includes dates and times in Stata format that are recognised as such (showing up in black in the data editor) and categorical data (showing as blue and looking like text). The main exception is a patient ID variable (or hospital IDs, regions or similar), where it is generally okay to use a string variable.
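As a rough illustration of the steps above, the following sketch reads in the exercise file and then separates the string variables from the numeric ones. It assumes that nlsw88dates2.xls sits in the current working directory; adjust the path with cd otherwise.

```stata
* Sketch only: assumes nlsw88dates2.xls is in the current working directory
import excel using "nlsw88dates2.xls", firstrow clear   // first row supplies the variable names
describe                  // storage type str# marks a string variable
ds, has(type string)      // list only the string variables
ds, has(type numeric)     // list only the numeric variables
summarize                 // string variables are reported with 0 observations
```

The ds lists give a quick to-do list: every string variable that is not an ID is a candidate for destring, encode or date() in the sections that follow.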

Dates and categorical data are often read into Stata as string variables; this is generally the best way to do it. Hence the need to recode into different types of variables. There may also be a desire to recode numeric data into categories, and to recode categorical variables, perhaps by combining categories.

3. Converting strings to numeric and categorical data as necessary

See SDM 5.5 and 5.6, Converting strings to numeric data and categorical data.

    *** converting from string variables to numeric variables (when most/all values already look numeric)
    destring medage2, replace    // converts string to numeric data, keeping the same variable name
    tab medage2 medage, miss     // check new variable against another variable that looks the same
    scatter medage2 medage       // they are identical
    summ medage medage2          // identical also in number of missing values
    list if medage2==.           // identical also in missing values, they are for the same observation
    drop medage2                 // we don't need 2 identical variables
    destring marriage, replace   // gives error because there is non-numeric data
    help destring
    destring marriage, force replace   /* converts string var marriage into numeric var, with missing value where there is any non-numeric data */

    *** converting from strings to categorical variables
    descr
    encode region, gen(region2)      // create a categorical (numeric) var (region2) from the string var, region
    tab region region2               // compare the newly created and original variables
    tab region region2, nolabel      // compare the newly created and original vars without value labels
    codebook region2                 // see correspondence of values and value labels
    drop region                      // the string version is no longer needed
    encode state2, gen(state_2)      // create categorical var (state_2) from string var, state2
    tab state2 state_2               // compare
    browse state2 state_2            // easier to compare this way
    codebook state_2                 // gives examples of coding
    label list state_2               // gives the full correspondence between numbers and value labels
    drop state2                      // string version is no longer needed

a) Look for string variables in nlsw88dates2.xls that look numeric, or as if they should be numeric. Create a numeric version of one or more of these variables.
b) Look for categorical variables stored as strings and convert some of them to numeric variables as well. For instance, do this for industry and race. Does the encode command work well for both/all? If not, then what approach shall we take?
c) Try help string function and decide whether it is a good strategy to use one or a few string functions to tidy up the variables before using the encode command. For instance, you could take just the first character and change it to lower case.
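One possible approach to exercise c) is sketched below. It assumes that race is a string variable whose categories are spelt inconsistently (for example with stray blanks or mixed case) but have distinct first letters; race_clean and race_cat are hypothetical names chosen for illustration. In older versions of Stata the same functions are called trim() and lower().

```stata
* Sketch only: assumes race is a string variable with inconsistent spellings
* whose categories start with different letters; new names are hypothetical
gen race_clean = strlower(substr(strtrim(race), 1, 1))   // first character, lower case
tab race race_clean, missing       // check that every spelling maps to the intended letter
encode race_clean, gen(race_cat)   // numeric variable with value labels
label list race_cat                // numbers and the labels attached to them
```

Taking only the first character is a blunt tool: it would merge two categories that share an initial, so the cross-tabulation is worth reading carefully before the original string is dropped.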

4. Dealing with Dates

See SDM chapter 7 on dates.

    *** converting to Stata date variables from string variables
    gen dateofsurvey3=date(dateofsurvey, "DMY")   /* converts from string to Stata date variable, string ordered day month year DMY */
    browse dateofsurvey dateofsurvey3   // dates are coded as number of days from a fixed date
    format dateofsurvey3 %d   /* display the date variable in a format that we can understand as a date (not a number) */
    browse dateofsurvey dateofsurvey3   // check the date has been correctly recoded by Stata

a) Look at the nlsw88dates2 data set. Which variables look like dates, but are currently string variables? Change these to Stata date format.
b) Now create a Stata date variable from the 3 variables which give questionnaire day, month and year. Remember the function mdy for month, day and year. Use help date function to find it if necessary.
c) Find the time interval in years between the questionnaire date and the date that the questionnaire was finally filled in (quesfinish, for the few people where this is not missing).
d) Count how many questionnaire dates are before 30 April. Count how many dates are after this date.
e) There are more date commands described in SDM chapter 7, Dates and time. Within Stata, type help function and click on date and time functions. You will see more options here.

5. Checking for errors and missing data

See SDM chapter 6 on looking for errors.

    *** recode missing values
    replace pop=. if pop<0   // impossible values recoded to missing
    replace divorce=. if divorce==   // impossible/implausible value recoded to missing
    * remember last time we also checked total populations added up, and can check values are smaller than total population and similar
    * there are no obvious cross tabulations to check here, e.g. do we have pregnant men? Do we have non-smokers with 10 cigs/day?

Summarise all variables, and look at maximum and minimum values. Do they all appear to be valid values?
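Because Stata treats numeric missing values as larger than any real number, range checks can silently pick up missing data. The sketch below shows the pattern using wage; the cut-off of 40 is a hypothetical value chosen only for illustration.

```stata
* Sketch only: wage is used as an example; the cut-off of 40 is hypothetical
summarize wage, detail                  // minimum, maximum and percentiles
count if missing(wage)                  // how many values are missing
count if wage > 40                      // also counts observations where wage is missing
count if wage > 40 & !missing(wage)     // counts only observed values above the cut-off
misstable summarize                     // missing-data overview for all numeric variables
```

The difference between the second and third counts is the number of missing values, which is why an explicit !missing() condition is worth adding whenever a "greater than" comparison is used.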
Look at histograms of continuous data to see if there are outlying values, and to see what the distributions look like. Do not recode outlying values to missing unless you are pretty confident that they are errors, and report it if you have done this.
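Dates are a natural place to check that variables are consistent with each other: a questionnaire should not be finished before it was started. The sketch below builds the two dates and compares them. It assumes that quesday, quesmon and quesyr are numeric with a four-digit year, and that quesfinish is still a string in day-month-year order; quesdate, finishdate and years_to_finish are hypothetical names.

```stata
* Sketch only: assumes quesday, quesmon, quesyr are numeric (four-digit year)
* and quesfinish is a string ordered day month year; new names are hypothetical
gen quesdate = mdy(quesmon, quesday, quesyr)     // Stata date from three numeric parts
gen finishdate = date(quesfinish, "DMY")         // Stata date from a string
format quesdate finishdate %td                   // display as dates rather than day counts
gen years_to_finish = (finishdate - quesdate) / 365.25   // interval in years
summarize years_to_finish                        // negative values need investigating
list quesdate finishdate if finishdate < quesdate & !missing(quesdate)   // finished before started
```

The !missing(quesdate) condition is needed because a missing start date counts as larger than any finish date and would otherwise be listed as an inconsistency.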

    *** search for and drop duplicates
    isid state                       // checks if state is unique on each row (gives an error if it is not)
    help duplicates                  // see options for dealing with duplicates
    duplicates report state          // report any 2 or more rows with the same value for state
    duplicates list state            // list the duplicates
    duplicates tag state, gen(dup)   // tag the duplicates: new variable dup=0 for unique rows, dup=1 for rows with one duplicate
    browse if dup==1                 // browse the duplicates
    help duplicates                  // look for any more useful options
    *duplicates drop state
    duplicates drop state, force     // this drops one duplicate, despite the duplicates not being equal on other variables
    browse if state=="texas"         // see the result
    isid state                       // check if state is now unique, i.e. not the same on any 2 rows

Note that for duplicates, the egen or the collapse commands, and explicit subscripting, can be useful when we want data contained in both duplicates. See SIDM 3 section 3.6, i.e. next week's class.

a) Check for further errors in nlsw88dates2. Do any numeric values look impossible, and need recoding to missing?
b) Can you think of any variables that need to be consistent with each other, where you can check for this?
c) Read SDM chapter 6 and see if you can think of any times when these errors might apply.
d) Note that when you were creating new variables above, you should always be looking out for errors as you go along, in order to create appropriate values.

6. When your dataset erroneously has 2 or more lines of data for a few patients

help duplicates gives commands that help you to tidy up your data in situations like this. See SIDM3 for more details and for other ways of dealing with these duplicates.

7. Recoding numeric data into groups

See SDM 5.8 and 5.9, recoding numeric and categorical variables and creating categorical variables from numeric data.
    ***** recode numeric data into categories
    xtile deathq5=death, n(5)           // divide number of deaths into quintiles
    tab deathq5
    summ death deathq5                  // check both have the same amount of missing data
    xtile marriage_bin=marriage, n(2)   // divide number of marriages into 2 groups by the median
    tab marriage_bin
    summ marriage marriage_bin          // check both have the same amount of missing data

    tab marriage_bin, sum(marriage)   // check the result
    tabstat marriage, by(marriage_bin) stat(n min max)   // look at min and max in each group, comprehensive check
    hist marriage                     // see what might be sensible groupings
    egen marriage50k=cut(marriage), at( )   // divide marriages by pre-chosen cut-offs
    tab marriage50k                   // see the result - there are many missing values
    drop marriage50k                  // drop variable and try again
    egen marriage50k=cut(marriage), at( )   /* specify values below and above range to avoid missings */
    count
    summ marriage                     // now we don't have too much missing data
    tab marriage50k                   // see distribution by regions
    label define marrlbl 0 "0 to 49,000" "50,000 to 99,999" "100,000 to 149,999" "150,000 to 199,999"   // defining labels
    label values marriage50k marrlbl  // attaching fully informative labels
    tab marriage50k                   // tabulate now shows informative labels

    label list region2                // want to recode to get a binary variable for North/South
    recode region2 (1 2=1) (3 4=2), gen(region_bin)   /* recode is very flexible for recoding individual values and ranges of values */
    label define region_lbl 1 "North" 2 "South/West"   // adding informative labels - firstly we define the value label
    label values region_bin region_lbl   // now we attach the newly created value label to the values
    tab region2 region_bin            // now we tabulate against the original value that we recoded
    tab state region_bin              // tabulating against state - but not ideal since South/West contains some North West states
    label list
    tab state if region2==4           // list states in West region
    tab state_2 if region2==4, nolabel   // list numeric values of state_2 for states in West region
    help recode                       // look for further details of command
    recode state_2 ( =1) (nonmissing=0), gen(northwest)   /* create a variable specific for North West states (=1 for them, =0 otherwise) */
    tab state northwest               // check what we have done
    codebook region_bin               // check coding of region_bin variable
    gen south=region_bin-1            // want variable south=1 for southern states, =0 for northern states
    replace south=0 if northwest==1   // use newly constructed northwest variable to recode as appropriate
    tab state south                   // check the result
    label define yesno 0 "No" 1 "Yes"   // this is standard coding, commonly used for binary variables
    label values south yesno          // attach the newly defined value label to the variable south
    label values northwest yesno      // attach the same value label to the variable northwest
    descr                             // shows names of value labels allocated to each variable (blank for variables with no value labels)
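The need to choose cut-offs that span the whole range, so that egen cut() does not create missing values, can also be handled by deriving the end points from the data. The sketch below does this for 5-year age groups; it assumes that age is a numeric variable, and age5 and age_bin are hypothetical names.

```stata
* Sketch only: age is assumed numeric; age5 and age_bin are hypothetical names
summarize age                            // stores r(min) and r(max)
local lo = 5 * floor(r(min) / 5)         // round the minimum down to a multiple of 5
local hi = 5 * floor(r(max) / 5) + 5     // first multiple of 5 above the maximum
egen age5 = cut(age), at(`lo'(5)`hi')    // 5-year groups covering every observed age
tabstat age, by(age5) stat(n min max)    // check the range of ages within each group
xtile age_bin = age, nq(2)               // binary split at the median
tab age5 age_bin, missing                // compare the two groupings
```

Each value of age5 is the lower limit of its group, so value labels can be defined against those numbers in the same way as for marriage50k.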

Student Exercises:

a) Create a new categorical variable that divides usual hours worked into 3 groups of roughly equal sizes, then label the variable and its values (command xtile).
b) Create a new binary variable for age, using the median as a cut-off.
c) Create a categorical variable for age, divided into 5-year age groups.
d) Create a categorical variable containing age in quintiles.
e) Create an ethnic group variable which is 1 for whites and 0 for other ethnic groups.
f) Create a new variable from industry, recoding it into fewer categories, according to what you think is sensible.

8. Extracting information from string variables

There are many string functions that enable you to extract substrings from within strings. Type help string function and read through the list if you need to do this, e.g. to extract the first letter of the State. They also allow you to tidy up strings (e.g. remove leading and trailing blanks, convert all to small letters).

9. Further sources of help

Look at the Stata commands crib sheet.xls. Many students find it useful to have a crib sheet on hand whilst analysing data. What are the pros and cons of these resources?

- Stata help
- Stata manuals
- Stata YouTube videos
- Stata Commands Crib sheet.xls
- Stata Data Management.doc
- Menus in Stata, to learn new commands and find out what is available


More information

Lecture 2: Advanced data manipulation

Lecture 2: Advanced data manipulation Introduction to Stata- A. Chevalier Content of Lecture 2: Lecture 2: Advanced data manipulation -creating data -using dates -merging and appending datasets - wide and long -collapse 1 A] Creating data

More information

CTU Database revised June 2001

CTU Database revised June 2001 CTU Database revised June 2001 The UH Cancer Center CTU Database is a web based application. You should be able to use this application from any web browser, though it was designed for IE8, it has been

More information

Getting started with Stata 2017: Cheat-sheet

Getting started with Stata 2017: Cheat-sheet Getting started with Stata 2017: Cheat-sheet 4. september 2017 1 Get started Graphical user interface (GUI). Clickable. Simple. Commands. Allows for use of do-le. Easy to keep track. Command window: Write

More information

Maximizing Statistical Interactions Part II: Database Issues Provided by: The Biostatistics Collaboration Center (BCC) at Northwestern University

Maximizing Statistical Interactions Part II: Database Issues Provided by: The Biostatistics Collaboration Center (BCC) at Northwestern University Maximizing Statistical Interactions Part II: Database Issues Provided by: The Biostatistics Collaboration Center (BCC) at Northwestern University While your data tables or spreadsheets may look good to

More information

A quick introduction to STATA:

A quick introduction to STATA: 1 Revised September 2008 A quick introduction to STATA: (by E. Bernhardsen, with additions by H. Goldstein) 1. How to access STATA from the pc s at the computer lab After having logged in you have to log

More information

Basic Medical Statistics Course

Basic Medical Statistics Course Basic Medical Statistics Course S0 SPSS Intro December 2014 Wilma Heemsbergen w.heemsbergen@nki.nl This Afternoon 13.00 ~ 15.00 SPSS lecture Short break Exercise 2 Database Example 3 Types of data Type

More information

Community Resource: Egenmore, by command, return lists, and system variables. Beksahn Jang Feb 22 nd, 2016 SOC561

Community Resource: Egenmore, by command, return lists, and system variables. Beksahn Jang Feb 22 nd, 2016 SOC561 Community Resource: Egenmore, by command, return lists, and system variables. Beksahn Jang Feb 22 nd, 2016 SOC561 Egenmore Egenmore is a package in Stata that extends the capabilities of the egen command.

More information

Data Analyst Nanodegree Syllabus

Data Analyst Nanodegree Syllabus Data Analyst Nanodegree Syllabus Discover Insights from Data with Python, R, SQL, and Tableau Before You Start Prerequisites : In order to succeed in this program, we recommend having experience working

More information

Week 1: Introduction to Stata

Week 1: Introduction to Stata Week 1: Introduction to Stata Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 c 2017 PERRAILLON ALL RIGHTS RESERVED 1 Outline Log

More information

STATA WORKSHOP 2. ERL Workshop for Sociology Fall 2014

STATA WORKSHOP 2. ERL Workshop for Sociology Fall 2014 STATA WORKSHOP 2 ERL Workshop for Sociology Fall 2014 NITTY GRITTY DATA SAVVY STEPS TO SUCCESS Write your codebook By doing this first you take care of all the decision making, so you don t have to go

More information

Preparing for Data Analysis

Preparing for Data Analysis Preparing for Data Analysis Prof. Andrew Stokes March 27, 2018 Managing your data Entering the data into a database Reading the data into a statistical computing package Checking the data for errors and

More information

set mem 10m we can also decide to have the more separation line on the screen or not when the software displays results: set more on set more off

set mem 10m we can also decide to have the more separation line on the screen or not when the software displays results: set more on set more off Setting up Stata We are going to allocate 10 megabites to the dataset. You do not want to allocate to much memory to the dataset because the more memory you allocate to the dataset, the less memory will

More information

Opening a Data File in SPSS. Defining Variables in SPSS

Opening a Data File in SPSS. Defining Variables in SPSS Opening a Data File in SPSS To open an existing SPSS file: 1. Click File Open Data. Go to the appropriate directory and find the name of the appropriate file. SPSS defaults to opening SPSS data files with

More information

Rockefeller College MPA Excel Workshop: Clinton Impeachment Data Example

Rockefeller College MPA Excel Workshop: Clinton Impeachment Data Example Rockefeller College MPA Excel Workshop: Clinton Impeachment Data Example This exercise is a follow-up to the MPA admissions example used in the Excel Workshop. This document contains detailed solutions

More information

WORKFLOW. Effective Data Management Strategies for Doing Research Well

WORKFLOW. Effective Data Management Strategies for Doing Research Well WORKFLOW Effective Data Management Strategies for Doing Research Well WHY? Why an explicit focus on workflow? WHAT? What are the steps and tasks in an effective workflow? HOW? How can we use Stata and

More information

Using Spatial Data in a Desktop GIS; QGIS 2.8 Practical 2

Using Spatial Data in a Desktop GIS; QGIS 2.8 Practical 2 Using Spatial Data in a Desktop GIS; QGIS 2.8 Practical 2 Practical 2 Learning objectives: To work with a vector base map within a GIS and overlay point data. To practise using Ordnance Survey mapping

More information

1. Basic Steps for Data Analysis Data Editor. 2.4.To create a new SPSS file

1. Basic Steps for Data Analysis Data Editor. 2.4.To create a new SPSS file 1 SPSS Guide 2009 Content 1. Basic Steps for Data Analysis. 3 2. Data Editor. 2.4.To create a new SPSS file 3 4 3. Data Analysis/ Frequencies. 5 4. Recoding the variable into classes.. 5 5. Data Analysis/

More information

Quick Reference for the FloridaCHARTS Fetal Death Query

Quick Reference for the FloridaCHARTS Fetal Death Query Quick Reference for the FloridaCHARTS Fetal Death Query 1. Toolbar Functions 2. Reports 3. Frequently Asked Questions This application is set up in sections. To use it, you do not have to follow any particular

More information

An Introduction to STATA ECON 330 Econometrics Prof. Lemke

An Introduction to STATA ECON 330 Econometrics Prof. Lemke An Introduction to STATA ECON 330 Econometrics Prof. Lemke 1. GETTING STARTED A requirement of this class is that you become very comfortable with STATA, a leading statistical software package. You were

More information

/23/2004 TA : Jiyoon Kim. Recitation Note 1

/23/2004 TA : Jiyoon Kim. Recitation Note 1 Recitation Note 1 This is intended to walk you through using STATA in an Athena environment. The computer room of political science dept. has STATA on PC machines. But, knowing how to use it on Athena

More information

One does not necessarily have special statistical software to perform statistical analyses.

One does not necessarily have special statistical software to perform statistical analyses. Appendix F How to Use a Data Spreadsheet Excel One does not necessarily have special statistical software to perform statistical analyses. Microsoft Office Excel can be used to run statistical procedures.

More information

Creating a data file and entering data

Creating a data file and entering data 4 Creating a data file and entering data There are a number of stages in the process of setting up a data file and analysing the data. The flow chart shown on the next page outlines the main steps that

More information

TUTORIAL FOR IMPORTING OTTAWA FIRE HYDRANT PARKING VIOLATION DATA INTO MYSQL

TUTORIAL FOR IMPORTING OTTAWA FIRE HYDRANT PARKING VIOLATION DATA INTO MYSQL TUTORIAL FOR IMPORTING OTTAWA FIRE HYDRANT PARKING VIOLATION DATA INTO MYSQL We have spent the first part of the course learning Excel: importing files, cleaning, sorting, filtering, pivot tables and exporting

More information

ECO375 Tutorial 1 Introduction to Stata

ECO375 Tutorial 1 Introduction to Stata ECO375 Tutorial 1 Introduction to Stata Matt Tudball University of Toronto Mississauga September 14, 2017 Matt Tudball (University of Toronto) ECO375H5 September 14, 2017 1 / 25 What Is Stata? Stata is

More information

Activity: page 1/10 Introduction to Excel. Getting Started

Activity: page 1/10 Introduction to Excel. Getting Started Activity: page 1/10 Introduction to Excel Excel is a computer spreadsheet program. Spreadsheets are convenient to use for entering and analyzing data. Although Excel has many capabilities for analyzing

More information

Appendix II: STATA Preliminary

Appendix II: STATA Preliminary Appendix II: STATA Preliminary STATA is a statistical software package that offers a large number of statistical and econometric estimation procedures. With STATA we can easily manage data and apply standard

More information

Appendix II: STATA Preliminary

Appendix II: STATA Preliminary Appendix II: STATA Preliminary STATA is a statistical software package that offers a large number of statistical and econometric estimation procedures. With STATA we can easily manage data and apply standard

More information

Depending on the computer you find yourself in front of, here s what you ll need to do to open SPSS.

Depending on the computer you find yourself in front of, here s what you ll need to do to open SPSS. 1 SPSS 11.5 for Windows Introductory Assignment Material covered: Opening an existing SPSS data file, creating new data files, generating frequency distributions and descriptive statistics, obtaining printouts

More information

WORKSHOP: Using the Health Survey for England, 2014

WORKSHOP: Using the Health Survey for England, 2014 WORKSHOP: Using the Health Survey for England, 2014 There are three sections to this workshop, each with a separate worksheet. The worksheets are designed to be accessible to those who have no prior experience

More information

Section 3.2 Measures of Central Tendency MDM4U Jensen

Section 3.2 Measures of Central Tendency MDM4U Jensen Section 3.2 Measures of Central Tendency MDM4U Jensen Part 1: Video This video will review shape of distributions and introduce measures of central tendency. Answer the following questions while watching.

More information

Preparing for Data Analysis

Preparing for Data Analysis Preparing for Data Analysis Prof. Andrew Stokes March 21, 2017 Managing your data Entering the data into a database Reading the data into a statistical computing package Checking the data for errors and

More information

Basic Stata Tutorial

Basic Stata Tutorial Basic Stata Tutorial By Brandon Heck Downloading Stata To obtain Stata, select your country of residence and click Go. Then, assuming you are a student, click New Educational then click Students. The capacity

More information

SPSS TRAINING SPSS VIEWS

SPSS TRAINING SPSS VIEWS SPSS TRAINING SPSS VIEWS Dataset Data file Data View o Full data set, structured same as excel (variable = column name, row = record) Variable View o Provides details for each variable (column in Data

More information

Sustainability of Public Policy Lecture 1 Introduc6on STATA. Rossella Iraci Capuccinello

Sustainability of Public Policy Lecture 1 Introduc6on STATA. Rossella Iraci Capuccinello Sustainability of Public Policy Lecture 1 Introduc6on STATA Rossella Iraci Capuccinello Ge=ng started in STATA! Start STATA " Simply click on icon " Stata should look like this: BuDons/menu Review window

More information

Create a new form. To create a form from a new or existing spreadsheet: 1. Click the Tools drop down menu and select Create a form.

Create a new form. To create a form from a new or existing spreadsheet: 1. Click the Tools drop down menu and select Create a form. Create a new form You can choose Google Forms when creating a new doc from Google Drive. You can also create a new form from a Google Sheet or from a template. To create a form within Google Drive: Click

More information

Session 3: Cartography in ArcGIS. Mapping population data

Session 3: Cartography in ArcGIS. Mapping population data Exercise 3: Cartography in ArcGIS Mapping population data Background GIS is well known for its ability to produce high quality maps. ArcGIS provides useful tools that allow you to do this. It is important

More information

Use data on individual respondents from the first 17 waves of the British Household

Use data on individual respondents from the first 17 waves of the British Household Applications of Data Analysis (EC969) Simonetta Longhi and Alita Nandi (ISER) Contact: slonghi and anandi; @essex.ac.uk Week 1 Lecture 2: Data Management Use data on individual respondents from the first

More information

A quick introduction to STATA:

A quick introduction to STATA: 1 HG Revised September 2011 A quick introduction to STATA: (by E. Bernhardsen, with additions by H. Goldstein) 1. How to access STATA from the pc s at the computer lab and elsewhere at UiO. At the computer

More information

Blackboard 5. Instructor Manual Level One Release 5.5

Blackboard 5. Instructor Manual Level One Release 5.5 Bringing Education Online Blackboard 5 Instructor Manual Level One Release 5.5 Copyright 2001 by Blackboard Inc. All rights reserved. No part of the contents of this manual may be reproduced or transmitted

More information

Cross-Sectional Analysis

Cross-Sectional Analysis STATA 13 - SAMPLE SESSION Cross-Sectional Analysis Short Course Training Materials Designing Policy Relevant Research and Data Processing and Analysis with STATA 13 for Windows* 1st Edition Margaret Beaver

More information

Creating summary tables using the sumtable command

Creating summary tables using the sumtable command Creating summary tables using the sumtable command Lauren Scott and Chris Rogers University of Bristol Clinical Trials and Evaluation Unit 2016 London Stata Users Group meeting Scott LJ, Rogers CA. Creating

More information

download instant at

download instant at CHAPTER 1 - LAB SESSION INTRODUCTION TO EXCEL INTRODUCTION: This lab session is designed to introduce you to the statistical aspects of Microsoft Excel. During this session you will learn how to enter

More information

Introduction (SPSS) Opening SPSS Start All Programs SPSS Inc SPSS 21. SPSS Menus

Introduction (SPSS) Opening SPSS Start All Programs SPSS Inc SPSS 21. SPSS Menus Introduction (SPSS) SPSS is the acronym of Statistical Package for the Social Sciences. SPSS is one of the most popular statistical packages which can perform highly complex data manipulation and analysis

More information

NPDA User Guide: How to submit your data by uploading a CSV

NPDA User Guide: How to submit your data by uploading a CSV NPDA User Guide: How to submit your data by uploading a CSV Page 0 of 10 Before you start The NPDA is an audit of the care processes received and outcomes of all children and young people with diabetes

More information

SPSS for Survey Analysis

SPSS for Survey Analysis STC: SPSS for Survey Analysis 1 SPSS for Survey Analysis STC: SPSS for Survey Analysis 2 SPSS for Surveys: Contents Background Information... 4 Opening and creating new documents... 5 Starting SPSS...

More information

Functionality Guide. for CaseWare IDEA Data Analysis

Functionality Guide. for CaseWare IDEA Data Analysis Functionality Guide for CaseWare IDEA Data Analysis CaseWare IDEA Quick Access Functionality Crib Sheet A quick guide to the major functionality you will use within IDEA. FILE TAB: Passport The single

More information

OneUSG Connect. Hire a New Employee. Hire a New Employee HR_JA002

OneUSG Connect. Hire a New Employee. Hire a New Employee HR_JA002 Description This process describes the steps necessary to a new employee into a Position. Conditions A Position has been created in HCM Source Documents Hire Documentation Identify Verification Documentation

More information

Chapter 2 The SAS Environment

Chapter 2 The SAS Environment Chapter 2 The SAS Environment Abstract In this chapter, we begin to become familiar with the basic SAS working environment. We introduce the basic 3-screen layout, how to navigate the SAS Explorer window,

More information