Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis"

Similar documents
Workshop for empirical trade analysis. December 2015 Bangkok, Thailand

Empirical trade analysis

Subject index. ASCII data, reading comma-separated fixed column multiple lines per observation

Revision of Stata basics in STATA 11:

A quick introduction to STATA

An Introduction to Stata Part I: Data Management

Data analysis using Stata , AMSE Master (M1), Spring semester

Introduction to Stata - Session 2

Introduction to Stata - Session 1

You will learn: The structure of the Stata interface How to open files in Stata How to modify variable and value labels How to manipulate variables

Introduction to Stata. Getting Started. This is the simple command syntax in Stata and more conditions can be added as shown in the examples.

Introduction to data analysis using STATA. Miguel Niño-Zarazúa World Institute for Development Economics Research United Nations University

Week 1: Introduction to Stata

STATA TUTORIAL B. Rabin with modifications by T. Marsh

Introduction to STATA

OVERVIEW OF WINDOWS IN STATA

Use data on individual respondents from the first 17 waves of the British Household

A Short Guide to Stata 10 for Windows

Getting started with Stata 2017: Cheat-sheet

StatLab Workshops 2008

An Introduction to Stata Part II: Data Analysis

A quick introduction to STATA:

Intermediate Stata. Jeremy Craig Green. 1 March /29/2011 1

Results Based Financing for Health Impact Evaluation Workshop Tunis, Tunisia October Stata 2. Willa Friedman

GETTING DATA INTO THE PROGRAM

Saving Time with Prudent Data Management. Basic Principles To Save Time. Introducing Our Data. Yearly Directed Dyads (Multiple Panel Data)

Dr. Barbara Morgan Quantitative Methods

Thanks to Petia Petrova and Vince Wiggins for their comments on this draft.

STATA Tutorial. Elena Capatina Office hours: Mondays 10am-12, SS5017

A Quick Guide to Stata 8 for Windows

SIDM3: Combining and restructuring datasets; creating summary data across repeated measures or across groups

An Introduction to Stata

Data Management Using Stata: A Practical Handbook

An Introduction To Stata and Matlab. Liugang Sheng ECN 240A UC Davis

/23/2004 TA : Jiyoon Kim. Recitation Note 1

Introduction to Stata: An In-class Tutorial

What is Stata? A programming language to do sta;s;cs Strongly influenced by economists Open source, sort of. An acceptable way to manage data

ICSSR Data Service. Stata: User Guide. Indian Council of Social Science Research. Indian Social Science Data Repository

INTRODUCTION TO SAS HOW SAS WORKS READING RAW DATA INTO SAS

Econ Stata Tutorial I: Reading, Organizing and Describing Data. Sanjaya DeSilva

Getting Our Feet Wet with Stata SESSION TWO Fall, 2018

Intro to Stata for Political Scientists

SOCY7709: Quantitative Data Management Instructor: Natasha Sarkisian. Reading and Writing Datasets

A quick introduction to STATA:

Preparing Data for Analysis in Stata

Contents Append Merge Further Exercises... 24

Lab 1: Basics of Stata Short Course on Poverty & Development for Nordic Ph.D. Students University of Copenhagen June 13-23, 2000

Storage Types Display Format String Numeric (Dis)connect Cha

Sustainability of Public Policy Lecture 1 Introduc6on STATA. Rossella Iraci Capuccinello

A QUICK INTRODUCTION TO STATA

SAS Online Training: Course contents: Agenda:

Sacha Kapoor - Masters Metrics

Lecture 2: Advanced data manipulation

Appendix II: STATA Preliminary

Microsoft Access 2016

Microsoft Access 2016

Formulas, LookUp Tables and PivotTables Prepared for Aero Controlex

ECONOMICS 452* -- Stata 12 Tutorial 1. Stata 12 Tutorial 1. TOPIC: Getting Started with Stata: An Introduction or Review

Community Resource: Egenmore, by command, return lists, and system variables. Beksahn Jang Feb 22 nd, 2016 SOC561

GETTING STARTED WITH STATA. Sébastien Fontenay ECON - IRES

Title. Description. Quick start. stata.com. misstable Tabulate missing values

Introduction to Stata Toy Program #1 Basic Descriptives

An Introduction to Stata Exercise 1

Appendix II: STATA Preliminary

B.E. Publishing Correlations to The Office Specialist.com, 2E to Microsoft Office Specialist Word 2016 Core (77-725)

ECO375 Tutorial 1 Introduction to Stata

I Launching and Exiting Stata. Stata will ask you if you would like to check for updates. Update now or later, your choice.

Lesson 4: Auditing and Additional Formulas. Return to the FastCourse Excel 2007 Level 3 book page

Base and Advance SAS

Interfacing with MS Office Conference 2017

PAM 4280/ECON 3710: The Economics of Risky Health Behaviors Fall 2015 Professor John Cawley TA Christine Coyer. Stata Basics for PAM 4280/ECON 3710

Stata Session 2. Tarjei Havnes. University of Oslo. Statistics Norway. ECON 4136, UiO, 2012

Outline. Mixed models in R using the lme4 package Part 1: Introduction to R. Following the operations on the slides

Advanced Stata Skills

Introduction to Programming in Stata

Introduction to Stata

Recoding and Labeling Variables

Empirical Asset Pricing

Consolidate and Summarizing Data from Multiple Worksheets

ECONOMICS 351* -- Stata 10 Tutorial 1. Stata 10 Tutorial 1

Empirical Research: Getting Started with Stata

COURSE CONTENT Excel with VBA Training

Programming with Stata

Using Subroutines and Loops to Simplify Submissions Involving Several Datasets

0.1 Stata Program 50 /********-*********-*********-*********-*********-*********-*********/ 31 /* Obtain Data - Populate Source Folder */

5. Exercise 5 Produce publication-quality tables of summary statistics and regression results

A Practical Introduction to Stata

Data Should Not be a Four Letter Word Microsoft Excel QUICK TOUR

The Stata Bible 2.0. Table of Contents. by Dawn L. Teele 1

A Short Guide to Stata 14

Stata basics (Stata 13 or 14)

STATA Hand Out 1. STATA's latest version is version 12. Most commands in this hand-out work on all versions of STATA.

Getting Started Guide

COMM 205 MIDTERM REVIEW SESSION BY KEVIN DHIR

EXAMPLE 10: PART I OFFICIAL GEOGRAPHICAL IDENTIFIERS IN THE UNDERSTANDING SOCIETY PART II LINKING MACRO-LEVEL DATA AT THE LSOA LEVEL

CLAREMONT MCKENNA COLLEGE. Fletcher Jones Student Peer to Peer Technology Training Program. Basic Statistics using Stata

Stata v 12 Illustration. First Session

Sage 50 U.S. Edition Intelligence Reporting Getting Started Guide

Excel 2016: Core Data Analysis, Manipulation, and Presentation; Exam

Introduc)on to Stata. Training Workshop on the Commitment to Equity Methodology CEQ Ins;tute and The Ministry of Finance Accra February 7-10, 2017

Transcription:

Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis" June 2013 Bangkok, Thailand Cosimo Beverelli and Rainer Lanz (World Trade Organization) 1

Introduction to STATA 2

Content a. Datasets used in Introduction to Stata b. Resources c. Importing data into Stata d. Do files e. Commands for variable s management and descriptive statistics f. Macros g. Loops 3

a. Datasets used in Introduction to Stata To apply some of the Stata commands described in this presentation, we will use two datasets: WDI.dta - a very small subset of the World Development indicators WB_ES.dta derived from the World Bank Enterprise Surveys You can find the datasets in the directory: Stata_material\data\IntroductionStata\ 4

b. Resources Stata help and Stata manual A variety of books covering Stata exist Web resources: Germán Rodríguez s webpage Data management, graphics and programming UCLA IDRES webpage Very comprehensive covering all sorts of topics (data management, analysis, ) with many examples FAQ Statalist Typically accessed via a google search 5

c. Importing data into STATA insheet using filename, clear delimit( ; ) names Typically used for text files that are either comma or tab-separated Rarely used alternatives: infix (fixed-column format); infile (free format) Stata12: import excel using filename, sheet( Ex1 ) first Reads excel files directly into Stata Allows to specifiy variables, cellrange and worksheet to import StatTransfer Specialised software to transfer data Copy paste Sometimes the most efficient way, e.g. when you do not want to write a do file To watch out: the accuracy of copied numbers depends on 1. how data are formatted in excel, i.e. how many digits are shown, and 2. your settings in Stata(use set type double before copying) 6

d. Do files If you work with STATA, (almost) always use do files E.g. one do file for creating your master dataset and one do file for regressions Do files can also be used to set globals and directories or to run a series of different do files after each other Typical commands at beginning of each do file: clear all /* removes all data */ set more off, perm /* prevents Stata to pause while runnning a do file */ capture log close /* closes a log file */ cd directory /* sets the directory, e.g. C:\Research\data\ */ log using filename, replace /* useful for long do files, allows printing */ capture log close /* at the end of a do file that is logged */ use dataset.dta, replace /* open dataset; are not necessarily needed */ 7

e. Commands for variable s management and descriptive statistics generate newvar=exp [if] Creates a new variable replace oldvar=exp [if] Replaces an existing variable rename old_varname new_varname Renames variable; alternative: renvars varlist To drop or keep variables you can use drop varlist or keep varlist To drop or keep observations you can use drop if or keep if 8

e. Commands for variable s management and descriptive statistics (ct d) describe Provides information on dataset (#obs, #vars, size) and on variables (type, labels) sum(marize) varlist Provides #obs, mean, std. dev., min., max tab(ulate) var1 var 2 Provides one- or two-way tables of frequencies tab cou sector Allows the creation of dummy variables with the option generate() table rowvar (colvar), content() Provides frequencies by default. The option contents allows for other statistics table cou sector, content(mean sales sum d_exp) by cou: table sector, content(mean sales sum d_exp) tabstat varlist, statistics() by() Another command to calculate summary statistics tabstat sales if cou=="usa", by(sector) 9

e. Commands for variable s management and descriptive statistics (ct d) Commands to identify missings inspect varlist e.g. inspect cou codebook varlist e.g. codebook cou duplicates (report/drop/tag/list) varlist Reports, drops, tags or lists observations that are identical in all variables or identical in the variables specified by varlist unique varlist Reports the number of unique values for varlist 10

e. Commands for variable s management and descriptive statistics (ct d) egen newvar=function(varlist or other argument) Often used command to create new variables, see Stata help Often used egen functions: bysort cou sector: egen sales_sec=total(sales), missing bysort cou sector: egen sales_sec=mean(sales) egen exp_tot=rowtotal(exp_intermediate exp_final) egen id_cluster=group(cou sector) egen cou_sec=concat(cou sector) Further functions include: max, min, count, tag, 11

e. Commands for variable s management and descriptive statistics (ct d) collapse (mean) varlist (sum) varlist, by(varlist) Creates an aggregate dataset by e.g. averaging or summing variables across the dimension identified in by() All observations not included in the command are dropped Useful in analysis when moving to a higher level of aggregation, e.g. aggregating trade flows from HS 6-digit to HS 2-digit Useful for calculating descriptive statistics before exporting them to excel using outsheet or export excel egen can be used to create aggregates within the disaggregated dataset bysort cou sector: egen sales_sec=total(sales), missing duplicates drop after egen gives the same results as collapse keep cou sector sales_sec duplicates drop 12

e. Commands for variable s management and descriptive statistics (ct d) destring varlist, replace force Converts a string variable to a numeric variable Useful when numbers are imported as string into Stata The other way round numeric to string: tostring varlist, gen(newvar) String functions: generate newvar =function() Allow to manipulate string variables. See Stata help. Some useful functions are: abbrev () shortens the string the number of indicated characters length() returns the length of the string, i.e. number of characters subinstr() allows to replace or delete particular substrings substr() allows to extract substrings based on its position upper (lower) Changes the entire string to upper-case (lower-case) strings trim() removes leading and trailing blanks of the string 13

e. Commands for variable s management and descriptive statistics (ct d) reshape wide (long) stub, i() j() options Reshapes dataset from long to wide format and vice versa Data dimensions such as country, year or sector are normally put in long format stub are variables in reshape wide and stubs of variables in reshape long i() are identifying dimensions; j() dimension to change Exercise: Open WDI.dta and reshape it first long and then wide To merge datasets use either merge or joinby merge (1:1,m:1,1:m,m:m) varlist using filename, update keepusing(varlist) joinby varlist using filename, unmatched(both) update Merge is used to add further variables to observations in the master data Joinby forms all pairwise combination for varlist Exercise: Open WB_ES.dta and merge it with WTI.dta 14

f. Macros See Stata help and Germán Rodríguez s webpage Macros are names associated with some text The commands global and local assign strings to global and local macro names global mname [=exp :extended_fcn [`]"[string]"['] ] Global macros, once defined, are available anywhere in Stata local lclname [=exp :extended_fcn [`]"[string]"['] ] Simplest example: local c USA JPN Local macros work only within the do file in which they are defined Globals and locals have a variety of uses To define the directories for this class, i.e. directory_definition.do They are used in loops (see next slides) A set of explanatory variables can be grouped under one macro name 15

g. Loops See Stata help and Germán Rodríguez s webpage Two main commands: foreach and forvalues Syntax: foreach lname {in of listtype} list { commands referring to `lname } forvalues lname = range { commands referring to `lname' } 16

g. Loops (ct d) Examples for loops in WB_ES.dta: foreach k in USA JPN { /* Loop over any_list */ egen sales_`k'=total(sales) if cou=="`k'" } vallist cou, local(c) /* vallist shows values and creates local */ foreach k of local c { /* Loop over a local macro */ capture drop sales_`k' egen sales_`k'=total(sales) if cou=="`k'" } forvalues k=1(1)3 { /* Loop over sector codes */ egen total_`k'=total(sales) if sector==`k' } Foreach can also be used to loop over variables and numbers foreach k of var varlist; foreach k of num numlist 17