Workshop for empirical trade analysis. December 2015 Bangkok, Thailand

Similar documents
Empirical trade analysis

Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis"

Revision of Stata basics in STATA 11:

Data analysis using Stata , AMSE Master (M1), Spring semester

A quick introduction to STATA

An Introduction to Stata Part I: Data Management

Subject index. ASCII data, reading comma-separated fixed column multiple lines per observation

Introduction to Stata - Session 1

Getting started with Stata 2017: Cheat-sheet

Introduction to STATA

You will learn: The structure of the Stata interface How to open files in Stata How to modify variable and value labels How to manipulate variables

A quick introduction to STATA:

Week 1: Introduction to Stata

Introduction to Stata. Getting Started. This is the simple command syntax in Stata and more conditions can be added as shown in the examples.

A Short Guide to Stata 10 for Windows

OVERVIEW OF WINDOWS IN STATA

Introduction to data analysis using STATA. Miguel Niño-Zarazúa World Institute for Development Economics Research United Nations University

GETTING DATA INTO THE PROGRAM

Intermediate Stata. Jeremy Craig Green. 1 March /29/2011 1

A Quick Guide to Stata 8 for Windows

Intro to Stata for Political Scientists

Introduction to Stata - Session 2

An Introduction to Stata Part II: Data Analysis

ECO375 Tutorial 1 Introduction to Stata

StatLab Workshops 2008

STATA TUTORIAL B. Rabin with modifications by T. Marsh

A quick introduction to STATA:

STATA 13 INTRODUCTION

Results Based Financing for Health Impact Evaluation Workshop Tunis, Tunisia October Stata 2. Willa Friedman

ICSSR Data Service. Stata: User Guide. Indian Council of Social Science Research. Indian Social Science Data Repository

Use data on individual respondents from the first 17 waves of the British Household

SIDM3: Combining and restructuring datasets; creating summary data across repeated measures or across groups

Preparing Data for Analysis in Stata

Getting Our Feet Wet with Stata SESSION TWO Fall, 2018

Stata: A Brief Introduction Biostatistics

An Introduction to Stata Exercise 1

GETTING STARTED WITH STATA. Sébastien Fontenay ECON - IRES

A QUICK INTRODUCTION TO STATA

An Introduction to Stata

An Introduction To Stata and Matlab. Liugang Sheng ECN 240A UC Davis

Dr. Barbara Morgan Quantitative Methods

What is Stata? A programming language to do sta;s;cs Strongly influenced by economists Open source, sort of. An acceptable way to manage data

Formulas, LookUp Tables and PivotTables Prepared for Aero Controlex

Introduction to Stata

Saving Time with Prudent Data Management. Basic Principles To Save Time. Introducing Our Data. Yearly Directed Dyads (Multiple Panel Data)

Introduction to Stata. Written by Yi-Chi Chen

Introduction to gretl

Stata v 12 Illustration. First Session

An Introductory Guide to Stata

Interfacing with MS Office Conference 2017

Introduction to Stata: An In-class Tutorial

Basic Stata Tutorial

Lab 1: Basics of Stata Short Course on Poverty & Development for Nordic Ph.D. Students University of Copenhagen June 13-23, 2000

Introduction to Stata Toy Program #1 Basic Descriptives

Empirical Asset Pricing

STATA Tutorial. Elena Capatina Office hours: Mondays 10am-12, SS5017

The Stata Bible 2.0. Table of Contents. by Dawn L. Teele 1

Opening a Data File in SPSS. Defining Variables in SPSS

Microsoft Access XP Queries. Student Manual

A Practical Introduction to Stata

I Launching and Exiting Stata. Stata will ask you if you would like to check for updates. Update now or later, your choice.

Introduction to STATA

Consolidate and Summarizing Data from Multiple Worksheets

Chapter 3: Data Description Calculate Mean, Median, Mode, Range, Variation, Standard Deviation, Quartiles, standard scores; construct Boxplots.

Stata version 13. First Session. January I- Launching and Exiting Stata Launching Stata Exiting Stata..

Introduction to Stata First Session. I- Launching and Exiting Stata Launching Stata Exiting Stata..

Lab 3 - Introduction to Graphics

ECONOMICS 452* -- Stata 12 Tutorial 1. Stata 12 Tutorial 1. TOPIC: Getting Started with Stata: An Introduction or Review

STATA Version 9 10/05/2012 1

A Short Guide to Stata 14

INTRODUCTION to. Program in Statistics and Methodology (PRISM) Daniel Blake & Benjamin Jones January 15, 2010

Why should you become a Stata programmer?

Data Management 2. 1 Introduction. 2 Do-files. 2.1 Ado-files and Do-files

Economics 145 Fall 2009 Howell Getting Started with Stata

Sacha Kapoor - Masters Metrics

Thanks to Petia Petrova and Vince Wiggins for their comments on this draft.

Excel Second Edition.

Sustainability of Public Policy Lecture 1 Introduc6on STATA. Rossella Iraci Capuccinello

Appendix II: STATA Preliminary

Advanced Regression Analysis Autumn Stata 6.0 For Dummies

To complete this workbook, you will need the following file:

Intro to Stata. University of Virginia Library data.library.virginia.edu. September 16, 2014

RegressItPC installation and test instructions 1

0.1 Stata Program 50 /********-*********-*********-*********-*********-*********-*********/ 31 /* Obtain Data - Populate Source Folder */

DOING MORE WITH EXCEL: MICROSOFT OFFICE 2013

After opening Stata for the first time: set scheme s1mono, permanently

Basic tasks in Excel 2013

Introduction to Stata

Introduc)on to Stata. Training Workshop on the Commitment to Equity Methodology CEQ Ins;tute and The Ministry of Finance Accra February 7-10, 2017

Microsoft How to Series

Data Should Not be a Four Letter Word Microsoft Excel QUICK TOUR

Community Resource: Egenmore, by command, return lists, and system variables. Beksahn Jang Feb 22 nd, 2016 SOC561

COMP1000 / Spreadsheets Week 2 Review

BIOSTATISTICS LABORATORY PART 1: INTRODUCTION TO DATA ANALYIS WITH STATA: EXPLORING AND SUMMARIZING DATA

General Improvements with GainSeeker versions 8.8 and 8.8.1

A Short Introduction to STATA

Advanced Stata Skills

Title. Description. Quick start. stata.com. misstable Tabulate missing values

Migration and the Labour Market. STATA: An Introduction into the Basics. Dr. Ehsan Vallizadeh. Bamberg, May 17, 2018

Surviving Stata Workshop

Transcription:

Workshop for empirical trade analysis December 2015 Bangkok, Thailand Cosimo Beverelli (WTO) Rainer Lanz (WTO)

Content a. Resources b. Stata windows c. Organization of the Bangkok_Dec_2015\Stata folder d. The directory_definition do file e. Datasets used in this introduction to Stata f. Do files g. Importing data into Stata h. Basic commands i. Merging datasets j. Macros k. Loops l. Graphics 2

a. Resources 1. Stata help and Stata manual 2. A variety of books covering Stata exist Web resources: 1. Germán Rodríguez s webpage Data management, graphics and programming 2. UCLA IDRES webpage Very comprehensive, covering all sorts of topics (data management, analysis, ) with several examples FAQ 3. Statalist Typically accessed via a google search 3

b. Stata windows 4

Type commands here

Type commands here List of variables and labels

List of variables and labels Type commands here Some properties of the variables

Actions taken in the session List of variables and labels Type commands here Some properties of the variables

c. Organization of the Bangkok_December_2015\Stata folder The Bangkok_December_2015\Stata folder contains the following subfolders: data do_files results Do not change the name of the folder or of the sub-folders The first thing to do is to define your own working directory (see slide d.) 6

d. The directory_definition do file To make things easy for all of us, we define a path for the working directory that can be easily changed, and will be changed only once and for all When you open Stata, the first thing to do is to open the directory_definition do-file in the do-file editor Click here: or Ctrl+9 The directory_definition looks like this and you have to change it to your own path 7

e. Datasets and do files used in this introduction to Stata To apply some of the Stata commands described in this presentation, we will use two datasets: WDI.csv a very small subset of the World Development indicators WB_ES.xls derived from the World Bank Enterprise Surveys You can find the datasets in the data directory: "$ZAF\data\Introduction_Stata" There are also 4 do files 01_data_intro_stata 02_descriptive 03_reshape_merge 04_loops You can find the do files in the directory: "$ZAF\do_files\Introduction_Stata 8

f. Do files A do file is a set of Stata commands typed in a plain text file When you work with STATA, always use do files e.g. one do file for creating your master dataset and one do file for regressions Do files can also be used to set globals and directories or to run a series of different do files after each other 9

Typical commands at beginning of each do file: clear all /* removes all data in the current Stata session */ set more off, perm /* prevents Stata to pause while running a do file */ capture log close /* closes a log file */ cd directory /* sets the directory, e.g.. $ZAF\data */ log using filename, replace /* useful for long do files */ use dataset.dta, replace /* open dataset in Stata format (.dta); are not needed */ 10

Typical commands at beginning of each do file: clear all /* removes all data in the current Stata session */ set more off, perm /* prevents Stata to pause while running a do file */ capture log close /* closes a log file */ cd directory /* sets the directory, e.g.. $ZAF\data */ log using filename, replace /* useful for long do files */ use dataset.dta, replace /* open dataset in Stata format (.dta); are not needed */ Notes * treats everything after it in a line as a comment /* text */ will make Stata treat text as a comment ( text can span over several lines) 10

g. Importing data into Stata insheet using filename.csv, clear Typically used for text files that are either comma or tab-separated import excel using filename.xls, sheet( Sheet1 ) first clear Reads excel files directly into Stata Allows to specify variables, cell range and worksheet to import Copy paste is strongly discouraged. Watch out. The accuracy of copied numbers depends on: How data are formatted in excel, i.e. how many digits are shown Your settings in Stata (use set type double before copying) To save the dataset (in Stata format): save filename, replace Exercise: execute the do file 01_data_into_stata.do 11

h. Basic commands Installing packages ssc install (or findit) packagename Identify missing values (represented by. (dot) or empty cell) inspect varlist (codebook varlist) Identify duplicate observations duplicates (report/drop/tag/list) varlist Identify number of unique values unique varlist (also, codebook for single variable) Browse the dataset browse varlist 12

Other basic commands describe generate destring/tostring replace rename Alternative: renvars keep drop The list can go on what is important is to keep in mind that, in case of doubt, you can always use the help : help command 13

List of useful operators commonly used in expressions Arithmetic Logical Relational + add! not (also ~) == equal - subtract or!= not equal (also ~=) * multiply & and < less than / divide <= less than or equal ^ raise to power > greater than + string concatenation >= greater than or equal 14

Commands for descriptive statistics summarize varlist tabulate var1 var 2 table rowvar (colvar), content() tabstat varlist, statistics() by() 15

The egen command Often used command to create new variables Commonly used egen functions (refer to WB_ES dataset): bysort cou sector: egen sales_sec=total(sales), missing bysort cou sector: egen sales_sec=mean(sales) egen exp_tot=rowtotal(exp_intermediate exp_final) egen id_cluster=group(cou sector) egen cou_sec=concat(cou sector) Further functions include: max, min, count, tag, 16

String functions generate newvar =function() Some useful functions are: abbrev() > shortens the string the number of indicated characters length() > returns the number of characters of the string subinstr() > allows to replace or delete particular substrings substr() > allows to extract substrings based on its position upper (lower) > Changes the entire string to uppercase (lower-case) strings trim() > Removes leading and trailing blanks of the string 17

The collapse command collapse (mean) varlist (sum) varlist, by(varlist) Creates an aggregate dataset by e.g. averaging or summing variables across the dimension identified in by() All observations not included in the command are dropped Useful in analysis when moving to a higher level of aggregation, e.g. aggregating trade flows from HS 6-digit to HS 2-digit Useful for calculating descriptive statistics before exporting them to excel using export excel 18

The collapse command (ct d) 19

The collapse command (ct d) If you do not want to collapse, duplicates drop after bysort (): egen gives the same results as collapse. Example (see 02_descriptive.do): collapse (mean) sales (sum) dummy_exp, by(cou sector) is equivalent to: bysort cou sector: egen avg_sales = mean(sales) bysort cou sector: egen number_exporters = total(dummy_exp) keep cou sector avg_sales number_exporters duplicates drop 19

The reshape command reshape wide (long) stub, i(var) j(var) options Reshapes dataset from long to wide format and vice versa Data dimensions such as country, year or sector are normally put in long format stub are variables in reshape wide and stubs of variables in reshape long i(var) are identifying dimensions; j(var) dimension to change Exercise: Open WDI.dta and reshape it first long and then wide (see 03_reshape_merge.do) Long i j stub 1 1 4.1 1 2 4.5 2 1 3.3 Reshape Wide i stub1 stub2 1 4.1 4.5 2 3.3 3.0 2 2 3.0 20

i. Merging datasets To merge datasets you can use joinby or merge joinby varlist using filename, unmatched(both) The command forms all pairwise combination for varlist unmatched can keep unmatched observations from the master dataset, the using dataset or both (see generated variable _merge) Exercise: Open WB_ES.dta and merge it with WDI.dta (see 03_reshape_merge.do) 21

Exercises Execute 02_descriptive.do Execute 03_reshape_merge.do (opens WB_ES.dta and merges it with WDI.dta ) 22

j. Macros Macros are names associated with some text The commands global and local assign strings to global and local macro names Globals and locals have a variety of uses To define the directories for this class, i.e. directory_definition.do They are used in loops (see next slides) A set of explanatory variables can be grouped under one macro name Global macros, once defined, are available anywhere in Stata Local macros are only available within the selected lines of a do file 23

k. Loops See Stata help and Germán Rodríguez s webpage Two main commands: foreach and forvalues foreach loops through strings of text, forvalues loops through numbers Syntax (3 alternatives): foreach item in a-list-of-things (e.g. a b d) { commands referring `item } foreach varname of varlist list-of-variables { commands referring to `varname } forvalues number = first(step)last { commands referring to `number' } 24

Examples for loops in WB_ES.dta Ex1 foreach k in USA JPN { /* Loop over any_list */ egen sales_`k'=total(sales) if cou=="`k'" } Ex2 vallist cou, local(c) /* vallist shows values and creates local */ foreach k of local c { /* Loop over a local macro */ capture drop sales_`k' egen sales_`k'=total(sales) if cou=="`k'" } Ex3 forvalues k=1(1)3 { /* Loop over sector codes. The range can be defined in different ways*/ egen total_`k'=total(sales) if sector==`k' } foreach can also be used to loop over variables and numbers foreach k of var varlist; foreach k of num numlist 25

l. Graphics Useful links: official Stata or UCLA IDRES Histograms and bar graphs: histogram var graph bar (stat) var, over(var) 26

More graphics kdensity lnsales.1.2.3 0 Distribution plots: histogram var, frequency kdensity twoway kdensity var twoway (kdensity var1 if var2== ) (kdensity var1 if var2 == ), by(var) JPN USA Graphs by cou 0 5 10 15 20 0 5 10 15 20 x Non-exporters Exporters 27

More graphics Scatter plots: scatter yvar xvar graph twoway (scatter yvar xvar) (lfit yvar xvar) 28

More graphics Line plots: line yvar year graph twoway (line yvar year if ) (line yvar year if ) 29

Exercise Execute 04_loops.do 30