


Thierry Warin

Setting up Stata

We are going to allocate 10 megabytes to the dataset. You do not want to allocate too much memory to the dataset, because the more memory you allocate to the dataset, the less memory is available to perform the commands. You could slow Stata down or even crash it.

    set mem 10m

We can also decide whether or not Stata pauses with the "more" separation line on the screen when it displays results:

    set more on
    set more off

Setting up a panel

Now we have to tell Stata that we have a panel dataset. We do it with the command tsset, or with iis and tis:

    iis idcode
    tis year

or

    tsset idcode year

In the previous command, idcode is the variable that identifies individuals in our dataset and year is the variable that identifies time periods. This order is always the rule.

The commands referring to panel data in Stata almost always start with the prefix xt. You can check for these commands by calling the help file for xt:

    help xt
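The setup steps can be tried on a public dataset. The sketch below assumes that the variables used in this guide (idcode, year, ln_wage and so on) correspond to Stata's example dataset nlswork; the guide does not name its dataset, so this is only an assumption, and webuse needs an internet connection. In Stata 10 and later, xtset declares a panel in one step, and from Stata 12 onward memory is managed automatically, so set mem is not needed there.

```stata
* Assumption: nlswork stands in for the dataset used in this guide
webuse nlswork, clear

set more off          // do not pause the output at the "more" line

xtset idcode year     // panel variable: idcode; time variable: year
xtset                 // with no arguments, displays the current panel settings
```

Typing xtset without arguments is a quick way to check, later in a session, which variables Stata is treating as the individual and time identifiers.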

You should describe and summarize the dataset as usual before you perform estimations. Stata has specific commands for describing and summarizing panel datasets:

    xtdes
    xtsum

xtdes lets you observe the pattern of the data, such as the number of individuals with each pattern of observations across time periods. In our case, we have an unbalanced panel because not all individuals have observations for all years.

The xtsum command gives you general descriptive statistics of the variables in the dataset, considering the overall, the between and the within variation. Overall refers to the whole dataset. Between refers to the variation of the individual means (each computed across time periods). Within refers to the variation of the deviations from each individual's own mean.

You may be interested in applying the panel data tabulate command to a variable, for instance to the variable south, in order to obtain a one-way table:

    xttab south

As in the previous commands, Stata reports the tabulation for the overall, the within and the between variation.

How to generate variables

Generating variables

    gen age2=age^2
    gen ttl_exp2=ttl_exp^2
    gen tenure2=tenure^2
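The overall, between and within variation reported by xtsum can be rebuilt by hand, which may help to see what each line of its output measures. This is only a sketch: it assumes the example dataset nlswork as a stand-in for the guide's data, and the names mean_i, mean_all and within_dev are hypothetical.

```stata
* Assumption: nlswork stands in for the dataset used in this guide
webuse nlswork, clear
xtset idcode year
xtsum ln_wage

* Between component: one mean per individual, computed across time periods
bysort idcode: egen double mean_i = mean(ln_wage)

* Overall mean of the variable
egen double mean_all = mean(ln_wage)

* Within component: deviation from the individual's own mean,
* recentred on the overall mean so that it stays on the original scale
gen double within_dev = ln_wage - mean_i + mean_all

summarize ln_wage within_dev
```

The mean of within_dev equals the overall mean by construction; only its spread differs, and that spread is what the within line describes.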

Now, let's compute the average wage for each individual (across time periods):

    bysort idcode: egen meanw=mean(ln_wage)

Here we did not first apply the sort command and then the by prefix. We could have done so, but bysort combines both steps in a single command.

The command egen is an extension of the gen command. As a general rule, use egen when the new variable is created with one of Stata's built-in egen functions. In our case, we used the function mean.

You can apply the command list to list the first 10 observations of the new variable meanw:

    list meanw in 1/10

And then apply the xtsum command to summarize the new variable:

    xtsum meanw

You may want to obtain the average of the logarithm of wages for each year in the panel:

    bysort year: egen meanw1=mean(ln_wage)

And then you can apply the xttab command:

    xttab meanw1

Generating dates

Let's generate dates (in Stata 10 and later the mask is written in upper case, "DMY"):

    gen varname2 = date(varname1, "dmy")

And format:

    format varname2 %d

How to generate dummies

Generating general dummies

Let's generate the dummy variable black, which is not in our dataset:

    gen black=1 if race==2
    replace black=0 if black==.

Suppose you want to generate a new variable called tenure1 that is equal to the variable tenure lagged one period. Then you would use a time series operator (l). First, you would need to sort the dataset according to idcode and year, and then generate the new variable with the "by" prefix on the variable idcode:

    sort idcode year
    by idcode: gen tenure1=l.tenure

If you were interested in generating a new variable tenure3 equal to the first difference of the variable tenure, you would use the time series d operator:

    by idcode: gen tenure3=d.tenure

If you would like to generate a new variable tenure4 equal to the second lag of the variable tenure, you would type:

    by idcode: gen tenure4=l2.tenure

The same principle applies to the operator d.

Let's just save our data file with the changes that we made to it.

    save, replace

Another way to generate dummies is the xi command. It takes the items (strings of letters, for instance) of a designated variable (category, for instance) and creates a dummy variable for each item. You can change the base (omitted) category, here to the most prevalent one:

    char _dta[omit] prevalent
    xi i.category
    tabulate category

Generating time dummies

Let's now generate our time dummies. We use the "tabulate" command with the option "gen" in order to generate a time dummy for each year of our dataset. We will name the time dummies "y", and we will get a first time dummy called "y1", which takes the value 1 if year=1980 and 0 otherwise, a second time dummy "y2", which takes the value 1 if year=1982 and 0 otherwise, and similarly for the remaining years. You could give any other name to your time dummies.

    tab year, g(y)
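A quick check that the generated time dummies behave as expected is sketched below. It assumes the example dataset nlswork as a stand-in for the guide's data (its year values may differ from the ones quoted above), and n_dummies is a hypothetical helper variable.

```stata
* Assumption: nlswork stands in for the dataset used in this guide
webuse nlswork, clear

tabulate year, generate(y)     // creates y1, y2, ... one dummy per distinct year
describe y*                    // variable labels show which year each dummy codes

* Each observation belongs to exactly one year, so the dummies should sum to 1
egen byte n_dummies = rowtotal(y*)
tabulate n_dummies
```

In Stata 11 and later, factor-variable notation such as i.year inside an estimation command gives the same set of dummies without creating new variables.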

Running OLS regressions

Let's now turn to estimation commands for panel data. The first type of regression that you may run is a pooled OLS regression, which is simply an OLS regression applied to the whole dataset. This regression ignores the fact that you observe different individuals across time periods, so it does not account for the panel nature of the dataset.

    reg ln_wage grade age* ttl_exp tenure black not_smsa south

In the previous command, you do not need to type age and age2 separately. You just need to type age*. The wildcard instructs Stata to include in the regression all the variables whose names start with age.

Suppose you want to inspect the results that Stata saves internally after the last estimation. This is valid for any regression that you perform. To see them, you would type:

    ereturn list

If you want to control for some categories:

    xi: reg dependent ind1 ind2 i.category1 i.category2 i.time

Let's perform a regression where only the variation of the means across individuals is considered. This is the between regression.

    xtreg ln_wage grade age ttl_exp tenure black not_smsa south, be
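The pooled and between regressions, together with the saved results, can be run end to end as sketched below. The sketch assumes the example dataset nlswork as a stand-in for the guide's data, in which race==2 codes black respondents, and it builds the black dummy in a single step.

```stata
* Assumption: nlswork stands in for the dataset used in this guide
webuse nlswork, clear
xtset idcode year

* Dummy equal to 1 if race==2, 0 otherwise, missing if race is missing
gen byte black = (race == 2) if !missing(race)

* Pooled OLS: ignores the panel structure
regress ln_wage grade age ttl_exp tenure black not_smsa south

* Saved results of the last estimation
ereturn list
display "observations: " e(N) "   R-squared: " e(r2)
matrix list e(b)               // coefficient vector

* Between regression: uses only the individual means
xtreg ln_wage grade age ttl_exp tenure black not_smsa south, be
```

Individual saved results such as e(N) or e(b) can be reused in later commands, for example to build a table of results across models.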

Running Panel regressions

In empirical work with panel data, you always have to choose between two alternative regressions: fixed effects (or within, or least squares dummy variables - LSDV) estimation and random effects (or feasible generalized least squares - FGLS) estimation.

In panel data, in the two-way model, the error term is the sum of three components:

1. a specific individual effect,
2. a specific time effect,
3. and an additional idiosyncratic term.

In the one-way model, the error term contains, besides the idiosyncratic term, only one specific effect:

1. a specific individual effect.

It is absolutely fundamental that the error term is not correlated with the independent variables. If there is no correlation, then the random effects model should be used because it is a weighted average of the between and within estimations. But if there is correlation between the individual and/or time effects and the independent variables, then the individual and time effects must be estimated as dummy variables (fixed effects model) in order to solve the endogeneity problem.

The fixed effects (or within) regression is an OLS regression of the form:

    (yit - yi. - y.t + y..) = (xit - xi. - x.t + x..)b + (vit - vi. - v.t + v..)

where yi., xi. and vi. are the means of the respective variables (and the error) within each individual across time, y.t, x.t and v.t are the means of the respective variables (and the error) within each time period across individuals, and y.., x.. and v.. are the overall means of the respective variables (and the error).

Choosing between Fixed effects and Random effects? The Hausman test

The generally accepted way of choosing between fixed and random effects is to run a Hausman test.

Statistically, fixed effects are always a reasonable thing to do with panel data (they always give consistent results), but they may not be the most efficient model to run. Random effects will give you better P-values because they are a more efficient estimator, so you should run random effects if it is statistically justifiable to do so.
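For reference, the statistic behind the Hausman test can be written compactly. The notation below is not from this guide and is introduced only for illustration: \hat{\beta}_{FE} and \hat{\beta}_{RE} are the fixed and random effects coefficient vectors for the regressors common to both models, \hat{V}_{FE} and \hat{V}_{RE} are their estimated covariance matrices, and k is the number of coefficients compared.

```latex
H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})'
    \left[ \hat{V}_{FE} - \hat{V}_{RE} \right]^{-1}
    (\hat{\beta}_{FE} - \hat{\beta}_{RE})
\;\sim\; \chi^2_{k}
\quad \text{under the null hypothesis that both estimators are consistent.}
```

A large value of H means that the two sets of coefficients differ by more than sampling error would explain, which speaks against random effects.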

The Hausman test checks a more efficient model against a less efficient but consistent model to make sure that the more efficient model also gives consistent results.

To run a Hausman test comparing fixed with random effects in Stata, you first estimate the fixed effects model, save the coefficients so that you can compare them with the results of the next model, estimate the random effects model, and then do the comparison.

1. xtreg dependentvar independentvar1 independentvar2..., fe
2. estimates store fixed
3. xtreg dependentvar independentvar1 independentvar2..., re
4. estimates store random
5. hausman fixed random

The Hausman test tests the null hypothesis that the coefficients estimated by the efficient random effects estimator are the same as the ones estimated by the consistent fixed effects estimator. If the difference is insignificant (P-value, Prob>chi2, larger than .05), then it is safe to use random effects. If you get a significant P-value, however, you should use fixed effects.

If you want a fixed effects model with robust standard errors, you can use the following command:

    areg ln_wage grade age ttl_exp tenure black not_smsa south, absorb(idcode) robust

You may be interested in running a maximum likelihood estimation in panel data. You would type:

    xtreg ln_wage grade age ttl_exp tenure black not_smsa south, mle

If you qualify for a fixed effects model, should you include time effects?

Another important question when you are doing empirical work with panel data is whether or not to include time effects (time dummies) in your fixed effects model.

In order to perform the test for the inclusion of time dummies in our fixed effects regression,

1. first we run fixed effects including the time dummies. In the next fixed effects regression, the time dummies are abbreviated to "y*" (see Generating time dummies), but you could type them all if you prefer.

    xtreg ln_wage grade age ttl_exp tenure black not_smsa south y*, fe

2. Second, we apply the "testparm" command. It is the test for time dummies, which assumes the null hypothesis that the time dummies are not jointly significant.

    testparm y*

3. We reject the null hypothesis that the time dummies are not jointly significant if the p-value is smaller than 10%, and as a consequence our fixed effects regression should include time effects.

Fixed effects or random effects when time dummies are involved: a test

What if the inclusion of time dummies in our regression permitted us to use a random effects model for the individual effects? (This question is not usually considered in typical empirical work; the purpose here is to show you an additional test for random effects in panel data.)

1. First, we will run a random effects regression including our time dummies,

    xtreg ln_wage grade age ttl_exp tenure black not_smsa south y*, re

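2. and then we will apply the "xttest0" command, the Breusch and Pagan Lagrange multiplier test for random effects. Its null hypothesis is that the variance of the individual effects is zero, that is, that there are no random effects.

    xttest0

3. The null hypothesis is again rejected if the p-value is smaller than 10%. Rejection indicates that individual effects are present, so pooled OLS is not appropriate; the choice between fixed and random effects (with time effects) is then made with the Hausman test.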
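The two tests above can be run in sequence as sketched below. The sketch assumes the example dataset nlswork as a stand-in for the guide's data and uses factor-variable notation (i.year, Stata 11 and later) in place of the generated y dummies; the list of regressors is shortened for illustration.

```stata
* Assumption: nlswork stands in for the dataset used in this guide
webuse nlswork, clear
xtset idcode year

* Fixed effects with year dummies, then the joint test of the dummies
xtreg ln_wage age ttl_exp tenure not_smsa south i.year, fe
testparm i.year
display "p-value, joint test of the year dummies: " r(p)

* Random effects with year dummies, then the Breusch and Pagan LM test
xtreg ln_wage age ttl_exp tenure not_smsa south i.year, re
xttest0
```

A small p-value from testparm supports keeping the year dummies; a small p-value from xttest0 indicates that the variance of the individual effects is not zero.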

GMM estimations

Two additional commands that are very useful in empirical work are the Arellano and Bond estimator (GMM estimator) and the Arellano and Bover estimator (system GMM). Both commands let you deal with dynamic panels (where you want to use lags of the dependent variable as independent variables) as well as with problems of endogeneity. You may want to have a look at them.

The commands are respectively "xtabond" and "xtabond2". "xtabond" is a built-in command in Stata, so in order to check how it works, just type:

    help xtabond

"xtabond2" is not a built-in command in Stata. If you want to look at it, you must first get it from the net (this is another feature of Stata: you can always get additional commands from the net). You type the following:

    findit xtabond2

The next steps to install the command should be obvious.

How does it work?

The xtabond2 command estimates dynamic models either with the GMM estimator in difference or with the GMM estimator in system.

    xtabond2 dep_variable ind_variables (if, in), noleveleq gmm(list1, options1) iv(list2, options2) two robust small

1. When noleveleq is specified, the GMM estimator in difference is used. Otherwise, if noleveleq is not specified, the GMM estimator in system is used.

2. gmm(list1, options1): list1 is the list of the non-exogenous independent variables, and options1 may take the following values: lag(a b), eq(diff), eq(level), eq(both) and collapse.

   o lag(a b) means that for the equation in difference, the lagged values (in level) of each variable from list1, dated from t-a to t-b, will be used as instruments, whereas for the equation in level, the first differences dated t-a+1 will be used as instruments. If b=., it means b is infinite. By default, a=1 and b=. .
     Example: gmm(x y, lag(2 .)): all the lagged values of x and y, lagged by at least two periods, will be used as instruments.
     Example 2: gmm(x, lag(1 2)) gmm(y, lag(2 3)): for variable x, the values lagged by one and two periods will be used as instruments, whereas for variable y, the values lagged by two and three periods will be used as instruments.

   o Options eq(diff), eq(level) or eq(both) mean that the instruments must be used respectively for the equation in first difference, the equation in level, or for both. By default, the option is eq(both).

   o Option collapse reduces the size of the instrument matrix and helps to prevent the overestimation bias in small samples when the number of instruments is close to the number of observations. But it reduces the statistical efficiency of the estimator in large samples.

3. iv(list2, options2): list2 is the list of variables that are strictly exogenous, and options2 may take the following values: eq(diff), eq(level), eq(both), pass and mz.

   o eq(diff), eq(level) and eq(both): see above.

   o By default, the exogenous variables are differenced to serve as instruments in the equations in first difference, and are used undifferenced to serve as instruments in the equations in level. The pass option prevents the exogenous variables from being differenced when they serve as instruments in the equations in first difference.
     Example: gmm(z, eq(level)) gmm(x, eq(diff) pass) allows variable x in level to be used as an instrument in the equation in level as well as in the equation in difference.

   o Option mz replaces the missing values of the exogenous variables by zero, which allows the regression to include the observations whose data on exogenous variables are missing. This option impacts the coefficients only if the variables are exogenous.

4. Option two: this option specifies the use of the GMM estimation in two steps. Although this two-step estimation is asymptotically more efficient, it leads to biased results. To fix this issue, the xtabond2 command applies a finite-sample correction to the covariance matrix. So far, there is no test to know whether the one-step GMM estimator or the two-step GMM estimator should be used.

5. Option robust: this option corrects the t-test for heteroscedasticity.

6. Option small: this option replaces the z-statistics by the t-test results.
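The options above can be put together in a short run. The sketch below is an illustration under several assumptions, none of which come from this guide: it uses Stata's example dataset abdata (variables id, year, n, w, k and the year dummies yr*), it installs xtabond2 from SSC, and it spells the options out in full (gmmstyle, laglimits, ivstyle, twostep) rather than in abbreviated form. The choice of regressors and instruments is arbitrary.

```stata
* One-time installation of the user-written command (needs internet access)
* ssc install xtabond2

* Assumption: abdata is used here only as a convenient example panel
webuse abdata, clear
xtset id year

* Difference GMM (noleveleq), two-step, with the finite-sample correction
* and small-sample t statistics:
*   - L.n is treated as non-exogenous and instrumented with its own lags
*   - w, k and the year dummies are treated as strictly exogenous
xtabond2 n L.n w k yr*, ///
    gmmstyle(L.n, laglimits(1 .)) ///
    ivstyle(w k yr*) ///
    noleveleq twostep robust small
```

Dropping noleveleq from the command switches to the system GMM estimator, and adding collapse inside gmmstyle() reduces the number of instruments.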



STATA Tutorial. Introduction to Econometrics. by James H. Stock and Mark W. Watson. to Accompany

STATA Tutorial. Introduction to Econometrics. by James H. Stock and Mark W. Watson. to Accompany STATA Tutorial to Accompany Introduction to Econometrics by James H. Stock and Mark W. Watson STATA Tutorial to accompany Stock/Watson Introduction to Econometrics Copyright 2003 Pearson Education Inc.

More information

Intro to E-Views. E-views is a statistical package useful for cross sectional, time series and panel data statistical analysis.

Intro to E-Views. E-views is a statistical package useful for cross sectional, time series and panel data statistical analysis. Center for Teaching, Research & Learning Research Support Group at the CTRL Lab American University, Washington, D.C. 202-885-3862 Intro to E-Views E-views is a statistical

More information

Week 11: Interpretation plus

Week 11: Interpretation plus Week 11: Interpretation plus Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 c 2017 PERRAILLON ARR 1 Outline A bit of a patchwork

More information

CSE446: Linear Regression. Spring 2017

CSE446: Linear Regression. Spring 2017 CSE446: Linear Regression Spring 2017 Ali Farhadi Slides adapted from Carlos Guestrin and Luke Zettlemoyer Prediction of continuous variables Billionaire says: Wait, that s not what I meant! You say: Chill

More information

Panel Data 4: Fixed Effects vs Random Effects Models

Panel Data 4: Fixed Effects vs Random Effects Models Panel Data 4: Fixed Effects vs Random Effects Models Richard Williams, University of Notre Dame, Last revised April 4, 2017 These notes borrow very heavily, sometimes verbatim,

More information

Department of Economics Spring 2018 University of California Economics 154 Professor Martha Olney Stata Lesson Thursday February 15, 2018

Department of Economics Spring 2018 University of California Economics 154 Professor Martha Olney Stata Lesson Thursday February 15, 2018 University of California Economics 154 Berkeley Professor Martha Olney Stata Lesson Thursday February 15, 2018 [1] Where to find the data sets There

More information

Empirical trade analysis

Empirical trade analysis Empirical trade analysis Introduction to Stata Cosimo Beverelli World Trade Organization Cosimo Beverelli Stata introduction Bangkok, 18-21 Dec 2017 1 / 23 Outline 1 Resources 2 How Stata looks like 3

More information

Stata Session 2. Tarjei Havnes. University of Oslo. Statistics Norway. ECON 4136, UiO, 2012

Stata Session 2. Tarjei Havnes. University of Oslo. Statistics Norway. ECON 4136, UiO, 2012 Stata Session 2 Tarjei Havnes 1 ESOP and Department of Economics University of Oslo 2 Research department Statistics Norway ECON 4136, UiO, 2012 Tarjei Havnes (University of Oslo) Stata Session 2 ECON

More information

GRETL FOR TODDLERS!! CONTENTS. 1. Access to the econometric software A new data set: An existent data set: 3

GRETL FOR TODDLERS!! CONTENTS. 1. Access to the econometric software A new data set: An existent data set: 3 GRETL FOR TODDLERS!! JAVIER FERNÁNDEZ-MACHO CONTENTS 1. Access to the econometric software 3 2. Loading and saving data: the File menu 3 2.1. A new data set: 3 2.2. An existent data set: 3 2.3. Importing

More information

Lab 1: Basics of Stata Short Course on Poverty & Development for Nordic Ph.D. Students University of Copenhagen June 13-23, 2000

Lab 1: Basics of Stata Short Course on Poverty & Development for Nordic Ph.D. Students University of Copenhagen June 13-23, 2000 Lab 1: Basics of Stata Short Course on Poverty & Development for Nordic Ph.D. Students University of Copenhagen June 13-23, 2000 This lab is designed to give you a basic understanding of the tools available

More information

Introduction to STATA 6.0 ECONOMICS 626

Introduction to STATA 6.0 ECONOMICS 626 Introduction to STATA 6.0 ECONOMICS 626 Bill Evans Fall 2001 This handout gives a very brief introduction to STATA 6.0 on the Economics Department Network. In a few short years, STATA has become one of

More information

Title. Description. time series Introduction to time-series commands

Title. Description. time series Introduction to time-series commands Title time series Introduction to time-series commands Description The Time-Series Reference Manual organizes the commands alphabetically, making it easy to find individual command entries if you know

More information

Subset Selection in Multiple Regression

Subset Selection in Multiple Regression Chapter 307 Subset Selection in Multiple Regression Introduction Multiple regression analysis is documented in Chapter 305 Multiple Regression, so that information will not be repeated here. Refer to that

More information

EE 511 Linear Regression

EE 511 Linear Regression EE 511 Linear Regression Instructor: Hanna Hajishirzi Slides adapted from Ali Farhadi, Mari Ostendorf, Pedro Domingos, Carlos Guestrin, and Luke Zettelmoyer, Announcements Hw1 due

More information

Introduction to Stata - Session 2

Introduction to Stata - Session 2 Introduction to Stata - Session 2 Siv-Elisabeth Skjelbred ECON 3150/4150, UiO January 26, 2016 1 / 29 Before we start Download auto.dta, auto.csv from course home page and save to your stata course folder.

More information

Standard Errors in OLS Luke Sonnet

Standard Errors in OLS Luke Sonnet Standard Errors in OLS Luke Sonnet Contents Variance-Covariance of ˆβ 1 Standard Estimation (Spherical Errors) 2 Robust Estimation (Heteroskedasticity Constistent Errors) 4 Cluster Robust Estimation 7

More information

Advanced Regression Analysis Autumn Stata 6.0 For Dummies

Advanced Regression Analysis Autumn Stata 6.0 For Dummies Advanced Regression Analysis Autumn 2000 Stata 6.0 For Dummies Stata 6.0 is the statistical software package we ll be using for much of this course. Stata has a number of advantages over other currently

More information

A Short Guide to Stata 14

A Short Guide to Stata 14 Short Guides to Microeconometrics Fall 2016 Prof. Dr. Kurt Schmidheiny Universität Basel A Short Guide to Stata 14 1 Introduction 2 2 The Stata Environment 2 3 Where to get help 3 4 Additions to Stata

More information

Solution Sketches Midterm Exam COSC 6342 Machine Learning March 20, 2013

Solution Sketches Midterm Exam COSC 6342 Machine Learning March 20, 2013 Your Name: Your student id: Solution Sketches Midterm Exam COSC 6342 Machine Learning March 20, 2013 Problem 1 [5+?]: Hypothesis Classes Problem 2 [8]: Losses and Risks Problem 3 [11]: Model Generation

More information


A QUICK INTRODUCTION TO STATA A QUICK INTRODUCTION TO STATA This module provides a quick introduction to STATA. After completing this module you will be able to input data, save data, transform data, create basic tables, create basic

More information

Can double click the data file and it should open STATA

Can double click the data file and it should open STATA ECO 445: International Trade Professor Jack Rossbach Instructions on Doing Gravity Regressions in STATA Important: If you don t know how to use a command, use the help command in R. For example, type help

More information

Review of Stata II AERC Training Workshop Nairobi, May 2002

Review of Stata II AERC Training Workshop Nairobi, May 2002 Review of Stata II AERC Training Workshop Nairobi, 20-24 May 2002 This note provides more information on the basics of Stata that should help you with the exercises in the remaining sessions of the workshop.

More information



More information

Lab #9: ANOVA and TUKEY tests

Lab #9: ANOVA and TUKEY tests Lab #9: ANOVA and TUKEY tests Objectives: 1. Column manipulation in SAS 2. Analysis of variance 3. Tukey test 4. Least Significant Difference test 5. Analysis of variance with PROC GLM 6. Levene test for

More information

An Iterative Approach to Estimation with Multiple High-Dimensional Fixed Effects

An Iterative Approach to Estimation with Multiple High-Dimensional Fixed Effects An Iterative Approach to Estimation with Multiple High-Dimensional Fixed Effects Abstract Siyi Luo, Wenjia Zhu, Randall P. Ellis March 23, 2017 Department of Economics, Boston University Controlling for

More information

Community Resource: Egenmore, by command, return lists, and system variables. Beksahn Jang Feb 22 nd, 2016 SOC561

Community Resource: Egenmore, by command, return lists, and system variables. Beksahn Jang Feb 22 nd, 2016 SOC561 Community Resource: Egenmore, by command, return lists, and system variables. Beksahn Jang Feb 22 nd, 2016 SOC561 Egenmore Egenmore is a package in Stata that extends the capabilities of the egen command.

More information

Economists. Melissa Dell Matt Notowidigdo Paul Schrimpf

Economists. Melissa Dell Matt Notowidigdo Paul Schrimpf 14.170: 170: Programming for Economists 1/12/2009-1/16/2009 Melissa Dell Matt Notowidigdo Paul Schrimpf Lecture 5, Large Data Sets in Stata + Numerical Precision Overview This lecture is part wrap-up p

More information

[spa-temp.inf] Spatial-temporal information

[spa-temp.inf] Spatial-temporal information [spa-temp.inf] Spatial-temporal information VI Table of Contents for Spatial-temporal information I. Spatial-temporal information........................................... VI - 1 A. Cohort-survival method.........................................

More information

Introduction to Statistical Analyses in SAS

Introduction to Statistical Analyses in SAS Introduction to Statistical Analyses in SAS Programming Workshop Presented by the Applied Statistics Lab Sarah Janse April 5, 2017 1 Introduction Today we will go over some basic statistical analyses in

More information

Introduction to Programming in Stata

Introduction to Programming in Stata Introduction to in Stata Laron K. University of Missouri Goals Goals Replicability! Goals Replicability! Simplicity/efficiency Goals Replicability! Simplicity/efficiency Take a peek under the hood! Data

More information

Description Remarks and examples References Also see

Description Remarks and examples References Also see Title intro 4 Substantive concepts Description Remarks and examples References Also see Description The structural equation modeling way of describing models is deceptively simple. It is deceptive

More information

APractitioners Guide to Stochastic Frontier. Analysis Using Stata. SUBAL C. KUMBHAKAR Binghamton University, NY

APractitioners Guide to Stochastic Frontier. Analysis Using Stata. SUBAL C. KUMBHAKAR Binghamton University, NY APractitioners Guide to Stochastic Frontier Analysis Using Stata SUBAL C. KUMBHAKAR Binghamton University, NY HUNG-JEN WANG National Taiwan University ALAN P. HORNCASTLE Oxera Consulting LLP, Oxford, UK

More information

Introduction to STATA

Introduction to STATA Center for Teaching, Research and Learning Research Support Group American University, Washington, D.C. Hurst Hall 203 (202) 885-3862 Introduction to STATA WORKSHOP OBJECTIVE: This workshop

More information

Mean Tests & X 2 Parametric vs Nonparametric Errors Selection of a Statistical Test SW242

Mean Tests & X 2 Parametric vs Nonparametric Errors Selection of a Statistical Test SW242 Mean Tests & X 2 Parametric vs Nonparametric Errors Selection of a Statistical Test SW242 Creation & Description of a Data Set * 4 Levels of Measurement * Nominal, ordinal, interval, ratio * Variable Types

More information

FMA901F: Machine Learning Lecture 3: Linear Models for Regression. Cristian Sminchisescu

FMA901F: Machine Learning Lecture 3: Linear Models for Regression. Cristian Sminchisescu FMA901F: Machine Learning Lecture 3: Linear Models for Regression Cristian Sminchisescu Machine Learning: Frequentist vs. Bayesian In the frequentist setting, we seek a fixed parameter (vector), with value(s)

More information



More information

Lecture 3: Linear Classification

Lecture 3: Linear Classification Lecture 3: Linear Classification Roger Grosse 1 Introduction Last week, we saw an example of a learning task called regression. There, the goal was to predict a scalar-valued target from a set of features.

More information

MS&E 226: Small Data

MS&E 226: Small Data MS&E 226: Small Data Lecture 13: The bootstrap (v3) Ramesh Johari 1 / 30 Resampling 2 / 30 Sampling distribution of a statistic For this lecture: There is a population model

More information

Missing Data Analysis for the Employee Dataset

Missing Data Analysis for the Employee Dataset Missing Data Analysis for the Employee Dataset 67% of the observations have missing values! Modeling Setup For our analysis goals we would like to do: Y X N (X, 2 I) and then interpret the coefficients

More information

Multivariate Capability Analysis

Multivariate Capability Analysis Multivariate Capability Analysis Summary... 1 Data Input... 3 Analysis Summary... 4 Capability Plot... 5 Capability Indices... 6 Capability Ellipse... 7 Correlation Matrix... 8 Tests for Normality... 8

More information

Chapter 5 Parameter Estimation:

Chapter 5 Parameter Estimation: Chapter 5 Parameter Estimation: MODLER s regression commands at their most basic are essentially intuitive. For example, consider: IMP=F(GNP,CAPI) which specifies that IMP is a function F() of the variables

More information

Econometrics I: OLS. Dean Fantazzini. Dipartimento di Economia Politica e Metodi Quantitativi. University of Pavia

Econometrics I: OLS. Dean Fantazzini. Dipartimento di Economia Politica e Metodi Quantitativi. University of Pavia Dipartimento di Economia Politica e Metodi Quantitativi University of Pavia Overview of the Lecture 1 st EViews Session I: Convergence in the Solow Model 2 Overview of the Lecture 1 st EViews Session I:

More information

Data Analysis and Solver Plugins for KSpread USER S MANUAL. Tomasz Maliszewski

Data Analysis and Solver Plugins for KSpread USER S MANUAL. Tomasz Maliszewski Data Analysis and Solver Plugins for KSpread USER S MANUAL Tomasz Maliszewski Table of Content CHAPTER 1: INTRODUCTION... 3 1.1. ABOUT DATA ANALYSIS PLUGIN... 3 1.3. ABOUT SOLVER PLUGIN...

More information

Missing Data Part 1: Overview, Traditional Methods Page 1

Missing Data Part 1: Overview, Traditional Methods Page 1 Missing Data Part 1: Overview, Traditional Methods Richard Williams, University of Notre Dame, Last revised January 17, 2015 This discussion borrows heavily from: Applied

More information

PRI Workshop Introduction to AMOS

PRI Workshop Introduction to AMOS PRI Workshop Introduction to AMOS Krissy Zeiser Pennsylvania State University 2-pm /3/2008 Setting up the Dataset Missing values should be recoded in another program (preferably with

More information