BASIC STEPS TO DO A SIMPLE PANEL DATA ANALYSIS IN STATA By: Mahyudin Ahmad @ 2017 Basic steps to do a panel data analysis in STATA Page 1
Outline
1. Setting up commands
2. Importing data to Stata
3. Panel data basic commands
4. xtreg command
5. Exploring panel data
6. Panel data models: Pooled OLS, Fixed effects, Random effects
7. Testing procedure
8. Hausman test
9. BP-LM test
10. Is time dummy important?
11. Which model is the best?
12. Summary: steps in panel data analysis (IMPORTANT!)
1. Setting up commands
Basic setting up steps:
    set mem 1000m
    set more off
    cd "D:\<your working directory name>"
    log using <log file name>.log
(set mem is only needed in Stata 11 and earlier; later versions manage memory automatically.)
It is advisable to run these steps before embarking on any regression. The log file records everything you do, including the regression results/outputs.
Note: see also the videos giving a basic introduction to STATA.
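The setup steps above can be collected at the top of a do-file. The following is a minimal sketch, not a required template: the path and log file name are the same placeholders used above, and the capture log close line and the replace and text options are additions that do not appear in the original slide.

```stata
* Preamble sketch for a do-file (placeholders must be filled in by the user)
set more off
cd "D:\<your working directory name>"

* Close any log left open by an earlier run, then start a fresh plain-text log
capture log close
log using <log file name>.log, replace text

* ... analysis commands go here ...

* Close the log when finished so the output is written to disk
log close
```

With replace, rerunning the do-file overwrites the previous log instead of stopping with a "file already exists" error.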
2. Importing data to Stata
To load data from Excel for the first time, copy from Excel and paste into STATA.
REMEMBER: Sort your data vertically according to panel group (column 1), time (column 2), followed by the variables.

country   year   gdp            sav            pop
Albania   1990     6.75179343    20.9783993     1.6
Albania   1991   -11.4142038    -13.0284996    -0.2
Albania   1992   -27.5896031    -75.4131012    -1.6
Albania   1993    -5.69153612   -33.6716003    -1.4
Albania   1994    11.1974627     -9.88263035    0.2
Albania   1995     9.1941036     -3.94799995    1.2
Albania   1996     7.55757392   -11.8118        1.3
Albania   1997     7.73893405    -9.25912952    1.2
Albania   1998    -8.06352119    -6.69585991    1.1
Albania   1999    (missing)      -1.66910005    1.1
Algeria   1990     2.29575915    27.4666996     2.5
Algeria   1991    -3.72084675    36.6562004     2.4
Algeria   1992    -3.55414336    32.3755989     2.4
Algeria   1993    -0.79384221    27.8384991     2.3
Algeria   1994    -4.35723136    27.0359993     2.2
Algeria   1995    -3.31007521    28.4333992     2.2
Algeria   1996     1.59040861    31.4230003     2.2
Algeria   1997     1.58921549    32.1985016     2.2
Algeria   1998    -1.03429441    27.0669003     2.1
Algeria   1999     1.44857954    31.6912003     2.1

The Albania rows form the first cross-sectional unit; the year column is the time dimension.
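As an alternative to copy and paste, the worksheet can be read directly with import excel (available from Stata 12). This is only a sketch: the file name paneldata.xlsx and the sheet name Sheet1 are hypothetical, and the variable names are those of the sample data above.

```stata
* Read the worksheet; firstrow takes variable names from the first row
* ("paneldata.xlsx" and "Sheet1" are hypothetical names)
import excel using "paneldata.xlsx", sheet("Sheet1") firstrow clear

* country is text, but xtset needs a numeric panel identifier,
* so create one (id) with value labels taken from country
encode country, generate(id)

* Sort by panel unit, then by time, and inspect the first rows
sort id year
list country year gdp sav pop in 1/5
```

Missing values are best left as empty cells in Excel; a cell containing text would make the whole column import as a string variable.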
3. Panel data basic commands
Commands normally used in panel data analysis. Panel data commands start with xt.
help xt : to obtain the help files on the xt commands
xtset id year : to inform Stata that our data is a panel (id = cross-section identifier, year = time variable)
xtsum : to summarize the data; gives overall, between and within statistics and observation counts
xtdes : to describe the pattern of the panel (which units are observed in which periods)
xtreg : to start a panel regression. We'll look more at this command after this.
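A short sketch of these commands in sequence, assuming the sample variables gdp, sav and pop, a time variable year, and a numeric panel identifier named id (the name id is an assumption; any numeric identifier works):

```stata
* Declare the panel structure: unit = id, time = year
xtset id year

* Show the participation pattern (which units appear in which years)
xtdescribe

* Overall, between and within summary statistics
xtsum gdp sav pop
```

xtset must be run before the other xt commands, since they rely on the declared panel and time variables.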
4. xtreg command
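As an illustration of the command's general shape, xtreg takes the dependent variable first, then the regressors, then an option naming the estimator. The lines below are a sketch using the sample variables; the choice of gdp as the dependent variable is an assumption made for illustration only.

```stata
* General form: xtreg depvar indepvars, estimator_option
* (requires xtset to have been run first)

* Fixed effects (within) estimator
xtreg gdp sav pop, fe

* Random effects (GLS) estimator; re is also the default if no option is given
xtreg gdp sav pop, re

* Between estimator (regression on unit means)
xtreg gdp sav pop, be
```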
5. Exploring panel data
6. Pooled OLS
POLS assumes that all cross-sectional units share a common intercept!
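In Stata, pooled OLS is simply regress applied to the stacked panel. A minimal sketch, assuming the sample variables and a numeric panel identifier id (an assumed name):

```stata
* Pooled OLS: one intercept and one set of slopes for all units
regress gdp sav pop

* Same model with standard errors clustered by panel unit,
* which allows errors to be correlated within a unit over time
regress gdp sav pop, vce(cluster id)
```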
6. Fixed effects
The difference between FE and RE!!
FE allows for many intercepts: each cross-sectional unit has its own.
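One way to see the "many intercepts" idea is to compare the within estimator with a least-squares dummy-variable (LSDV) regression, which gives the same slope coefficients. This is a sketch under the assumption that the panel identifier is a numeric variable named id:

```stata
* Fixed effects via the within transformation
xtreg gdp sav pop, fe

* LSDV: pooled OLS with one dummy per unit (i.id), so each unit
* gets its own intercept; slopes on sav and pop match the FE results
regress gdp sav pop i.id
```

With many units the LSDV output becomes long, which is why xtreg with the fe option is normally preferred in practice.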
6. Random effects
The difference between FE and RE!!
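The three models can be written side by side to make the difference explicit. The notation below is an assumption for illustration (it does not come from the slides): i indexes units, t indexes time, and u_i is the unobserved unit-specific effect.

```latex
\begin{align*}
\text{POLS:} \quad & y_{it} = \alpha + x_{it}'\beta + \varepsilon_{it} \\
\text{FE:}   \quad & y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it},
  \qquad \operatorname{Cov}(\alpha_i, x_{it}) \neq 0 \text{ is allowed} \\
\text{RE:}   \quad & y_{it} = \alpha + x_{it}'\beta + (u_i + \varepsilon_{it}),
  \qquad \operatorname{Cov}(u_i, x_{it}) = 0 \text{ is assumed}
\end{align*}
```

FE treats the unit effect as a separate intercept for each unit, while RE folds it into the error term and therefore requires it to be uncorrelated with the regressors.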
7. Testing procedure
8. Hausman test
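A minimal sketch of the Hausman test sequence, assuming the sample variables and that xtset has already been run. The stored names fe and re are arbitrary labels.

```stata
* Estimate and store the fixed effects model
xtreg gdp sav pop, fe
estimates store fe

* Estimate and store the random effects model
xtreg gdp sav pop, re
estimates store re

* Hausman test: consistent estimator (FE) first, efficient estimator (RE) second
hausman fe re
```

A significant test statistic (small p-value) points to FE; an insignificant one points to RE. Both models are estimated here without robust standard errors, because hausman is based on the conventional variance estimates.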
9. BP-LM test
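The Breusch-Pagan Lagrange multiplier test is run with xttest0 immediately after a random effects regression. A sketch, assuming the sample variables:

```stata
* Random effects model must be the most recent estimation
xtreg gdp sav pop, re

* BP-LM test: H0 is that the variance of the unit effect is zero
xttest0
```

Rejecting H0 (significant result) favours RE over pooled OLS; failing to reject favours pooled OLS.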
10. Is time dummy important?
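One common way to check this is to add year dummies and test them jointly. The sketch below uses FE as an example; the same idea applies to the other estimators.

```stata
* Add a full set of year dummies using factor-variable notation
xtreg gdp sav pop i.year, fe

* Joint test that all year dummies are zero
testparm i.year
```

If the joint test is significant, the time dummies belong in the model; if not, they can be dropped.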
11. Which model is the best?
The fixed effects estimator of β remains consistent even if the true model is not fixed effects (for example, when RE or POLS is the correct model); in that case it is only less efficient.
12. Summary: steps in panel analysis
Steps:
1. Load the panel data into Stata; make sure your data have N>T.
2. xtset nvar tvar (nvar = cross-section variable, tvar = time variable).
3. Estimate FE and RE, then do a Hausman test of FE against RE (to test whether the unobserved individual effects in the error term are correlated with the regressors).
4. Hausman test result: significant, choose FE; insignificant, choose RE.
5. If you choose RE, the unobserved individual effects in the error term are assumed to be present but random, i.e. not correlated with the regressors.
6. Do the BP-LM test to check whether the unobserved individual effects are actually present. Significant, choose RE; insignificant, choose POLS.
7. POLS means the error term is iid and doesn't contain unobserved individual effects at all.
8. Finally, do the common diagnostic tests on your model: multicollinearity (vif), heteroskedasticity (xttest3), serial correlation (xtserial).
9. Remember, always run your final model with robust standard errors.
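Steps 8 and 9 can be sketched as follows, shown here for the case where FE was chosen. Note that xttest3 and xtserial are user-written commands that must be installed before first use, and that the VIF is obtained after regress rather than after xtreg.

```stata
* Multicollinearity: estat vif is a postestimation command for regress
regress gdp sav pop
estat vif

* User-written commands; install once if needed, for example:
*   ssc install xttest3
*   search xtserial

* Groupwise heteroskedasticity in the FE model (run right after xtreg, fe)
xtreg gdp sav pop, fe
xttest3

* Wooldridge test for first-order serial correlation in panel data
xtserial gdp sav pop

* Final model with robust standard errors
xtreg gdp sav pop, fe vce(robust)
```

For xtreg, vce(robust) produces standard errors clustered by panel unit, which are robust to both heteroskedasticity and within-unit serial correlation.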
THE END Thank you