ECON4150 - Introductory Econometrics
Seminar 4
Stock and Watson EE8.2
April 28, 2015
Current Population Survey (CPS) data on labor force characteristics of the population, including the level of employment, unemployment, and earnings. This subset considers young workers aged 25 to 34 in 2012: 7440 individuals.

FEMALE: 1 if female; 0 if male
YEAR: Year
AHE: Average Hourly Earnings
BACHELOR: 1 if worker has a bachelor's degree; 0 if worker has a high school degree
AGE: age of individual, from 25 to 34 (young workers)
clear all
cd M:\pc\Desktop\courses\introductory_econometrics\seminar_4
use "cps12.dta"
cap log close
log using EE8_2.log, replace
set more off
pause on

// describe the data
describe

// summary statistics
summ

    Variable |        Obs        Mean    Std. Dev.        Min        Max
-------------+----------------------------------------------------------
        year |       7440        2012           0       2012       2012
         ahe |       7440    19.80026    10.68632   2.136752   91.45602
    bachelor |       7440     .531586    .4990349          0          1
      female |       7440     .424328    .4942738          0          1
         age |       7440    29.64772    2.839661         25         34
a)
reg ahe age female bachelor, r

Linear regression                               Number of obs   =       7440
                                                F(3, 7436)      =     539.54
                                                Prob > F        =     0.0000
                                                R-squared       =     0.1801
                                                Root MSE        =     9.6782

------------------------------------------------------------------------------
             |               Robust
         ahe |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |    .510286   .0395409    12.91   0.000     .4327747    .5877973
      female |  -3.810305   .2239148   -17.02   0.000    -4.249241   -3.371368
    bachelor |   8.318628   .2237329    37.18   0.000     7.880048    8.757208
       _cons |   1.866198   1.175373     1.59   0.112    -.4378656    4.170261
------------------------------------------------------------------------------

estimates store rega

/*
If Age increases from 25 to 26 or from 33 to 34, earnings are predicted to
increase by $0.510 per hour.
These values are the same because the regression is a linear function relating
AHE and Age.
*/
b)
gen lnahe=ln(ahe)
reg lnahe age female bachelor, r

Linear regression                               Number of obs   =       7440
                                                F(3, 7436)      =     623.31
                                                Prob > F        =     0.0000
                                                R-squared       =     0.1964
                                                Root MSE        =     .47823

------------------------------------------------------------------------------
             |               Robust
       lnahe |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0255179   .0019619    13.01   0.000     .0216721    .0293637
      female |  -.1923376   .0112614   -17.08   0.000    -.2144132    -.170262
    bachelor |   .4377833   .0112003    39.09   0.000     .4158275    .4597391
       _cons |   1.941423   .0590018    32.90   0.000     1.825763    2.057083
------------------------------------------------------------------------------

estimates store regb

/*
If Age increases from 25 to 26 or from 33 to 34, earnings are predicted to
increase in both cases by 100% * 0.0255 = 2.55%.
These values, in percentage terms, are the same because the regression
is a linear function relating ln(ahe) and Age.
*/
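The 2.55% figure in (b) treats a small change in ln(ahe) as equal to the proportional change in ahe. As a minimal sketch, assuming cps12.dta is in memory and lnahe has been generated as above, the exact percentage change implied by the same coefficient can be displayed next to the approximation. Using nlcom is only one possible way to attach a delta-method standard error; it is not part of the original solution.

```stata
quietly reg lnahe age female bachelor, r

// approximate effect of one more year of age, in percent
display "approx. % change = " 100*_b[age]

// exact effect implied by the log-linear model, in percent
display "exact % change   = " 100*(exp(_b[age]) - 1)

// the same exact quantity with a delta-method standard error
nlcom 100*(exp(_b[age]) - 1)
```

With a coefficient this small, the two figures differ by only a few hundredths of a percentage point.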
c)
gen lnage=ln(age)
reg lnahe lnage female bachelor, r

Linear regression                               Number of obs   =       7440
                                                F(3, 7436)      =     624.31
                                                Prob > F        =     0.0000
                                                R-squared       =     0.1966
                                                Root MSE        =     .47817

------------------------------------------------------------------------------
             |               Robust
       lnahe |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       lnage |   .7529408   .0576153    13.07   0.000     .6399984    .8658831
      female |  -.1923558   .0112593   -17.08   0.000    -.2144271   -.1702844
    bachelor |   .4376637   .0111993    39.08   0.000     .4157099    .4596175
       _cons |   .1495315   .1953385     0.77   0.444    -.2333873    .5324504
------------------------------------------------------------------------------

estimates store regc

/*
If age goes from 25 to 26, the percentage increase in age is approximated by
ln(26/25) = 0.0392, i.e. 3.92%.
The predicted increase in earnings is 100% * 0.75 * 0.0392 = 2.9%.
If age goes from 35 to 36, the percentage increase in age is approximated by
ln(36/35) = 0.0282, i.e. 2.82%.
The predicted increase in earnings is 100% * 0.75 * 0.0282 = 2.1%.
*/
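The hand calculation in (c) rounds the coefficient on lnage to 0.75. A short sketch of the same calculation using the stored coefficient, assuming cps12.dta is in memory and lnahe and lnage have been generated as above:

```stata
quietly reg lnahe lnage female bachelor, r

// % change in ahe = 100 * (coefficient on ln(age)) * (change in ln(age))
display "age 25 to 26: " 100*_b[lnage]*ln(26/25)
display "age 35 to 36: " 100*_b[lnage]*ln(36/35)
```

Because the unrounded coefficient is used, the displayed values can differ slightly from the hand-rounded figures.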
d)
gen agesquared=age^2
reg lnahe age agesquared female bachelor, r

Linear regression                               Number of obs   =       7440
                                                F(4, 7435)      =     469.24
                                                Prob > F        =     0.0000
                                                R-squared       =     0.1967
                                                Root MSE        =     .47816

------------------------------------------------------------------------------
             |               Robust
       lnahe |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .1040449   .0457314     2.28   0.023     .0143984    .1936913
  agesquared |  -.0013284   .0007728    -1.72   0.086    -.0028433    .0001864
      female |  -.1923983   .0112589   -17.09   0.000     -.214469   -.1703276
    bachelor |   .4374121   .0112096    39.02   0.000     .4154381    .4593862
       _cons |    .791882   .6712609     1.18   0.238    -.5239793    2.107743
------------------------------------------------------------------------------

estimates store regd

/*
If Age increases from 25 to 26, the predicted change in ln(ahe) is
(0.1040 * 26 - 0.00133 * 26^2) - (0.1040 * 25 - 0.00133 * 25^2) = 0.036.
Earnings are predicted to increase by 100% * 0.036 = 3.6%.
If Age increases from 34 to 35, the predicted change in ln(ahe) is
(0.1040 * 35 - 0.00133 * 35^2) - (0.1040 * 34 - 0.00133 * 34^2) = 0.012.
Earnings are predicted to increase by 100% * 0.012 = 1.2%.
*/
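In the quadratic specification the predicted change in ln(ahe) is a linear combination of the two age coefficients: moving from 25 to 26 changes age by 1 and agesquared by 26^2 - 25^2 = 51, and moving from 34 to 35 changes agesquared by 35^2 - 34^2 = 69. A minimal sketch, assuming cps12.dta is in memory and the variables have been generated as above, uses lincom to obtain these effects together with robust standard errors. This is offered as one convenient approach, not as the method used in the original solution.

```stata
quietly reg lnahe age agesquared female bachelor, r

// predicted change in ln(ahe) when age goes from 25 to 26
lincom age + 51*agesquared

// predicted change in ln(ahe) when age goes from 34 to 35
lincom age + 69*agesquared
```

Multiplying each reported estimate by 100 gives the approximate percentage change in earnings.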
e)
/*
The regressions in (b) and (c) differ only in the choice of one regressor.
The dependent variable is the same, ln(ahe).
We can therefore choose between the two regressions by comparing their adjusted
R-squared.
*/
quietly reg lnahe lnage female bachelor, r
display "adjusted R2_c = " e(r2_a)
adjusted R2_c = .19623914
quietly reg lnahe age female bachelor, r
display "adjusted R2_b = " e(r2_a)
adjusted R2_b = .19605996
// adjusted R2_c = .19623914 > adjusted R2_b = .19605996, so (c) is marginally preferred to (b)
f)
/*
The regression in (d) includes one extra regressor compared with regression (b).
The dependent variable is the same, ln(ahe).
We can therefore choose between them by looking at the t-statistic on agesquared.
*/
reg lnahe age agesquared female bachelor, r

Linear regression                               Number of obs   =       7440
                                                F(4, 7435)      =     469.24
                                                Prob > F        =     0.0000
                                                R-squared       =     0.1967
                                                Root MSE        =     .47816

------------------------------------------------------------------------------
             |               Robust
       lnahe |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .1040449   .0457314     2.28   0.023     .0143984    .1936913
  agesquared |  -.0013284   .0007728    -1.72   0.086    -.0028433    .0001864
      female |  -.1923983   .0112589   -17.09   0.000     -.214469   -.1703276
    bachelor |   .4374121   .0112096    39.02   0.000     .4154381    .4593862
       _cons |    .791882   .6712609     1.18   0.238    -.5239793    2.107743
------------------------------------------------------------------------------

/*
The coefficient on agesquared is not statistically significantly different from
zero at the 5% significance level (|t| = 1.72, below the critical value 1.96).
This suggests that (b) is preferred to (d).
*/
g)
/*
The choice of regressors is different, but the dependent variable is the same.
Check the adjusted R-squared.
*/
quietly reg lnahe age agesquared female bachelor, r
display "adjusted R2_d = " e(r2_a)
adjusted R2_d = .19627256
quietly reg lnahe lnage female bachelor, r
display "adjusted R2_c = " e(r2_a)
adjusted R2_c = .19623914
// adjusted R2_d = .19627256 exceeds adjusted R2_c = .19623914, so (d) is preferred, though only marginally
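The comparisons in (e) and (g) can also be read off a single table. A small sketch, assuming the results were saved earlier with estimates store under the names regb, regc and regd as in the do-file; the display formats are arbitrary choices:

```stata
// coefficients, robust standard errors, and fit statistics side by side
estimates table regb regc regd, b(%9.4f) se(%9.4f) stats(N r2 r2_a)
```

All three specifications share the same dependent variable, ln(ahe), which is what makes the adjusted R-squared values comparable.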
h)
// first store the predicted values from the different regressions
quietly reg lnahe age female bachelor, r
predict lnahe_b_hat
quietly reg lnahe lnage female bachelor, r
predict lnahe_c_hat
quietly reg lnahe age agesquared female bachelor, r
predict lnahe_d_hat

// then sort
sort age

// graphs
twoway (line lnahe_b_hat age if female==0 & bachelor==0, lwidth(medthick) lpattern(solid) lcolor(blue)) ///
       (line lnahe_c_hat age if female==0 & bachelor==0, lwidth(medthick) lpattern(solid) lcolor(red)) ///
       (line lnahe_d_hat age if female==0 & bachelor==0, lwidth(medthick) lpattern(solid) lcolor(black)) ///
       , scheme(s1color) legend(pos(7) ring(0) label(1 "regression_b") label(2 "regression_c") label(3 "regression_d"))
h) (continued)
[Figure: fitted values of ln(ahe) (vertical axis, about 2.55 to 2.8) against age (horizontal axis, 24 to 34) for regression_b, regression_c and regression_d, males with a high school diploma]

/*
The fitted values are very similar. The quadratic specification (regression_d)
shows more curvature than the log-log one (regression_c).
For a female with a high school diploma the lines are similar but shifted
(by the amount of the coefficient on the dummy variable female).
*/
i)
gen fembac=female*bachelor
regress lnahe age agesquared female bachelor fembac, robust

Linear regression                               Number of obs   =       7440
                                                F(5, 7434)      =     382.92
                                                Prob > F        =     0.0000
                                                R-squared       =     0.1984
                                                Root MSE        =      .4777

------------------------------------------------------------------------------
             |               Robust
       lnahe |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .1043224     .04568     2.28   0.022     .0147766    .1938682
  agesquared |  -.0013316   .0007719    -1.73   0.085    -.0028447    .0001815
      female |  -.2423732   .0166376   -14.57   0.000    -.2749877   -.2097587
    bachelor |   .4004463   .0148482    26.97   0.000     .3713396    .4295531
      fembac |   .0898571   .0225592     3.98   0.000     .0456346    .1340796
       _cons |   .8037409   .6706449     1.20   0.231     -.510913    2.118395
------------------------------------------------------------------------------

/*
The coefficient on the interaction term fembac allows the effect of Bachelor
on ln(ahe) to differ according to the gender of the individual.
*/
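i) (continued)
/*
Predicted values of ln(ahe) at age 30:

Alexis (female, bachelor):    0.104 * 30 - 0.001 * 30^2 - 0.242 + 0.401 + 0.090 + 0.804 = 3.273
Jane (female, high school):   0.104 * 30 - 0.001 * 30^2 - 0.242 + 0.804 = 2.782

Predicted difference Alexis - Jane = 3.273 - 2.782 = 0.491

Bob (male, bachelor):         0.104 * 30 - 0.001 * 30^2 + 0.401 + 0.804 = 3.425
Jim (male, high school):      0.104 * 30 - 0.001 * 30^2 + 0.804 = 3.024

Predicted difference Bob - Jim = 3.425 - 3.024 = 0.401

The difference in differences is the interaction term: 0.491 - 0.401 = 0.09

Note: the coefficient on agesquared is rounded here to -0.001 (the estimate is
-0.0013316), which overstates the four predicted levels by roughly 0.3. The age
terms cancel in each difference, so the two differences and the difference in
differences are unaffected.
*/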
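Because the age terms cancel, the two predicted differences depend only on the coefficients on bachelor and fembac. A minimal sketch, assuming cps12.dta is in memory and the variables have been generated as above, obtains them with lincom so that rounding in the hand calculation plays no role. This is one way to check the arithmetic, not the approach taken in the original solution.

```stata
quietly regress lnahe age agesquared female bachelor fembac, robust

// Alexis - Jane: effect of a bachelor degree for women
lincom bachelor + fembac

// Bob - Jim: effect of a bachelor degree for men
lincom bachelor

// difference in differences: the interaction coefficient itself
lincom fembac
```

Each lincom call also reports a robust standard error and confidence interval for the corresponding difference.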
j)
// interaction terms and then F statistic
gen agefem = age * female
gen agesquaredfem = agesquared * female
regress lnahe age agesquared agefem agesquaredfem female bachelor fembac, r

Linear regression                               Number of obs   =       7440
                                                F(7, 7432)      =     275.77
                                                Prob > F        =     0.0000
                                                R-squared       =     0.1993
                                                Root MSE        =     .47749

-------------------------------------------------------------------------------
              |               Robust
        lnahe |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
          age |   .0202458   .0601672     0.34   0.737     -.097699    .1381905
   agesquared |   .0001442   .0010166     0.14   0.887    -.0018486    .0021369
       agefem |   .1925467   .0923517     2.08   0.037     .0115112    .3735821
agesquaredfem |  -.0033832   .0015605    -2.17   0.030    -.0064421   -.0003242
       female |   -2.94933   1.355778    -2.18   0.030    -5.607039   -.2916208
     bachelor |   .4007629   .0148487    26.99   0.000     .3716552    .4298706
       fembac |   .0886023   .0225717     3.93   0.000     .0443554    .1328492
        _cons |   1.987056   .8833646     2.25   0.025     .2554111    3.718701
-------------------------------------------------------------------------------
j) (continued)
test agefem agesquaredfem

 ( 1)  agefem = 0
 ( 2)  agesquaredfem = 0

       F(  2,  7432) =    4.14
            Prob > F =    0.0160

/*
The effect of Age on earnings (in logs) is statistically different between men
and women at the 5% level, but not at the 1% level.
*/
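The F test says that the age profile differs by gender but not by how much. As a sketch under the same assumptions as above (cps12.dta in memory, agefem and agesquaredfem generated as in the do-file), lincom can report the female-minus-male difference in the predicted change in ln(ahe) for a specific one-year change in age. The choice of the step from 25 to 26 (where agesquared changes by 51) is purely illustrative.

```stata
quietly regress lnahe age agesquared agefem agesquaredfem female bachelor fembac, r

// effect of age 25 -> 26 for men
lincom age + 51*agesquared

// additional effect of age 25 -> 26 for women relative to men
lincom agefem + 51*agesquaredfem
```

Other ages can be examined by replacing 51 with the corresponding change in agesquared.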
k)
gen agebac = bachelor * age
gen agesquaredbac = bachelor * agesquared
regress lnahe age agesquared agebac agesquaredbac female bachelor fembac, r

Linear regression                               Number of obs   =       7440
                                                F(7, 7432)      =     273.66
                                                Prob > F        =     0.0000
                                                R-squared       =     0.1987
                                                Root MSE        =     .47768

-------------------------------------------------------------------------------
              |               Robust
        lnahe |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
          age |   .0373728   .0646696     0.58   0.563    -.0893979    .1641435
   agesquared |  -.0002287   .0010931    -0.21   0.834    -.0023716    .0019142
       agebac |   .1283798   .0912705     1.41   0.160    -.0505362    .3072958
agesquaredbac |  -.0021153   .0015424    -1.37   0.170    -.0051387    .0009082
       female |  -.2423161   .0166376   -14.56   0.000    -.2749306   -.2097017
     bachelor |  -1.529426   1.340231    -1.14   0.254     -4.15666    1.097807
       fembac |   .0899989   .0225708     3.99   0.000     .0457537    .1342442
        _cons |   1.810172   .9490109     1.91   0.057    -.0501582    3.670502
-------------------------------------------------------------------------------
k) (continued)
test agebac agesquaredbac

 ( 1)  agebac = 0
 ( 2)  agesquaredbac = 0

       F(  2,  7432) =    1.30
            Prob > F =    0.2725

/*
The p-value is larger than 0.05, and even larger than 0.10, so we cannot reject
the null hypothesis that the two coefficients are both 0.
Thus, there is no statistically significant evidence of a different effect of
Age on ln(ahe) for high school and college graduates.
*/
l)
/*
Gender and education are significant predictors of earnings.
There are significant interaction effects between:
--- age and gender
--- gender and education
*/
log close