INTRODUCTION TO PANEL DATA ANALYSIS


INTRODUCTION TO PANEL DATA ANALYSIS USING EVIEWS
FARIDAH NAJUNA MISMAN, PhD
FINANCE DEPARTMENT, FACULTY OF BUSINESS & MANAGEMENT, UiTM JOHOR
PANEL DATA WORKSHOP-23&24 MAY 2017

OUTLINE
1. Introduction
2. CLRM Assumptions
3. Static Panel Data Models
4. Getting Started with EViews 9
5. Data Analysis
6. Reading the Results

1. INTRODUCTION

There are three types of data structure:

1. Time series data are collected at regular time intervals, such as every month or every year (N = 1; t = 1, ..., T). They usually represent the values for a single firm or a single variable at different points in time. Most macroeconomic data for real variables, e.g. GDP or consumption, are quarterly time series. Data for monetary variables such as interest rates are often monthly time series.
2. Cross-sectional data are the values for many different firms or households collected at a single point in time (i = 1, ..., N; T = 1).
3. Panel data combine the other two: values for all members of a panel or group of firms or households, measured at more than one period in time (i = 1, ..., N; t = 1, ..., T).

Classical panel data: N > T, also known as a short or micro panel.
Macro panel: T > N, also known as a long panel.
Balanced panel: data are available for all cross-sections in all periods. Number of observations: n = NT.
Unbalanced panel: T differs across individuals.
(Note: EViews can estimate on an unbalanced panel; the output in Section 6 is itself reported as "Total panel (unbalanced) observations".)
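The balanced/unbalanced distinction can be checked mechanically. The following is a minimal sketch in Python (not EViews); the function name `is_balanced` and the firm/year data are made up for illustration.

```python
from collections import defaultdict


def is_balanced(observations):
    """Return True if every unit is observed in every period present in the data."""
    periods_by_unit = defaultdict(set)
    all_periods = set()
    for unit, period in observations:
        periods_by_unit[unit].add(period)
        all_periods.add(period)
    return all(periods == all_periods for periods in periods_by_unit.values())


# Hypothetical data: 3 firms (N = 3) observed over 2 years (T = 2).
balanced = [(firm, year) for firm in ("A", "B", "C") for year in (2015, 2016)]
unbalanced = balanced[:-1]  # drop firm C's 2016 observation

n_units = len({unit for unit, _ in balanced})
n_periods = len({period for _, period in balanced})

print(is_balanced(balanced), len(balanced) == n_units * n_periods)
print(is_balanced(unbalanced), len(unbalanced) == n_units * n_periods)
```

In the balanced case the number of rows equals N x T; in the unbalanced case it falls short of it.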

The choice of econometric model depends on the type of data:

1. Least squares regression: normally applied to cross-section data sets (e.g. Ordinary Least Squares, OLS).
2. Time series models: normally applied to time series data, to uncover long-run relations and short-run dynamics.
3. Panel data modelling: normally used to capture heterogeneity across the sample and to obtain a bigger sample size.
   o Static panel data models: POLS, FE, RE, BE
   o Dynamic panel data models: GMM
   o Panel unit root and cointegration (macro panels)

Advantages & Disadvantages

Panel data allow us to control for variables we cannot observe or measure, such as:
- Time-invariant factors, like geographical area or firm management characteristics.
- Variables that change over time but not across entities, like national policies, federal regulation or international agreements.

In other words, panel data can account for individual heterogeneity (uniqueness), which results in more efficient estimates.
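One common way to control for an unobserved time-invariant factor is the within (fixed effects, FE) transformation: subtract each unit's own mean from its observations, so anything constant within a unit drops out. The sketch below uses made-up data in which the true slope is 2 and unit A carries an unobserved effect of 10; the variable and function names are illustrative only.

```python
def slope(x, y):
    """OLS slope of y on x (single regressor, with intercept)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def demean(values):
    """Subtract the mean of a list from each of its elements."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Hypothetical data: y = 2 * x + alpha_i, with alpha_A = 10 and alpha_B = 0.
data = {
    "A": {"x": [1.0, 2.0, 3.0], "y": [12.0, 14.0, 16.0]},
    "B": {"x": [4.0, 5.0, 6.0], "y": [8.0, 10.0, 12.0]},
}

# Pooled regression ignores the unit effect.
pooled_x = [v for unit in data.values() for v in unit["x"]]
pooled_y = [v for unit in data.values() for v in unit["y"]]
pooled_slope = slope(pooled_x, pooled_y)

# Within regression: demean x and y unit by unit, then pool.
within_x = [v for unit in data.values() for v in demean(unit["x"])]
within_y = [v for unit in data.values() for v in demean(unit["y"])]
within_slope = slope(within_x, within_y)

print(pooled_slope, within_slope)
```

Because the unit effect is correlated with x in this example, the pooled slope even has the wrong sign, while the within slope recovers 2.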

Advantages:
i. Larger sample size, more variation and less collinearity, which increases the precision of the estimates.
ii. Ability to study dynamics: repeated cross-sectional observations show adjustment over time.
iii. Ability to account for heterogeneity across individuals, which is often ignored in pooled data; estimates are more robust against misspecification due to omitted variables.

Disadvantages:
i. Data availability and maintenance
ii. Measurement errors
iii. Self-selection bias

Why Analyse Panel Data?

- We are interested in describing change over time:
  o social change, e.g. changing attitudes, behaviours, social relationships
  o individual growth or development, e.g. life-course studies, child development, career trajectories, school achievement
  o occurrence (or non-occurrence) of events
- We want superior estimates of trends in social phenomena:
  o Panel models can be used to inform policy, e.g. health, obesity
  o Multiple observations on each unit can provide superior estimates compared with cross-sectional models of association
- We want to estimate causal models:
  o Policy evaluation
  o Estimation of treatment effects

What kind of data are required for panel analysis?

Basic panel methods require at least two waves of measurement. Consider student GPAs and job hours during two semesters of college. One way to organize the panel data is to create a single record for each combination of unit and time period. Notice that the data include:
- A time-invariant unique identifier for each unit (StudentID)
- A time-varying outcome (GPA)
- An indicator for time (Semester)

Panel datasets can include other time-varying or time-invariant variables.
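The one-record-per-unit-and-period layout can be sketched as follows. Only the column roles (StudentID, Semester, GPA, job hours) come from the slide; the values and the column name `JobHours` are invented for illustration.

```python
# Long-format panel: one record per (StudentID, Semester) combination.
# All values are hypothetical.
panel = [
    {"StudentID": 1, "Semester": 1, "GPA": 3.1, "JobHours": 10},
    {"StudentID": 1, "Semester": 2, "GPA": 3.3, "JobHours": 5},
    {"StudentID": 2, "Semester": 1, "GPA": 2.8, "JobHours": 20},
    {"StudentID": 2, "Semester": 2, "GPA": 2.9, "JobHours": 20},
]

# The unit identifier and the time indicator together identify each record.
keys = [(row["StudentID"], row["Semester"]) for row in panel]
keys_are_unique = len(keys) == len(set(keys))

# With two waves we can compute the change in the outcome for each student.
gpa = {(row["StudentID"], row["Semester"]): row["GPA"] for row in panel}
student_ids = sorted({student for student, _ in keys})
gpa_change = {s: round(gpa[(s, 2)] - gpa[(s, 1)], 2) for s in student_ids}

print(keys_are_unique, gpa_change)
```

Having at least two waves per unit is what makes the within-student change computable at all.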

2. CLASSICAL LINEAR REGRESSION MODEL (CLRM)

Table taken from page 37 of Applied Econometrics, Asteriou & Hall, 2nd ed., 2011, Palgrave Macmillan.

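3. PANEL DATA MODEL: POOLED OLS

Pooled OLS:

$$ y_{it} = \beta_0 + \beta X_{it} + \alpha_i + \nu_{it} $$

Assumptions (here $j$ indexes the observations within unit $i$, playing the role of $t$):
i. $\alpha_i$ and $\nu_{ij}$ are normally distributed and mutually independent.
ii. $E(\alpha_i) = E(\nu_{ij}) = 0$ for $i = 1, \dots, m$ and $j = 1, 2, \dots, m(i)$.
iii. $E(\alpha_i \alpha_{i'}) = \sigma_1^2$ if $i = i'$, and $0$ otherwise.
iv. $E(\nu_{ij} \nu_{i'j'}) = \sigma_2^2$ if $i = i'$ and $j = j'$, and $0$ otherwise.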
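Pooled OLS simply stacks all unit-period observations and runs one ordinary regression, ignoring which unit each row belongs to. A minimal sketch with a single regressor, written in Python rather than EViews; the function name `pooled_ols` and the rows are hypothetical, and the data are constructed so that y = 1 + 2x exactly.

```python
def pooled_ols(x, y):
    """Return (intercept, slope) from OLS of y on a single regressor x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical stacked panel rows: (unit, period, x, y).
rows = [
    ("A", 1, 1.0, 3.0),
    ("A", 2, 2.0, 5.0),
    ("B", 1, 3.0, 7.0),
    ("B", 2, 4.0, 9.0),
]

# Pooling: the unit and period labels are dropped before estimation.
x = [row[2] for row in rows]
y = [row[3] for row in rows]
b0, b1 = pooled_ols(x, y)
print(b0, b1)
```

The estimator is appropriate only when the unit effects are uncorrelated with the regressors; otherwise a fixed or random effects model is the usual alternative.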

4. GETTING STARTED WITH EViews 9

5. DATA ANALYSIS

DESCRIPTIVE STATISTICS

CORRELATION ANALYSIS

POOLED OLS REGRESSION

NORMALITY TEST

DUMMY VARIABLES

6. READING THE RESULTS

Dependent Variable: CR
Method: Panel Least Squares
Date: 05/23/17   Time: 17:06
Sample (adjusted): 1996 2011
Periods included: 16   (time periods included)
Cross-sections included: 17   (total number of groups)
Total panel (unbalanced) observations: 85   (total number of observations, n; n = NT holds only for a balanced panel)

Variable    Coefficient    Std. Error    t-Statistic    Prob.
C            12.83313      2.387841       5.374368      0.0000   (constant)
FE           -0.160617     0.039199      -4.097434      0.0001
FQ            2.032662     0.380137       5.347179      0.0000
CB            0.362423     0.185213       1.956787      0.0539
CAPR         -0.203388     0.075746      -2.685126      0.0088

R-squared             0.371546    Mean dependent var       6.020596
Adjusted R-squared    0.340123    S.D. dependent var       5.639222
S.E. of regression    4.580898    Akaike info criterion    5.938690
Sum squared resid     1678.770    Schwarz criterion        6.082375
Log likelihood       -247.3943    Hannan-Quinn criter.     5.996484
F-statistic           11.82412    Durbin-Watson stat       0.735389
Prob(F-statistic)     0.000000

Prob(F-statistic): if this number is below 0.05, the model is OK. This is the F-test of whether the slope coefficients in the model are jointly different from zero.

Variable    Coefficient    Std. Error    t-Statistic    Prob.
C            12.83313      2.387841       5.374368      0.0000
FE           -0.160617     0.039199      -4.097434      0.0001
FQ            2.032662     0.380137       5.347179      0.0000
CB            0.362423     0.185213       1.956787      0.0539
CAPR         -0.203388     0.075746      -2.685126      0.0088

Coefficient: the coefficients of the regressors. Each indicates how much Y changes when X increases by one unit.

t-Statistic: tests the null hypothesis that the coefficient equals 0. To reject it at the 5% significance level, the t-value must exceed 1.96 in absolute value. If it does, you can say that the variable has a significant influence on your dependent variable (Y). The larger the absolute t-value, the stronger the evidence that the variable matters.

Prob.: the two-tailed p-value for the same null hypothesis. To reject it, the p-value must be lower than 0.05 (5% significance level). If it is, you can say that the variable has a significant influence on your dependent variable (Y).
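The t-statistic column is just the coefficient divided by its standard error, which can be verified from the reported numbers. The sketch below recomputes it in Python and applies the 1.96 rule; the p-values it prints use a normal approximation (an assumption made here for simplicity), so they will differ slightly from the EViews Prob. column, which is based on the t distribution.

```python
import math

# (coefficient, standard error) pairs copied from the output table.
estimates = {
    "C": (12.83313, 2.387841),
    "FE": (-0.160617, 0.039199),
    "FQ": (2.032662, 0.380137),
    "CB": (0.362423, 0.185213),
    "CAPR": (-0.203388, 0.075746),
}
CRITICAL_VALUE = 1.96  # two-sided 5% critical value of the standard normal

# t-statistic = coefficient / standard error
t_stats = {name: coef / se for name, (coef, se) in estimates.items()}

# Significant at 5% if |t| exceeds the critical value.
significant = {name: abs(t) > CRITICAL_VALUE for name, t in t_stats.items()}

# Two-sided p-value under the normal approximation: 2 * (1 - Phi(|t|)).
p_normal = {name: math.erfc(abs(t) / math.sqrt(2)) for name, t in t_stats.items()}

for name in estimates:
    print(f"{name:5s} t = {t_stats[name]:9.4f}  significant = {significant[name]}")
```

Under this rule CB is the only regressor that is not significant at the 5% level, which agrees with its reported Prob. of 0.0539.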

R-squared             0.371546    Mean dependent var       6.020596
Adjusted R-squared    0.340123    S.D. dependent var       5.639222
S.E. of regression    4.580898    Akaike info criterion    5.938690
Sum squared resid     1678.770    Schwarz criterion        6.082375
Log likelihood       -247.3943    Hannan-Quinn criter.     5.996484
F-statistic           11.82412    Durbin-Watson stat       0.735389
Prob(F-statistic)     0.000000

R-squared shows the proportion of the variance of Y explained by X.

Adjusted R-squared shows the same as R-squared, but adjusted for the number of cases and the number of variables. When the number of variables is small and the number of cases is very large, adjusted R-squared is close to R-squared.
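The adjustment can be reproduced from the reported figures using the standard textbook formulas, with n = 85 observations and k = 5 estimated coefficients (the constant plus four regressors). This is a sketch of those formulas in Python, not a description of EViews internals.

```python
r_squared = 0.371546  # from the output
n_obs = 85            # total panel observations
k_coefs = 5           # C, FE, FQ, CB, CAPR

# Adjusted R-squared penalises R-squared for the number of coefficients.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - k_coefs)

# F-statistic for the null that all slope coefficients are jointly zero.
f_stat = (r_squared / (k_coefs - 1)) / ((1 - r_squared) / (n_obs - k_coefs))

print(round(adj_r_squared, 6), round(f_stat, 3))
```

Both values agree with the output (0.340123 and 11.82412) up to rounding. As n grows relative to k, the factor (n - 1) / (n - k) approaches 1, so adjusted R-squared approaches R-squared.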