M16 - Panel data regression Flashcards
Panel data regression
- def
- basic idea
- sources of variation
- e.g.
several time points
several observations
several items
- it's a combination of cross-sectional data and time series
- 1. between units, 2. within each unit
- Annual data from firms before/after IPO
Sources of variation in Panel data regression
- between units: as with cross-sectional; examine differences between the observed units
- within each unit: variation between different points in time
Adv Panel data
+ high internal & external validity
+ allows controlling for the impact of latent (unobserved) heterogeneity
+ more df –> higher efficiency in estimation
+ testing of more complex hypotheses possible
Disadv Panel data
- data collection takes time
- costly
- panel mortality (units drop out over time)
- time series problem: the standard quality measures of regression (R², F, etc.) only tell us how well the model fits the OBSERVED values
Ways to analyze Panel data
- methods
- what about alpha
- what if Xk captures all relevant aspects of Y?
fixed effects
random effects
pooled OLS
- alpha captures unobserved effects for unit i
- if Xk captures all relevant variables, then we can drop alpha and use pooled OLS because there are no unobserved variables
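The card refers to alpha and Xk without writing the model out. One common way to write it is sketched below; the symbols beta0, betak and uit are assumed notation, not taken from the cards.

```latex
% Panel model with an unobserved unit effect (notation assumed for illustration)
y_{it} = \beta_0 + \sum_{k} \beta_k X_{k,it} + \alpha_i + u_{it},
\qquad i = 1,\dots,n, \quad t = 1,\dots,T

% If X_k captures everything relevant, \alpha_i can be dropped and the
% model reduces to the pooled OLS specification:
y_{it} = \beta_0 + \sum_{k} \beta_k X_{k,it} + u_{it}
```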
fixed effects?
- def
- aim
- every unit of observation has a different constant
–> measure the effect of the explanatory variable, even if individual, time-constant heterogeneity is correlated with the explanatory variable
If we assume fixed effects, we impose time-invariant effects for each entity that are possibly correlated with the regressors.
- erase individual heterogeneity by transformation
- individual heterogeneity is fixed for each entity over time
How to deal with fixed effects alpha i?
- -> least squares dummy variables (LSDV): introduce a dummy variable Aj for each unit so that alphai enters the model like an explanatory variable x
- -> first differences: subtract from each observation the previous period's observation –> eliminates alphai
- -> within-groups fixed effects: eliminate alphai by subtracting from each variable, for each unit, its mean value (over time)
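The within-groups and first-difference transformations can be sketched on a tiny made-up panel. The data, unit labels and function names are hypothetical; the panel is noise-free and has a single regressor so that both estimators recover the assumed true slope of 2.

```python
# Hypothetical noise-free panel: 3 units, 4 periods, true slope = 2.
# alpha_i differs across units and is correlated with x (assumed for illustration).
TRUE_BETA = 2.0
alpha = {"A": 0.0, "B": 10.0, "C": 20.0}
x = {"A": [1.0, 2.0, 3.0, 4.0],
     "B": [6.0, 7.0, 8.0, 9.0],
     "C": [11.0, 12.0, 13.0, 14.0]}
y = {i: [TRUE_BETA * v + alpha[i] for v in x[i]] for i in x}


def slope_through_origin(xs, ys):
    """OLS slope without intercept: sum(x*y) / sum(x*x)."""
    return sum(a * b for a, b in zip(xs, ys)) / sum(a * a for a in xs)


def within_slope(x, y):
    """Subtract each unit's time mean from x and y, then pool and regress."""
    xd, yd = [], []
    for i in x:
        mx = sum(x[i]) / len(x[i])
        my = sum(y[i]) / len(y[i])
        xd += [v - mx for v in x[i]]
        yd += [v - my for v in y[i]]
    return slope_through_origin(xd, yd)


def first_difference_slope(x, y):
    """Subtract the previous period's value within each unit, then regress."""
    dx, dy = [], []
    for i in x:
        dx += [b - a for a, b in zip(x[i], x[i][1:])]
        dy += [b - a for a, b in zip(y[i], y[i][1:])]
    return slope_through_origin(dx, dy)


beta_within = within_slope(x, y)
beta_fd = first_difference_slope(x, y)
print(beta_within, beta_fd)
```

In both cases alphai drops out because it is constant over time within a unit, so it cancels when a unit mean or a previous-period value is subtracted.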
Interpretation of fixed effects
- df
- slopes?
- constant?
–> we lose n degrees of freedom for the estimation of betak
(we either lose n observations or estimate n additional parameters (the dummy variables))
- -> slopes are the same for all units, but constants differ
- -> the constant captures the combined effects of several unknown variables that differ between units but are stable over time
Pooled OLS
treat every observation in every time period as a fully independent data point
–> only appropriate if there are no unobserved unit-specific effects
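To illustrate why pooled OLS needs this condition, the sketch below runs pooled OLS on a small made-up panel in which an unobserved unit effect is correlated with x. The data and names are hypothetical; the true slope is set to 2.

```python
# Hypothetical noise-free panel: 3 units, 4 periods, true slope = 2.
# The unit effect alpha_i rises with the unit's level of x (assumed for illustration).
TRUE_BETA = 2.0
alpha = {"A": 0.0, "B": 10.0, "C": 20.0}
x = {"A": [1.0, 2.0, 3.0, 4.0],
     "B": [6.0, 7.0, 8.0, 9.0],
     "C": [11.0, 12.0, 13.0, 14.0]}

# Pooling: stack all unit-period observations and ignore which unit they belong to.
xs = [v for i in x for v in x[i]]
ys = [TRUE_BETA * v + alpha[i] for i in x for v in x[i]]


def ols_slope(xs, ys):
    """Slope of a simple OLS regression of y on x with an intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx


beta_pooled = ols_slope(xs, ys)
print(beta_pooled)  # well above the true slope of 2
```

The pooled slope comes out at roughly 3.86 instead of 2, because part of the between-unit difference in alphai is wrongly attributed to x.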
Random effects
- def
- why can't we use OLS here?
- adv over FE
random effects works under the assumption that the alphai are purely random, i.e. totally uncorrelated with Xj
- the error term uij (which here includes alphai) is typically subject to autocorrelation
- -> OLS is inefficient and the standard errors are estimated incorrectly
- -> Use GLS (generalized least squares)
- we do not lose n degrees of freedom –> more efficient
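For a balanced panel, the GLS step is often written as a partial ("quasi") demeaning. This is a sketch in assumed notation: v is the composite error, sigma_alpha² and sigma_u² are its variance components, and theta is the GLS weight; none of these symbols appear on the cards.

```latex
% Composite error under random effects
v_{it} = \alpha_i + u_{it}

% GLS as quasi-demeaning (balanced panel with T periods per unit)
y_{it} - \theta \bar{y}_i
  = \beta_0 (1 - \theta)
  + \sum_{k} \beta_k \left( X_{k,it} - \theta \bar{X}_{k,i} \right)
  + \left( v_{it} - \theta \bar{v}_i \right),
\qquad
\theta = 1 - \sqrt{\frac{\sigma_u^2}{\sigma_u^2 + T \sigma_\alpha^2}}

% \theta = 0 gives pooled OLS, \theta = 1 gives the within (fixed effects) estimator
```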
Hausman Test
- def
- tests if …
- what's H0?
to assess whether to use fixed or random effects estimation
–> tests if the unobserved effects alphai are independent of the explanatory variables xj
- H0: alphai is distributed independently of Xj
- -> if rejected: random effects regression (RER) will be subject to unobserved heterogeneity bias –> use fixed effects regression (FER) (alphai not random / dependent on Xj)
- -> if not rejected: FER and RER are both consistent, but FER will be inefficient –> use RER (alphai random / independent of Xj)
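The decision rule above can be sketched for the simplest case of a single slope coefficient. The function name and the input numbers are made up for illustration; with several coefficients the statistic uses the coefficient vectors and covariance matrices instead.

```python
import math


def hausman_single(b_fe, var_fe, b_re, var_re):
    """Hausman statistic for ONE coefficient (hypothetical helper).

    H = (b_FE - b_RE)^2 / (Var(b_FE) - Var(b_RE)), chi-squared with 1 df under H0.
    """
    diff_var = var_fe - var_re
    if diff_var <= 0:
        # Can happen in finite samples; the statistic is then not defined.
        raise ValueError("Var(b_FE) - Var(b_RE) must be positive")
    h = (b_fe - b_re) ** 2 / diff_var
    # Survival function of chi-squared with 1 df: P(chi2_1 > h) = erfc(sqrt(h / 2))
    p_value = math.erfc(math.sqrt(h / 2.0))
    return h, p_value


# Made-up estimates: FE slope 2.0 (variance 0.05), RE slope 2.6 (variance 0.01)
h_stat, p_value = hausman_single(2.0, 0.05, 2.6, 0.01)
decision = "use fixed effects" if p_value < 0.05 else "use random effects"
print(h_stat, p_value, decision)
```

A small p-value means H0 is rejected, so the fixed effects estimate is preferred; the 5% threshold here is a conventional choice, not something stated on the cards.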
Which types of models can be used for Panel data?
OLS, Logit, Probit, Tobit, autoregressive models, other time series models
Panel regression is not so much an … method, but a type of … or …
Panel regression is not so much an ANALYSIS method, but a type of DATA SET or DATA STRUCTURE