Tutorial 5 - Panel data, Pooled OLS, Random-Effects-Estimator, Fixed-Effects-Estimator Flashcards
What is panel data?
Panel data, or longitudinal data, are multi-dimensional data involving measurements over time. They contain observations of multiple phenomena obtained over multiple time periods for the same firms or individuals.
What does a linear panel data model look like?
- i = 1, …, N cross-sectional dimension (e.g. persons, firms, countries)
- t = 1, …, T time-series dimension (e.g. years)
What does a stacked model for all NT observations look like?
- i = 1, …, N cross-sectional dimension (e.g. persons, firms, countries)
- t = 1, …, T time-series dimension (e.g. years)
What does Pooled OLS estimator mean and how is it defined?
pooling all NT observations and applying OLS
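A minimal sketch of this definition, assuming the model y_it = x_it'β + ε_it with all N·T rows stacked into one vector y and one matrix X. The function name `pooled_ols` and the toy data are hypothetical, chosen only for illustration.

```python
import numpy as np

def pooled_ols(y, X):
    """Pooled OLS: ignore the panel structure and run OLS on all N*T rows.

    y : array of shape (N*T,)    stacked outcomes
    X : array of shape (N*T, K)  stacked regressors (add a column of ones
                                 if an intercept is wanted)
    """
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Hypothetical toy panel: N = 3 individuals, T = 2 periods, stacked by individual.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = 0.5 + 2.0 * x                      # noise-free, so POLS recovers (0.5, 2.0)
X = np.column_stack([np.ones_like(x), x])
beta_hat = pooled_ols(y, X)
```

The estimator never uses the individual index i, which is why its usual standard errors ignore any within-individual correlation of the errors.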
Which condition needs to be met for the POLS to be consistent?
POLS is only consistent under the assumption of weak exogeneity of the regressors
What is the variance of the POLS error term?
NT × NT matrix:
What may the error variance matrix Ω of POLS contain and what not?
it may contain:
- heteroscedasticity, i.e. different variances of error terms across individuals
- autocorrelation of error terms within individuals over time
BUT:
- Assumption: no correlation of error terms across individuals
What does the Breusch-Godfrey-Test test?
Test for serial correlation
What does a model with serial correlation look like?
What are the steps for applying the Breusch-Godfrey test?
- Run a (P)OLS regression to estimate the model
- Using the residuals û as an estimate of the error term u, estimate the auxiliary model: regress û on the original regressors and on its own first p lags
- Test (n − p)R² from this auxiliary regression against the critical value from χ²ₚ (the lag order p needs to be justified, e.g. by the number of observations per person (cluster))
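The steps above can be sketched as follows. This is an illustration under simplifying assumptions, not a reference implementation: it treats the data as a single time series (in a panel, lags must be built within each individual), drops the first p observations in the auxiliary regression, and uses the hypothetical function name `breusch_godfrey` with simulated AR(1) errors.

```python
import numpy as np

def breusch_godfrey(y, X, p=1):
    """Return the (n - p) * R^2 statistic of the auxiliary regression."""
    n = len(y)
    # Step 1: OLS of y on X, keep the residuals u_hat.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ beta
    # Step 2: regress u_hat_t on X_t and u_hat_{t-1}, ..., u_hat_{t-p}.
    lags = np.column_stack([u[p - j: n - j] for j in range(1, p + 1)])
    Z = np.column_stack([X[p:], lags])
    u_dep = u[p:]
    gamma, *_ = np.linalg.lstsq(Z, u_dep, rcond=None)
    resid = u_dep - Z @ gamma
    centered = u_dep - u_dep.mean()
    r2 = 1.0 - (resid @ resid) / (centered @ centered)
    # Step 3: test statistic, to be compared with a chi-square(p) critical value.
    return (n - p) * r2

# Simulated example (assumption): errors follow u_t = 0.8 * u_{t-1} + e_t.
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
e = rng.normal(size=n)
u_true = np.zeros(n)
for t in range(1, n):
    u_true[t] = 0.8 * u_true[t - 1] + e[t]
y = 1.0 + 2.0 * x + u_true
X = np.column_stack([np.ones(n), x])

stat = breusch_godfrey(y, X, p=1)
CHI2_1_CRIT_5PCT = 3.841               # 5% critical value of chi-square(1)
reject_no_serial_correlation = stat > CHI2_1_CRIT_5PCT
```

With strongly autocorrelated errors like these, the statistic should lie far above the critical value, so the null of no serial correlation is rejected.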
What is μᵢ in a model with random effects?
individual-specific, time-constant unobserved effect
What are the assumptions for Random Effects?
see (1), (2), (3) below
with (3): the regressors are strictly exogenous, i.e. the composite error term is mean-independent of the regressors of all (!) time periods. The assumption is violated if the individual-specific effect μᵢ is correlated with xᵢₜ!
What effect does μᵢ have on the composite error (in random effects models)?
Cov (ϵᵢₛ, ϵᵢₜ) ≠ 0
Note: RE explicitly models the serial correlation in a GLS framework
What does the variance between two individual observations look like in a model with random effects?
Show that there is serial correlation of error terms within individuals over time in a model with random effects.
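One way to show this, as a sketch under stated assumptions: the composite error is ϵᵢₜ = μᵢ + eᵢₜ, with Var(μᵢ) = σμ², Var(eᵢₜ) = σₑ², the eᵢₜ uncorrelated over time, and μᵢ uncorrelated with eᵢₜ. The symbol eᵢₜ for the idiosyncratic error is a notational assumption.

```latex
\begin{align*}
\operatorname{Var}(\epsilon_{it})
  &= \operatorname{Var}(\mu_i + e_{it}) = \sigma_\mu^2 + \sigma_e^2, \\
\operatorname{Cov}(\epsilon_{is}, \epsilon_{it})
  &= \operatorname{Cov}(\mu_i + e_{is},\; \mu_i + e_{it})
   = \operatorname{Var}(\mu_i) = \sigma_\mu^2 \neq 0 \quad (s \neq t), \\
\operatorname{Corr}(\epsilon_{is}, \epsilon_{it})
  &= \frac{\sigma_\mu^2}{\sigma_\mu^2 + \sigma_e^2} \quad (s \neq t).
\end{align*}
```

The shared term μᵢ appears in every period's error for individual i, so errors of the same individual are correlated no matter how far apart s and t are.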
Is there autocorrelation across individuals in a model with random effects?
No autocorrelation across individuals
What does Var(ϵ) look like in a model with random effects?
- Var(ϵ) = Ω, an NT × NT matrix.
- However, under the RE assumptions, it only includes two distinct parameters: σμ² and σₑ².
- In practice, Ω has to be consistently estimated -> Ω̂
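To make the two-parameter structure concrete, here is a small sketch. It assumes the composite error ϵᵢₜ = μᵢ + eᵢₜ, observations stacked individual by individual, and known variances; the function name `re_error_covariance` and the numeric values are hypothetical.

```python
import numpy as np

def re_error_covariance(N, T, sigma_mu2, sigma_e2):
    """Build the NT x NT error covariance implied by the RE assumptions.

    Each individual contributes one T x T block:
        sigma_e2 * I_T + sigma_mu2 * (matrix of ones)
    Blocks of different individuals are uncorrelated (zeros off the block diagonal).
    """
    block = sigma_e2 * np.eye(T) + sigma_mu2 * np.ones((T, T))
    return np.kron(np.eye(N), block)

Omega = re_error_covariance(N=2, T=3, sigma_mu2=0.5, sigma_e2=1.0)
```

Here the diagonal entries equal σμ² + σₑ², the off-diagonal entries within an individual's block equal σμ², and all entries linking different individuals are zero.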
What is the random effects estimator?
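A hedged sketch, consistent with the note that RE models the serial correlation in a GLS framework: with X the stacked NT × K regressor matrix and y the stacked NT × 1 outcome vector (both notational assumptions), the RE estimator is commonly written as feasible GLS using the estimated error covariance matrix.

```latex
\hat{\beta}_{RE}
  = \left( X' \hat{\Omega}^{-1} X \right)^{-1} X' \hat{\Omega}^{-1} y
```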
What are problems both with POLS and RE that Fixed Effects and First Differences can fix?
POLS and RE are inconsistent if the individual-specific effect μᵢ is correlated with xᵢₜ (example: individual-specific ability could be correlated with education, or with joining a union)
How do Fixed Effects and First Differences handle the case that the individual-specific effect μᵢ is correlated with xᵢₜ?
Fixed Effects and First Differences allow for this correlation by using only variation in xᵢₜ within individuals over time -> μᵢ cancels out -> controls for time-constant selection bias
How can you find an expression for fixed effects?
Fixed Effects = POLS on time-demeaned data:
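A minimal sketch of the within (time-demeaning) transformation, assuming the model yᵢₜ = xᵢₜ'β + μᵢ + eᵢₜ. The function names and the toy data are hypothetical; the toy outcome is noise-free and μᵢ is deliberately correlated with x.

```python
import numpy as np

def within_transform(v, ids):
    """Subtract each individual's time mean from its own observations."""
    out = v.astype(float).copy()
    for i in np.unique(ids):
        mask = ids == i
        out[mask] -= out[mask].mean(axis=0)
    return out

def fixed_effects(y, X, ids):
    """FE estimator: pooled OLS on the time-demeaned data (no intercept)."""
    beta, *_ = np.linalg.lstsq(within_transform(X, ids),
                               within_transform(y, ids), rcond=None)
    return beta

# Toy panel: N = 3 individuals, T = 3 periods, true slope 2.
ids = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 7.0, 8.0, 9.0, 12.0])
mu = np.array([0.0, 0.0, 0.0, 5.0, 5.0, 5.0, 10.0, 10.0, 10.0])  # correlated with x
y = 2.0 * x + mu

beta_fe = fixed_effects(y, x.reshape(-1, 1), ids)

# Pooled OLS with an intercept, for comparison: the slope picks up part of mu.
beta_pols, *_ = np.linalg.lstsq(np.column_stack([np.ones_like(x), x]), y, rcond=None)
```

Demeaning removes μᵢ exactly, so the FE slope equals the true value of 2 in this noise-free example, while the pooled slope is pushed upward by the correlation between μᵢ and x.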
How can you find an expression for first differences?
First Differences = POLS on first-differenced data:
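A minimal sketch of the first-difference transformation, assuming the model yᵢₜ = xᵢₜ'β + μᵢ + eᵢₜ and rows sorted by individual and then by time. The function name `first_difference` and the toy data are hypothetical.

```python
import numpy as np

def first_difference(y, X, ids):
    """FD estimator: pooled OLS of (y_it - y_i,t-1) on (x_it - x_i,t-1).

    Assumes rows are sorted by individual, then by time. Differences that
    would span two different individuals are dropped.
    """
    same_person = ids[1:] == ids[:-1]
    dy = (y[1:] - y[:-1])[same_person]
    dX = (X[1:] - X[:-1])[same_person]
    beta, *_ = np.linalg.lstsq(dX, dy, rcond=None)
    return beta

# Toy panel: N = 3 individuals, T = 3 periods, true slope 2, noise-free.
ids = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 7.0, 8.0, 9.0, 12.0])
mu = np.array([0.0, 0.0, 0.0, 5.0, 5.0, 5.0, 10.0, 10.0, 10.0])
y = 2.0 * x + mu

beta_fd = first_difference(y, x.reshape(-1, 1), ids)
```

Because μᵢ is constant over time, it cancels in every within-person difference, so the slope is recovered even though μᵢ is correlated with x.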
How would a fixed effects model be different if estimated on the data of the model shown below?
- variables educ and black drop out because they don’t change over time for a given individual
- FE is the same as a “Least Squares Dummy Variable Estimator” - here: POLS with a dummy for each individual
- Time-constant variables like educ drop out, but we can estimate the interaction of educ with time dummies -> i.e., we can’t estimate the level of the return to education, but we can estimate whether the return changed over time
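Two of the points above can be checked numerically in a small sketch: the within estimator coincides with POLS including a dummy for each individual, and a time-constant variable is wiped out by demeaning. All data values and names here are hypothetical; `educ` only mimics a variable that never changes within a person.

```python
import numpy as np

ids = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 7.0, 8.0, 9.0, 12.0])
educ = np.array([12.0, 12.0, 12.0, 16.0, 16.0, 16.0, 10.0, 10.0, 10.0])
y = np.array([2.1, 3.9, 6.2, 13.0, 15.1, 18.8, 26.0, 28.1, 33.9])

def demean_by_individual(v, ids):
    """Subtract each individual's time mean."""
    out = v.astype(float).copy()
    for i in np.unique(ids):
        out[ids == i] -= out[ids == i].mean()
    return out

# (a) Within estimator: OLS slope on time-demeaned data.
x_dm = demean_by_individual(x, ids)
y_dm = demean_by_individual(y, ids)
beta_within = (x_dm @ y_dm) / (x_dm @ x_dm)

# (b) LSDV: POLS of y on x plus one dummy per individual (no common intercept).
D = (ids[:, None] == np.unique(ids)[None, :]).astype(float)
coef, *_ = np.linalg.lstsq(np.column_stack([x, D]), y, rcond=None)
beta_lsdv = coef[0]

# (c) A time-constant regressor has no within variation left after demeaning.
educ_dm = demean_by_individual(educ, ids)
```

The two slope estimates agree up to floating-point error, and `educ_dm` is identically zero, which is why the level effect of such a variable cannot be estimated by FE.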
What do the different terms in the model mean?
index ᵢₜ: identifies not only the person but also the time of the observation
λₜ: time effects, i.e. control variables like a dummy for each year (one for each year, except the base year)
μᵢ: 545 values, one for each individual
Which four characteristics/assumptions describe the multiple linear regression model?
- Linearity:
- Independence:
- Strict Exogeneity:
- Error Variance:
What is linearity in this model?
Linearity: The model is linear in the parameters α, β, γ, the effect cᵢ and the error uᵢₜ.
What is independence in this model?
Independence: Xᵢ, zᵢ, yᵢ are i.i.d. (independent and identically distributed). The observations are independent across individuals but not necessarily across time.
What is exogeneity in this model?
Strict Exogeneity: The idiosyncratic error term uᵢₜ is assumed to be uncorrelated with the explanatory variables of all past, current and future time periods of the same individual. This is a strong assumption which, for example, rules out lagged dependent variables.
What is assumed about error variance in this model?
What is cᵢ in the random effects model?
In the random effects model, the individual-specific effect is a random variable that is uncorrelated with the explanatory variables
Which three additional assumptions make this a random effects model?
- Unrelated effects
- Effect variance
- Identifiability
What is the assumption of related effects in the fixed effects model?
That the assumption from the random effects does not hold:
E[cᵢ | Xᵢ, zᵢ] ≠ 0
assumes that the individual-specific effect is a random variable that is correlated with the explanatory variables of all past, current and future time periods of the same individual
What is the assumption of unrelated effects in the random effects model?
E[cᵢ | Xᵢ, zᵢ] = 0
assumes that the individual-specific effect is a random variable that is uncorrelated with the explanatory variables of all past, current and future time periods of the same individual
What is the assumption of effect variance in the fixed effects model?
No assumption of constant variance of the individual-specific effect is made
What is the assumption of effect variance in the random effects model?
Assumes constant variance of the individual-specific effect
What is cᵢ in the fixed effects model?
In the fixed effects model, the individual-specific effect is a random variable that is allowed to be correlated with the explanatory variables
Which three additional assumptions make this a fixed effects model?
- Related effects
- Effect variance
- Identifiability
What happens if this random effects model is estimated by Pooled OLS?
- The pooled OLS estimator of α, β and γ is unbiased
- It is consistent and approximately normally distributed
- However, the pooled OLS estimator is not efficient
- The usual standard errors of the pooled OLS estimator are incorrect -> Correct standard errors can be estimated with the cluster-robust covariance estimator treating each individual as a cluster
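The cluster-robust covariance estimator mentioned above can be sketched as a sandwich formula with one cluster per individual. This is a bare-bones illustration: it applies no small-sample correction, and the function name `cluster_robust_cov` and the simulated random effects data are assumptions.

```python
import numpy as np

def cluster_robust_cov(y, X, ids):
    """Pooled OLS with a cluster-robust covariance matrix (no finite-sample
    correction): (X'X)^-1 [ sum_i X_i' u_i u_i' X_i ] (X'X)^-1
    """
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ beta
    bread = np.linalg.inv(X.T @ X)
    meat = np.zeros((X.shape[1], X.shape[1]))
    for i in np.unique(ids):
        s = X[ids == i].T @ u[ids == i]    # summed score of cluster i, shape (K,)
        meat += np.outer(s, s)
    return beta, bread @ meat @ bread

# Simulated random effects panel (assumption): N = 50 individuals, T = 4 periods.
rng = np.random.default_rng(1)
N, T = 50, 4
ids = np.repeat(np.arange(N), T)
c = np.repeat(rng.normal(size=N), T)       # individual effect, unrelated to x
x = rng.normal(size=N * T)
y = 1.0 + 2.0 * x + c + rng.normal(size=N * T)
X = np.column_stack([np.ones(N * T), x])

beta, V = cluster_robust_cov(y, X, ids)
se_cluster = np.sqrt(np.diag(V))           # cluster-robust standard errors
```

Summing the scores within each individual before taking outer products is what allows arbitrary correlation of the errors within a person while still assuming independence across persons.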
What happens if this fixed effects model is estimated by Pooled OLS?
The pooled OLS estimators of α, β and γ are biased and inconsistent, because the variable cᵢ is omitted and potentially correlated with the other regressors
What is the first difference estimator?
- The first-difference (FD) estimator is an approach used to address the problem of omitted variables with panel data.
- The estimator is obtained by running a pooled OLS estimation for a regression of Δyᵢₜ on Δxᵢₜ