HANDOUT 5 Flashcards
Panel data allows us to…
observe multiple individuals over multiple points in time.
restriction on individuals for panel data
MUST be the SAME individuals that we follow over time.
Strongly balanced data =
same number of observations for every individual
Why may the data be unbalanced? Is this an issue?
Some individuals drop out of the study = attrition. Problem if attrition rate = f(X variables) –> self-selection bias = sample no longer representative.
2 types of unobserved heterogeneity
- unobserved individual heterogeneity = ai
- unobserved time heterogeneity = dt
Omitted relevant variable formula when Ai is omitted (pushed into εit)
E(b1) = B1 + B2 COV(Xit, Ai) / Var(Xit)
E(b1)≠B1
Pooled OLS =
same intercept, same slope
Use all NT observations
No dummies = no heterogeneity
Treat same individual at 2 points in time as 2 separate individuals.
Pooled OLS equation
Yit = alpha + BXit + εit
4 assumptions for pooled OLS
- Each individual randomly selected
- No individual heterogeneity
- No time-specific heterogeneity = parameters constant across time (no structural change)
- E(εit | Xit) = 0 - strict exogeneity
Are estimates for pooled OLS good?
They’re consistent for large N / large T / large NT.
beta interpretation for pooled ols
average increase in Y for unit increase in X, averaged across all individuals and all times.
formula for pooled OLS estimate of b
sum i=1,..,N sum t=1,…,T (Xit - X bar)(Yit - Y bar)
/ sum i=1,..,N sum t=1,…,T (Xit - X bar)^2
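The estimator above can be sketched numerically; a minimal simulation (the data-generating process and true slope of 2 are my assumptions, not from the notes):

```python
import numpy as np

# Toy balanced panel: N individuals, T periods; true slope = 2.0 by assumption
rng = np.random.default_rng(0)
N, T = 50, 5
X = rng.normal(size=(N, T))
Y = 1.0 + 2.0 * X + rng.normal(scale=0.5, size=(N, T))

# Pooled OLS slope: de-mean by the single grand mean over all NT observations
x = X.ravel() - X.mean()
y = Y.ravel() - Y.mean()
b_pooled = (x * y).sum() / (x ** 2).sum()
```

With no individual heterogeneity in the simulated data, b_pooled lands close to the true slope.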
How can we have different intercepts and different slopes for all individuals?
Yit = alpha i + Bi Xit + εit
Run N separate regressions - get alpha i and bi for every individual. get sigma^2 for every individual.
We can only do OLS with different intercepts and different slopes if…
T is large enough for every i = consistency.
bi formula for different slopes, different intercepts
bi = sum t=1,..,T (Xit - Xi bar)(Yit - Yi bar) /
sum t=1,..,T (Xit - Xi bar)^2
De-mean using individual’s own time average.
How else can we run a regression that allows intercepts and slopes to differ for all individuals?
Run one large regression with additive and multiplicative dummies for N-1 individuals - this way we only get one sigma^2 estimate.
Fixed effects / within-groups model allows…
Different intercepts, same slopes
= allows for individual heterogeneity ai
Equation for FE model
Yit = alpha + BXit + ai + εit
b formula for FE
b = sum i=1,..,N sum t=1,…,T (Xit - Xi bar)(Yit - Yi bar)
/ sum i=1,..,N sum t=1,…,T (Xit - Xi bar)^2
Why is FE model also known as “within-groups” model?
Because we de-mean by an individual's own time average (then pool across all individuals)
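A sketch of within de-meaning on simulated data; the setup (an individual effect a_i correlated with X, true slope 2) is my own assumption, chosen so pooled OLS would be biased but the within estimator is not:

```python
import numpy as np

# Simulated panel where the individual effect a_i is correlated with X
rng = np.random.default_rng(1)
N, T = 200, 4
a = rng.normal(size=(N, 1))
X = a + rng.normal(size=(N, T))
Y = 1.0 + 2.0 * X + a + rng.normal(scale=0.5, size=(N, T))

# De-mean each individual by its own time average, then run OLS
Xw = X - X.mean(axis=1, keepdims=True)   # Xit - Xi bar
Yw = Y - Y.mean(axis=1, keepdims=True)   # Yit - Yi bar
b_fe = (Xw * Yw).sum() / (Xw ** 2).sum()
```

De-meaning wipes out a_i, so b_fe recovers the true slope despite the correlation between a_i and X.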
How many parameter estimates do we get for FE?
1 x slope coefficient Beta
N x intercept estimates
Slope coefficients in FE are consistent if…
Large N / large T / large NT
Intercepts in FE are consistent if…
large T
De-meaning method for FE
Yit = alpha + B1 Xit + B2 Ai + εit
Yi bar = alpha + B1 Xi bar + B2 Ai + εi bar
(Yit - Yi bar) = B1(Xit - Xi bar) + (εit - εi bar)
Does it matter if we omit Ai for FE model?
NO - because de-meaning eliminates Ai anyway = we can control for unobserved individual heterogeneity. So don’t worry about bias.
3 stages of partitioned regression for FE
- Regress Yit on Ai & save resid (Yit - Yi bar) = Y tilde
- Regress Xit on Ai & save resid (Xit - Xi bar) = X tilde
- Regress Y tilde on X tilde
= we’ve stripped out the individual heterogeneity.
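The three stages can be checked numerically: the slope from de-meaned data equals the slope on X from one big regression with a full set of N individual dummies (LSDV). The simulated setup is my own:

```python
import numpy as np

# Simulated panel with individual effects
rng = np.random.default_rng(2)
N, T = 30, 5
a = rng.normal(size=(N, 1))
X = a + rng.normal(size=(N, T))
Y = 2.0 * X + a + rng.normal(scale=0.5, size=(N, T))

# LSDV: X in the first column, one dummy per individual
D = np.kron(np.eye(N), np.ones((T, 1)))          # NT x N dummy matrix
Z = np.column_stack([X.ravel(), D])
b_lsdv = np.linalg.lstsq(Z, Y.ravel(), rcond=None)[0][0]

# Within: regress de-meaned Y on de-meaned X
Xw = (X - X.mean(axis=1, keepdims=True)).ravel()
Yw = (Y - Y.mean(axis=1, keepdims=True)).ravel()
b_within = (Xw * Yw).sum() / (Xw ** 2).sum()
```

The two slopes agree to machine precision: stripping out the heterogeneity via residuals is the same as including the dummies.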
In a FE model, our X variables must…
MUST vary with time. So we cannot estimate coefficients on race, gender etc. We can only control for them as part of FE.
Ai captures…
all the time invariant variables.
Random effects model equation
Yit = alpha + BXit + ai + εit
How does RE differ to FE? What 3 assumptions do we make?
WE make some assumptions about ai, which are random drawings from a distribution:
E(ai) = 0
V(ai) = sigma^2 a - time invariant
COV(ai, aj) = 0 for i ≠ j
In a RE model, what happens to ai?
It becomes part of the error term
Uit = ai + εit
Does Uit in RE model satisfy CLRM assumptions?
- E(Uit) = 0 - yes
- V(Uit) = sigma^2 a + sigma^2 ε - yes, time invariant
- COV(Uit, Uis) = sigma^2 a ≠ 0 for t ≠ s - violates CLRM.
Why is there a non-zero covariance between Uit and Uis for RE?
Looks like serial correlation
Because ai is time invariant and is part of error term.
Solution to serial correlation in RE model
Transform the equation so that the error term is serially uncorrelated within i.
lambda to transform RE model =
Lambda = 1 - sqrt[sigma^2 ε / (sigma^2 ε + T sigma^2 a)]
Transformed RE model equation
(Yit - lambda Yi bar) = (1 - lambda) alpha +
B(Xit - lambda Xi bar) + (Rit - lambda Ri bar)
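A sketch of this quasi-de-meaning with the variances treated as known (in practice they must be estimated, making this feasible GLS); the simulated setup with a_i independent of X is my assumption:

```python
import numpy as np

# RE world: a_i independent of X, true slope = 2
rng = np.random.default_rng(3)
N, T = 300, 5
sigma_a, sigma_e = 1.0, 0.5
a = rng.normal(scale=sigma_a, size=(N, 1))
X = rng.normal(size=(N, T))
Y = 1.0 + 2.0 * X + a + rng.normal(scale=sigma_e, size=(N, T))

# lambda = 1 - sqrt(sigma_e^2 / (sigma_e^2 + T * sigma_a^2))
lam = 1 - np.sqrt(sigma_e**2 / (sigma_e**2 + T * sigma_a**2))
Xq = X - lam * X.mean(axis=1, keepdims=True)     # Xit - lambda * Xi bar
Yq = Y - lam * Y.mean(axis=1, keepdims=True)
x = Xq.ravel() - Xq.mean()                       # OLS with intercept on the
y = Yq.ravel() - Yq.mean()                       # transformed data
b_re = (x * y).sum() / (x ** 2).sum()
```

Here lambda sits strictly between 0 (pooled OLS) and 1 (FE), and OLS on the transformed data recovers the slope.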
If lambda=0, what does RE model become?
lambda = 0 –> POOLED OLS
If lambda = 0, sigma^2 a = 0
So NO individual heterogeneity
Yit = alpha + BXit + Rit
If lambda=1, what does RE model become?
lambda = 1 –> FE model
If lambda = 1, sigma^2 a –> infinity
HUGE individual heterogeneity
Yit - Yi bar = B(Xit - Xi bar) + (Rit - Ri bar)
If 0 < lambda < 1, which model is most efficient?
RE model
FE vs RE when COV(ai, Xit) = 0
RE = unbiased, efficient; FE = unbiased, but inefficient
FE vs RE when COV(ai, Xit) ≠ 0
RE = still efficient, but BIASED; FE = still unbiased, but inefficient
Why is FE model inefficient? But then why is it useful?
inefficient as estimate a lot of parameters - different intercept for every individual.
But good if huge individual heterogeneity.
Test for FE vs RE
Hausman Test
H0: COV(ai, Xit) = 0 - RE correct and efficient
H1: COV(ai, Xit)≠0 - RE incorrect
FE correct under either
Test stat for Hausman test
H = (b FE - b RE)^2 / [V(b FE) - V(b RE)]
Hausman test: CVs from…
Chi-squared distribution
dof = number of slope coefficients we estimate
Hausman test stat under H0 and H1
H0: b FE = b RE so H –> 0
H1: b FE ≠ b RE so H –> infinity as we square
Under both V(b FE) > V(b RE) so denom > 0
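With a single slope the statistic is a scalar; a sketch with made-up numbers (the estimates and variances below are purely illustrative):

```python
# Scalar Hausman statistic with illustrative (made-up) estimates
b_fe, b_re = 1.93, 2.01    # hypothetical slope estimates
v_fe, v_re = 0.010, 0.004  # V(b_FE) > V(b_RE): RE is the more efficient one
H = (b_fe - b_re) ** 2 / (v_fe - v_re)
# 5% critical value from chi-squared with 1 dof is 3.84
reject_re = H > 3.84       # here H is about 1.07, so we keep RE
```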
We can also do the Hausman test for…
OLS vs IV
H0: COV(Xi, εi) = 0
OLS is efficient, but biased under H1
IV is always ok, but inefficient
First difference model equation
change Yit = B change Xit + change εit
Ai eliminated.
In first difference model, t goes from…
t = 2,…,T
2 because we have a first difference so decrease no observations by 1.
OLS estimate of b for first difference model
b = sum i=1,…,N sum t=2,…,T (change Xit - change X bar)(change Yit - change Y bar)
/ sum i=1,…,N sum t=2,…,T (change Xit - change X bar)^2
In first difference model, change X bar =
change X bar = sums change Xit / N(T - 1)
When is first difference model same as FE?
when T=2
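A simulated check of this equivalence (my own toy setup): with T = 2 the first-difference slope from a regression through the origin and the within (FE) slope are algebraically identical.

```python
import numpy as np

# Two-period panel with individual effects, true slope = 2
rng = np.random.default_rng(4)
N, T = 100, 2
a = rng.normal(size=(N, 1))
X = a + rng.normal(size=(N, T))
Y = 2.0 * X + a + rng.normal(scale=0.5, size=(N, T))

dX = np.diff(X, axis=1).ravel()          # one difference per individual
dY = np.diff(Y, axis=1).ravel()
b_fd = (dX * dY).sum() / (dX ** 2).sum() # FD regression through the origin

Xw = X - X.mean(axis=1, keepdims=True)   # within estimator
Yw = Y - Y.mean(axis=1, keepdims=True)
b_fe = (Xw * Yw).sum() / (Xw ** 2).sum()
```

With T = 2 each de-meaned pair is just ±half the difference, so the two ratios coincide exactly.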
When is first difference model NOT same as FE?
When T>2
When T>2, worry with first difference model =
The error term is MA(1) process
change εit = εit - εi,t-1
Complications with dynamic model for panel data
OLS -> biased estimators
As T –> infinity, with N fixed, estimates consistent
But usually in panel data T small & large N
What is the bias caused by in dynamic model?
BY having to eliminate ai from each observation –> correlation of order (1/T) between lagged dependent variable and residuals.
Solution to OLS bias in dynamic model
Take first differences & then use instruments to do IV estimation. Since the differenced error term = MA(1) = 1 period memory, use yi,t-2 or anything before as an instrument for change yi,t-1.
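A sketch in the spirit of this solution (Anderson–Hsiao-style); the AR(1) panel below with true rho = 0.5 is my own simulated setup:

```python
import numpy as np

# Dynamic panel y_it = rho * y_i,t-1 + a_i + e_it
rng = np.random.default_rng(5)
N, T, rho = 5000, 6, 0.5
a = rng.normal(size=N)
y = np.zeros((N, T))
y[:, 0] = a + rng.normal(size=N)
for t in range(1, T):
    y[:, t] = rho * y[:, t - 1] + a + rng.normal(size=N)

dy = np.diff(y, axis=1)       # first differences remove a_i
dy_t = dy[:, 1:].ravel()      # change y_it for t = 3,...,T
dy_l = dy[:, :-1].ravel()     # change y_i,t-1 (the regressor)
z = y[:, :-2].ravel()         # instrument: the level y_i,t-2

b_ols = (dy_l * dy_t).sum() / (dy_l ** 2).sum()   # biased: regressor is
                                                  # correlated with MA(1) error
b_iv = (z * dy_t).sum() / (z * dy_l).sum()        # simple IV estimate
```

OLS on the differenced equation is badly biased downward, while the IV estimate using y_i,t-2 as instrument recovers rho.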