Panel data-FE Flashcards
The model for fixed effects is
y_it = β0 + β1 x_it + c_i + u_it
What are some examples of what c_i could be?
c_i could represent the effects of ability, health, motivation, intelligence, parental resources, managerial quality, organizational culture, state/local policies or regulations, etc.
What does the regression of the de-meaned y on de-meaned x look like, mechanically?
y_it − ȳ_i = β1(x_it − x̄_i) + (u_it − ū_i)
What does it mean to de-mean in the context of fixed effects?
Within each panel unit i, take the average over t on both sides of the model and subtract that average from each observation it. Because c_i and β0 do not vary over t, they equal their own averages and drop out.
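The de-meaning step can be sketched in a few lines of plain Python. This is a minimal illustration, not a production estimator: the toy panel, its unit labels and the values of c_i are made up, and the data are built without noise so that the true slope (2) is known.

```python
# Toy balanced panel (hypothetical data): unit -> [(x, y), ...] over time.
# Built as y = 1 + 2*x + c_i with c_a = 0, c_b = 5, c_c = -3 and no noise.
panel = {
    "a": [(1.0, 3.0), (2.0, 5.0), (4.0, 9.0)],
    "b": [(0.0, 6.0), (3.0, 12.0), (6.0, 18.0)],
    "c": [(2.0, 2.0), (5.0, 8.0), (5.0, 8.0)],
}

def demean(values):
    """Subtract the mean of the list from each element."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def slope(xs, ys):
    """No-intercept OLS slope: sum(x*y) / sum(x*x)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Within transformation: de-mean x and y separately inside each unit.
x_within, y_within = [], []
for obs in panel.values():
    x_within += demean([x for x, _ in obs])
    y_within += demean([y for _, y in obs])
beta_within = slope(x_within, y_within)

# Pooled OLS for comparison: de-mean with the grand means only,
# so the c_i stay in the error term.
x_all = [x for obs in panel.values() for x, _ in obs]
y_all = [y for obs in panel.values() for _, y in obs]
beta_pooled = slope(demean(x_all), demean(y_all))

print("within:", beta_within, "pooled:", beta_pooled)
```

In this toy panel the within slope recovers 2, while the pooled slope does not, because c_i is correlated with x.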
What are some examples of time-invariant explanatory variables that fall out of the FE model?
gender, race
Why do time-invariant explanatory variables fall out of the FE model?
Every observation equals its within-unit mean, so the within transformation sets the variable to zero for every observation and its coefficient cannot be estimated.
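A quick check of this point, using a hypothetical time-invariant indicator observed in three periods for three made-up units:

```python
# Hypothetical panel: a time-invariant indicator, three periods per unit.
female = {"a": [1, 1, 1], "b": [0, 0, 0], "c": [1, 1, 1]}

def demean(values):
    """Subtract the mean of the list from each element."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

# Apply the within transformation unit by unit.
female_within = [d for series in female.values() for d in demean(series)]

# Total within-unit variation: the denominator of the within slope.
within_variation = sum(d * d for d in female_within)

print(female_within)
print(within_variation)
```

Every de-meaned value is zero, so the within variation is zero and there is nothing left to identify a slope.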
Write out the first difference model
∆y_it = β1 ∆x_it + ∆u_it
How do OLS assumptions apply to the first difference model?
The new error term ∆u_it must be uncorrelated with the new explanatory variable ∆x_it.
This requires no cross-period correlation between u and x, which is called strict exogeneity.
The x_it must vary over time for at least some i, or else they difference out (the same as under the within transformation).
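What does strict exogeneity require?
No cross-period correlations between u and x: u_it is uncorrelated with x_is for every pair of periods s and t (past, current and future), not only within the same period.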
In theory, what happens to the constant when you estimate the first difference model?
It differences out. If you want, you can still include a constant in the differenced equation to allow for a common year-to-year trend.
What happens when you apply the first difference model to multiple years?
Each year of data is differenced with the previous year, so you lose the first year in your dataset for every unit.
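A small sketch of first differencing on a made-up, noise-free panel (true slope 2; unit labels and values are hypothetical). It shows both the lost first period and the slope estimated from the differences, here without a constant.

```python
# Toy balanced panel (hypothetical data): y = 1 + 2*x + c_i, no noise.
panel = {
    "a": [(1.0, 3.0), (2.0, 5.0), (4.0, 9.0)],
    "b": [(0.0, 6.0), (3.0, 12.0), (6.0, 18.0)],
    "c": [(2.0, 2.0), (5.0, 8.0), (5.0, 8.0)],
}

def first_difference(series):
    """Difference each period with the previous one; the first period is lost."""
    return [curr - prev for prev, curr in zip(series, series[1:])]

dx, dy = [], []
for obs in panel.values():
    dx += first_difference([x for x, _ in obs])
    dy += first_difference([y for _, y in obs])

n_levels = sum(len(obs) for obs in panel.values())  # N * T observations
n_diffs = len(dx)                                   # N * (T - 1) observations

# No-constant OLS on the differenced data.
beta_fd = sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)

print(n_levels, n_diffs, beta_fd)
```

With 3 units and 3 periods, 9 level observations become 6 differences, and c_i no longer appears in the differenced equation.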
True or false: In the one-way fixed effects model, we treat c_i as a parameter to be estimated
True
Mechanically, what are we doing when we estimate c_i?
Effectively we are allowing for a unique intercept for every cross-sectional unit i. This is feasible to estimate since each i is observed multiple times.
What is the model for fixed effects?
y_it = β0 + β1 x_it + c_i + u_it
What parameters are we estimating when using FE?
The intercept (β0), the slope (β1), and the fixed effects (which are n-1 intercepts)
What does the LSDV (least squares dummy variable) model do?
It includes (n-1) dummy variables in the regression, one for each cross-sectional unit except an omitted base unit.
Drawbacks of the LSDV approach?
- time-consuming (computationally) when n is large
- soaks up degrees of freedom
- often not interested in the fixed effects themselves (an exception is the teacher effects work)
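The LSDV idea can be sketched in plain Python by building the dummy columns by hand and solving the normal equations. Everything here is illustrative: the toy panel is made up and noise-free (y = 1 + 2x + c_i with c_a = 0, c_b = 5, c_c = -3), and `solve` is a small hypothetical helper, not a library routine.

```python
# Toy balanced panel (hypothetical data): y = 1 + 2*x + c_i, no noise.
panel = {
    "a": [(1.0, 3.0), (2.0, 5.0), (4.0, 9.0)],
    "b": [(0.0, 6.0), (3.0, 12.0), (6.0, 18.0)],
    "c": [(2.0, 2.0), (5.0, 8.0), (5.0, 8.0)],
}
units = list(panel)  # the first unit ("a") is the omitted base category

# Design matrix rows: [1, x, d_b, d_c] -> intercept, x, and n-1 unit dummies.
X, y = [], []
for unit, obs in panel.items():
    dummies = [1.0 if unit == u else 0.0 for u in units[1:]]
    for x_it, y_it in obs:
        X.append([1.0, x_it] + dummies)
        y.append(y_it)

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    z = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z

# Normal equations: (X'X) beta = X'y.
k = len(X[0])
XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
beta = solve(XtX, Xty)

print([round(b, 6) for b in beta])  # [intercept, slope, d_b, d_c]
```

The slope on x matches the within estimate for the same data, and the dummy coefficients are the unit intercepts measured relative to the base unit. The design matrix gains one column per additional unit, which is the degrees-of-freedom cost noted above.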
When is FE more efficient than first difference?
FE is more efficient (smaller standard errors) than first differencing if the error terms are serially uncorrelated and T > 2
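The T > 2 condition matters because with exactly two periods the two estimators coincide. A minimal check on a made-up two-period panel with noise, comparing the within slope to the first-difference slope estimated without a constant:

```python
# Hypothetical two-period panel: unit -> [(x, y) in period 1, (x, y) in period 2].
panel = {
    "a": [(1.0, 2.0), (3.0, 7.0)],
    "b": [(2.0, 1.0), (5.0, 4.0)],
    "c": [(0.0, 3.0), (4.0, 6.0)],
}

def demean(values):
    """Subtract the mean of the list from each element."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def slope(xs, ys):
    """No-intercept OLS slope: sum(x*y) / sum(x*x)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Within (FE) estimator: de-mean inside each unit.
xw, yw = [], []
for obs in panel.values():
    xw += demean([x for x, _ in obs])
    yw += demean([y for _, y in obs])
beta_fe = slope(xw, yw)

# First-difference estimator: period 2 minus period 1, no constant.
dx = [obs[1][0] - obs[0][0] for obs in panel.values()]
dy = [obs[1][1] - obs[0][1] for obs in panel.values()]
beta_fd = slope(dx, dy)

print(beta_fe, beta_fd)
```

With T = 2 each de-meaned value is plus or minus half the first difference, so the two slopes are algebraically identical. The efficiency comparison only becomes meaningful with three or more periods.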
True or false: FE assumes no correlation in u across panel units i
True
Are the estimates of the fixed effects themselves consistent and unbiased in large samples?
The estimates of the fixed effects themselves (c_i) are unbiased but inconsistent in large samples. (Why? As the number of panel units grows (N → ∞) with T fixed, the number of parameters to estimate grows too, while each c_i is still estimated from only T observations.)
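What model does Stata fit when you run xtreg with the fe option?
(y_it − ȳ_i + ȳ) = β0 + β1(x_it − x̄_i + x̄) + (u_it − ū_i + ū), where ȳ, x̄ and ū without a subscript are grand means over all observations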
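The transformed regression above can be mimicked in plain Python to see what adding the grand means back does. This is a sketch of the algebra on a made-up, noise-free panel (y = 1 + 2x + c_i with c_a = 0, c_b = 5, c_c = -3), not a reproduction of Stata's implementation.

```python
# Toy balanced panel (hypothetical data): y = 1 + 2*x + c_i, no noise.
panel = {
    "a": [(1.0, 3.0), (2.0, 5.0), (4.0, 9.0)],
    "b": [(0.0, 6.0), (3.0, 12.0), (6.0, 18.0)],
    "c": [(2.0, 2.0), (5.0, 8.0), (5.0, 8.0)],
}

def mean(values):
    return sum(values) / len(values)

# Grand means over all observations.
x_bar = mean([x for obs in panel.values() for x, _ in obs])
y_bar = mean([y for obs in panel.values() for _, y in obs])

# Transformed variables: subtract the unit mean, add the grand mean back.
x_star, y_star = [], []
for obs in panel.values():
    x_i_bar = mean([x for x, _ in obs])
    y_i_bar = mean([y for _, y in obs])
    x_star += [x - x_i_bar + x_bar for x, _ in obs]
    y_star += [y - y_i_bar + y_bar for _, y in obs]

# Ordinary OLS with an intercept on the transformed data.
mx, my = mean(x_star), mean(y_star)
sxy = sum((x - mx) * (y - my) for x, y in zip(x_star, y_star))
sxx = sum((x - mx) ** 2 for x in x_star)
beta1 = sxy / sxx
beta0 = my - beta1 * mx

print(beta1, beta0)
```

The slope equals the within estimate, and the reported intercept equals the average of the unit-specific intercepts (here 1 + (0 + 5 - 3)/3 = 5/3), which is how a constant can appear in fixed-effects output even though each c_i was removed.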
Things to ask yourself when you run fe
Where is the identification coming from?
How much variation is there within panel units?
What happens when there is little variation within panel units?
You risk imprecise estimates: β1 is identified only from within-unit variation, so little of it means large standard errors.
xtreg output: What does the F test tell you? What is the null?
F-test for the joint significance of the fixed effects (the null hypothesis H0 is that all fixed effects are zero). If H0 is rejected, unit-specific intercepts belong in the model, and regular (pooled) OLS would provide inconsistent estimates whenever those effects are correlated with x. In practice, H0 is usually rejected.
xtreg output: what does R-squared within tell you?
The share of within-unit variation in y (deviations from unit means, y_it − ȳ_i) “explained” by the within-unit deviations in x (x_it − x̄_i)
xtreg output: what does R-squared between tell you?
The share of variance in the group means ȳ_i “explained” by the group means of the x’s, x̄_i
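With a single regressor, both quantities reduce to squared correlations, which makes them easy to compute by hand. The sketch below uses a made-up noisy panel and is meant only to illustrate the two definitions; it is an assumption that this matches any particular package's output to the last digit.

```python
# Hypothetical noisy panel: unit -> [(x, y), ...] over three periods.
panel = {
    "a": [(1.0, 2.0), (2.0, 5.0), (4.0, 8.0)],
    "b": [(0.0, 6.0), (3.0, 11.0), (6.0, 19.0)],
    "c": [(2.0, 1.0), (5.0, 9.0), (6.0, 8.0)],
}

def mean(values):
    return sum(values) / len(values)

def squared_corr(xs, ys):
    """Squared sample correlation, i.e. R-squared of a one-regressor OLS."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

# Within: deviations from unit means, pooled over all units.
x_dev, y_dev = [], []
# Between: one pair of unit means per unit.
x_means, y_means = [], []
for obs in panel.values():
    xs = [x for x, _ in obs]
    ys = [y for _, y in obs]
    x_means.append(mean(xs))
    y_means.append(mean(ys))
    x_dev += [x - x_means[-1] for x in xs]
    y_dev += [y - y_means[-1] for y in ys]

r2_within = squared_corr(x_dev, y_dev)
r2_between = squared_corr(x_means, y_means)

print(r2_within, r2_between)
```

The two numbers can differ a lot: a model may track changes within units closely while saying little about why unit averages differ, or the reverse.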
xtreg output: what does sigma_u tell you?
An estimate of the standard deviation of the fixed effects (c_i)
Assumptions for FE?
FE.1: linear model y_it = β1 x_it1 + … + βk x_itk + c_i + u_it
FE.2: cross-sectional units are a random sample
FE.3: x_it varies over time for some i, no perfect collinearity
FE.4: for all t, E(u_it | X_i, c_i) = 0, i.e. the expected value of u given x in all time periods is zero (strict exogeneity)
FE.5: Var(u_it | X_i, c_i) = Var(u_it) = σ_u² (homoskedasticity)
FE.6: for t ≠ s, errors are uncorrelated: Cov(u_it, u_is | X_i, c_i) = 0 (no serial correlation)
What assumptions do you need for unbiasedness for FE and first difference?
FE.1: linear model y_it = β1 x_it1 + … + βk x_itk + c_i + u_it
FE.2: cross-sectional units are a random sample
FE.3: x_it varies over time for some i, no perfect collinearity
FE.4: for all t, E(u_it | X_i, c_i) = 0, i.e. the expected value of u given x in all time periods is zero (strict exogeneity)
What assumptions do you need for FE model to be BLUE?
FE.1: linear model y_it = β1 x_it1 + … + βk x_itk + c_i + u_it
FE.2: cross-sectional units are a random sample
FE.3: x_it varies over time for some i, no perfect collinearity
FE.4: for all t, E(u_it | X_i, c_i) = 0, i.e. the expected value of u given x in all time periods is zero (strict exogeneity)
FE.5: Var(u_it | X_i, c_i) = Var(u_it) = σ_u² (homoskedasticity)
FE.6: for t ≠ s, errors are uncorrelated: Cov(u_it, u_is | X_i, c_i) = 0 (no serial correlation)
Under which assumption is fixed effects more efficient than the first difference model?
When FE.6 holds: for t ≠ s, errors are uncorrelated, Cov(u_it, u_is | X_i, c_i) = 0 (no serial correlation).
Where is variation in FE (within) model coming from?
uses deviations from unit means, e.g., mean “pre” vs. mean “post”
Where is variation in first difference model coming from?
uses variation in successive time periods, e.g., just prior to and just after a “treatment” (a change in x)
Is the assumption that the errors u_it are iid typically satisfied in panels?
No. With repeat observations on the same cross-sectional unit, it is likely that errors are correlated across observations for the same i.
How do you cluster standard errors in FE?
The “cluster” is typically the cross-sectional unit, although when the regressor of interest is aggregated at a higher level (e.g., state), you can cluster at that level. The theory requires a large number of clusters, and any higher-level cluster must nest the cross-sectional units.
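For a single regressor, a cluster-robust standard error for the within estimator can be written out by hand. The sketch below clusters on the cross-sectional unit, uses a made-up noisy panel, and deliberately applies no finite-sample correction to the clustered variance, so it is an assumption that software output would match only up to such a correction. Three clusters is far too few for the theory to apply; the numbers only illustrate the mechanics.

```python
import math

# Hypothetical noisy panel: unit -> [(x, y), ...] over three periods.
panel = {
    "a": [(1.0, 2.0), (2.0, 5.0), (4.0, 8.0)],
    "b": [(0.0, 6.0), (3.0, 11.0), (6.0, 19.0)],
    "c": [(2.0, 1.0), (5.0, 9.0), (6.0, 8.0)],
}

def demean(values):
    """Subtract the mean of the list from each element."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

# Within-transformed data, kept grouped by unit (the cluster).
within = {
    unit: (demean([x for x, _ in obs]), demean([y for _, y in obs]))
    for unit, obs in panel.items()
}

sxx = sum(x * x for xs, _ in within.values() for x in xs)
sxy = sum(x * y for xs, ys in within.values() for x, y in zip(xs, ys))
beta = sxy / sxx

# Per-cluster score: sum over t of (de-meaned x) * (residual).
scores, ssr = [], 0.0
for xs, ys in within.values():
    resid = [y - beta * x for x, y in zip(xs, ys)]
    scores.append(sum(x * e for x, e in zip(xs, resid)))
    ssr += sum(e * e for e in resid)

# Cluster-robust variance (no finite-sample correction, by assumption).
se_cluster = math.sqrt(sum(s * s for s in scores)) / sxx

# Conventional variance: residual df = N*T - N - 1 (one slope, N intercepts).
n_obs = sum(len(obs) for obs in panel.values())
df = n_obs - len(panel) - 1
se_conventional = math.sqrt(ssr / df / sxx)

print(beta, se_cluster, se_conventional)
```

The clustered version sums the score within each unit before squaring, which is what lets errors for the same i be correlated across periods.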
Two advantages of fixed effects models?
- Unobserved c_i can be correlated with the explanatory variables
- β1 is estimated using within-group (i) variation in x and y
Five disadvantages of fixed effects models?
- Cannot estimate slope coefficients for time-invariant x
- Fixed effects “remove” a lot of the variation in y
- The “within” model is less efficient (higher standard errors)
- There may be more measurement error (and attenuation bias) when relying on within-group changes vs. levels
- Group intercepts use up a lot of degrees of freedom