Panel Data Flashcards
What is panel data?
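Panel data, also called longitudinal data, are data for multiple entities in which each entity is observed at two or more time periods.
Balanced & Unbalanced: a panel is balanced if every entity is observed in every period, and unbalanced if some observations are missing.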
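The balanced/unbalanced distinction can be checked mechanically. The following is a minimal sketch, assuming the panel is stored as (entity, period) pairs; the function name `is_balanced` and the sample data are hypothetical and not taken from any library.

```python
from collections import defaultdict

def is_balanced(observations):
    """Return True if every entity is observed in every period seen in the data."""
    periods_by_entity = defaultdict(set)
    for entity, period in observations:
        periods_by_entity[entity].add(period)
    # The union of all observed periods is the full set a balanced panel must cover.
    all_periods = set().union(*periods_by_entity.values())
    return all(periods == all_periods for periods in periods_by_entity.values())

# Hypothetical (entity, period) pairs
balanced = [("A", 1), ("A", 2), ("B", 1), ("B", 2)]
unbalanced = [("A", 1), ("A", 2), ("B", 1)]  # entity B is missing period 2
```

Here `is_balanced(balanced)` is true and `is_balanced(unbalanced)` is false, because entity B has no observation in period 2.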
Pooled OLS: Used when no individual effect exists. It does not take time-specific effects or variation across entities into account. If there is no cross-sectional or time-specific effect, pooled OLS can be used.
Fixed Effects: Controls for 1) unobserved variables that vary across entities but not over time, and 2) time-specific effects that do not vary across entities.
Can control for biases that vary across entities but not over time. For example, if you are analyzing Norwegian exports to the EU region, the entity effect can control for French price sensitivity, which might differ from Poland's.
You can also control for time-specific effects. Continuing the example, if the EU issues a law, it affects all of the buyers (it does not vary across entities).
Random Effects: Assumes that each entity has its own individual disturbance, treated as a random component of the error term.
Fixed: each entity's regression has a different intercept.
Random: each entity has an individual disturbance.
What are the assumptions?
Same as OLS with some small adjustments
1. The error term must have a conditional mean of zero GIVEN ALL OBSERVATIONS OF THE VARIABLE X for that entity, so there is no relationship between u and past, present, or future values of X. For example, a good profit one year should say nothing about the error the next year.
2. Observations are I.I.D. ACROSS ENTITIES. This differs from regular OLS: observations within an entity can be correlated (autocorrelation is allowed). For example, a firm's income one week can affect its income the next week, BUT firm A's income cannot affect firm B's.
3. Large outliers are unlikely.
4. There is no perfect multicollinearity.
How can the fixed effects regression be estimated?
- Binary (dummy variable) regression
- Entity-demeaned regression
- First-difference specification
Entity-demeaned regression is the way to go.
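To make the entity-demeaned approach concrete, here is a minimal sketch for a single regressor, written in plain Python rather than with a statistics package. The function name `within_slope` and the data are made up for illustration: each entity's mean is subtracted from its own x and y, and OLS is run on the deviations, which removes the entity-specific intercept.

```python
from collections import defaultdict

def within_slope(rows):
    """Entity-demeaned (within) OLS slope for one regressor.

    rows: iterable of (entity, x, y) tuples.
    """
    groups = defaultdict(list)
    for entity, x, y in rows:
        groups[entity].append((x, y))

    sxy = 0.0  # sum of demeaned x times demeaned y
    sxx = 0.0  # sum of squared demeaned x
    for obs in groups.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - x_bar) * (y - y_bar)
            sxx += (x - x_bar) ** 2
    return sxy / sxx

# Made-up data generated as y = alpha_i + 2 * x, with alpha_A = 1 and alpha_B = 10
rows = [("A", 1, 3), ("A", 2, 5), ("A", 3, 7),
        ("B", 1, 12), ("B", 2, 14), ("B", 3, 16)]
slope = within_slope(rows)
```

In this noise-free example the two entities have very different intercepts, yet the within slope equals the common slope of 2 used to generate the data.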
How can you run an entity-demeaned regression in R?
Use the plm package with model = "within". Then pass the fitted model to "coeftest" with "vcovHC" as the covariance estimator. This gives standard errors that are robust to heteroskedasticity and that handle autocorrelation within entities.
Why are panel data useful?
With panel data you can control for factors that:
(1) vary across entities but do not vary over time,
(2) could cause omitted variable bias if they are omitted,
(3) are unobserved or unmeasured – and therefore cannot be included in the regression using multiple regression.
The key idea: if an omitted variable does not change over time, then any changes in y over time cannot be caused by the omitted variable.
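The key idea can be written out as a short derivation. The notation is an assumption for illustration: Z_i stands for the unobserved, time-constant variable of entity i, and t and s are two periods. Subtracting the period-s equation from the period-t equation removes Z_i.

```latex
\begin{align*}
Y_{it} &= \beta_0 + \beta_1 X_{it} + \beta_2 Z_i + u_{it} \\
Y_{is} &= \beta_0 + \beta_1 X_{is} + \beta_2 Z_i + u_{is} \\
Y_{it} - Y_{is} &= \beta_1 (X_{it} - X_{is}) + (u_{it} - u_{is})
\end{align*}
```

Because Z_i drops out, the change in Y depends only on the change in X and the change in the error term.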
What differs and what is the same across the regressions in an entity fixed effects regression?
The intercept is unique for each entity, but the slope is the same for all.
Recall that shifts in the intercept can be represented using binary regressors.
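The two equivalent ways of writing the entity fixed effects model can be sketched as follows. The symbols are assumed notation: alpha_i is the intercept of entity i, Z_i is a time-constant unobserved variable, and D2_i, ..., Dn_i are binary regressors, one for each entity except the first (omitted to avoid perfect multicollinearity with the constant).

```latex
\begin{align*}
Y_{it} &= \beta_1 X_{it} + \alpha_i + u_{it},
  \qquad \alpha_i = \beta_0 + \beta_2 Z_i \\
Y_{it} &= \beta_0 + \beta_1 X_{it} + \gamma_2 D2_i + \dots + \gamma_n Dn_i + u_{it}
\end{align*}
```

The slope on X is the same for every entity in both forms; only the intercept shifts.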
What is time fixed effects regression?
An omitted variable might vary over time but not across states, for example safer cars or changes in national laws. Such variables produce intercepts that change over time. We use S_t to capture the combined effect of variables that change over time but are the same for every entity.
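In equation form, a sketch under assumed notation: S_t is the variable that changes over time but is the same for every entity, lambda_t is the resulting period-specific intercept, and alpha_i is an entity-specific intercept. The last line shows the model with both entity and time fixed effects.

```latex
\begin{align*}
Y_{it} &= \beta_0 + \beta_1 X_{it} + \beta_3 S_t + u_{it} \\
       &= \beta_1 X_{it} + \lambda_t + u_{it},
  \qquad \lambda_t = \beta_0 + \beta_3 S_t \\
Y_{it} &= \beta_1 X_{it} + \alpha_i + \lambda_t + u_{it}
\end{align*}
```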
Why do we use clustered standard errors?
The usual OLS standard errors will in general be wrong, because they assume that the error term is not autocorrelated. The solution is to use clustered standard errors, which allow for autocorrelation WITHIN entities. They are also robust to heteroskedasticity within and across entities.
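The clustering idea can be illustrated with a minimal sketch for one regressor after entity demeaning. It sums the scores x * residual within each entity before squaring, so errors may be correlated within an entity but not across entities. The function name and data are hypothetical, and no finite-sample correction is applied, so the numbers will generally differ from what a statistics package reports.

```python
import math
from collections import defaultdict

def within_slope_clustered_se(rows):
    """Within slope and its entity-clustered standard error (one regressor).

    rows: iterable of (entity, x, y) tuples. No small-sample adjustment.
    """
    groups = defaultdict(list)
    for entity, x, y in rows:
        groups[entity].append((x, y))

    # Entity-demean x and y.
    demeaned = {}
    for entity, obs in groups.items():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        demeaned[entity] = [(x - x_bar, y - y_bar) for x, y in obs]

    sxx = sum(x * x for obs in demeaned.values() for x, _ in obs)
    sxy = sum(x * y for obs in demeaned.values() for x, y in obs)
    beta = sxy / sxx

    # Sum the scores within each entity first, then square:
    # this is what allows arbitrary correlation within an entity.
    meat = 0.0
    for obs in demeaned.values():
        score = sum(x * (y - beta * x) for x, y in obs)
        meat += score ** 2

    se = math.sqrt(meat) / sxx
    return beta, se

# Made-up data with some noise around a common slope
rows = [("A", 1, 3), ("A", 2, 5), ("A", 3, 8),
        ("B", 1, 12), ("B", 2, 13), ("B", 3, 16)]
beta, se = within_slope_clustered_se(rows)
```

With only two entities this is purely illustrative; clustered standard errors rely on having many entities.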
What is the main advantage of panel data?
We can allow for certain forms of unobserved individual heterogeneity that are constant over time, which cannot be done with cross-sectional or time-series data alone.
Do you need cluster-robust standard errors for pooled OLS?
Yes. Standard errors that do not take the serial correlation of v_it into account will be wrong.
You need cluster-robust standard errors unless σ_σ^2 = 0.
What is heterogeneity?
The quality or state of being diverse in character or content.
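In panel data, are omitted variables a problem?
Not if the omitted variable does not change over time: in that case it cannot cause changes in Y over time, so those changes must come from the observed factors (and the error term). Omitted variables that do change over time can still cause bias.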
What is a fixed effects model?
A regression performed on panel data that accounts for the effect of being in state i. The model can be entity demeaned, time demeaned, or both. All regressions will have the same slope, but different intercepts.
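How do you pick between fixed effects and pooled OLS?
Use an F-test or a Wald test, where H0 is that there are no entity-specific effects.
When H0 is rejected, use fixed effects.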
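One common form of the F-test compares the sum of squared residuals (SSR) of the two models. This is a sketch assuming a balanced panel with n entities, T periods, k regressors, and homoskedastic errors; the SSR subscripts are labels chosen here for illustration.

```latex
\[
F = \frac{(SSR_{\text{pooled}} - SSR_{\text{FE}}) / (n - 1)}
         {SSR_{\text{FE}} / (nT - n - k)}
  \;\sim\; F_{\,n-1,\; nT-n-k}
  \quad \text{under } H_0 : \alpha_1 = \alpha_2 = \dots = \alpha_n
\]
```

A large F means the entity intercepts explain much of what pooled OLS leaves in the residuals, so H0 is rejected in favor of fixed effects.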
How do you pick between pooled OLS and random effects?
Use the Breusch-Pagan LM test.
LM = Lagrange Multiplier
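For reference, the Breusch-Pagan statistic for random effects can be sketched as follows, assuming a balanced panel with n entities and T periods. The notation is assumed for illustration: the e-hat terms are pooled OLS residuals and sigma_alpha squared is the variance of the entity-specific effect.

```latex
\[
LM = \frac{nT}{2(T-1)}
     \left[
       \frac{\sum_{i=1}^{n} \left( \sum_{t=1}^{T} \hat{e}_{it} \right)^{2}}
            {\sum_{i=1}^{n} \sum_{t=1}^{T} \hat{e}_{it}^{2}} - 1
     \right]^{2}
  \;\sim\; \chi^2_1
  \quad \text{under } H_0 : \sigma_\alpha^2 = 0
\]
```

Rejecting H0 points to random effects over pooled OLS; failing to reject suggests pooled OLS is adequate.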