Panel Data Flashcards
What is the composite error term vit in a panel data regression split into?
vit = ai + uit, where ai is constant over time and uit varies over time
What is the general form for a panel data regression (in its most basic form)
(outcome)it = b0 + b1(regressor)it + ai + uit
What does ai represent in the error term?
The unobserved unit-specific confounder, e.g. if the outcome was school absences across cities, ai would represent the innate propensity of the city to have students miss school, for example because of the local culture
This confounder is constant over time
How does the first differences estimator work to overcome the bias of ai?
We take the difference of the regression equation between consecutive periods (t and t-1)
b0 and ai are differenced out because they are constant over time, as is any other variable that is constant over time.
We then run OLS on the differenced regression, which uses only variation over time (free of ai) to estimate the effect of interest
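Worked sketch of first differences (Python, standard library only). The panel below is hypothetical and noise-free (uit is set to 0) so the result is easy to check by hand; the city names, values and function names are assumptions made for illustration. It compares a pooled OLS slope, which ignores ai, with the first-difference slope.

```python
# Hypothetical noise-free panel: 3 cities observed over 4 periods.
# True model: outcome = B0 + B1 * regressor + a_i (u_it is set to 0 for clarity).
B0, B1 = 1.0, 2.0
city_effect = {"A": 0.0, "B": 5.0, "C": 10.0}  # a_i, constant over time
regressor = {                                   # x_it, larger where a_i is larger
    "A": [1.0, 2.0, 2.5, 4.0],
    "B": [3.0, 3.5, 5.0, 6.0],
    "C": [6.0, 7.0, 7.5, 9.0],
}
outcome = {
    city: [B0 + B1 * x + city_effect[city] for x in xs]
    for city, xs in regressor.items()
}


def ols_slope(xs, ys):
    """Slope from a simple OLS regression of ys on xs with an intercept."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def first_difference_slope(regressor, outcome):
    """OLS through the origin of the change in outcome on the change in regressor."""
    dx, dy = [], []
    for city in regressor:
        xs, ys = regressor[city], outcome[city]
        for t in range(1, len(xs)):
            dx.append(xs[t] - xs[t - 1])
            dy.append(ys[t] - ys[t - 1])
    return sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)


pooled_x = [x for xs in regressor.values() for x in xs]
pooled_y = [y for ys in outcome.values() for y in ys]

pooled_b1 = ols_slope(pooled_x, pooled_y)            # ignores a_i, so it is biased here
fd_b1 = first_difference_slope(regressor, outcome)   # b0 and a_i are differenced out

print(f"pooled OLS slope: {pooled_b1:.3f}")
print(f"first-difference slope: {fd_b1:.3f}")
```

Because ai is positively correlated with the regressor in this made-up panel, the pooled slope overstates b1, while the first-difference slope recovers it.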
How does the fixed effects estimator overcome the bias of ai?
For each i we define di to be a dummy variable taking the value 1 if the observation is from unit i (e.g. city i) and 0 otherwise
We include di in our model for all i EXCEPT ONE in order to avoid the dummy variable trap, i.e. a violation of no perfect collinearity
This is thus just an application of multiple regression to control for confounders - we take the confounder out of the error term and include it directly in the model
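Worked sketch of the dummy-variable approach (Python, standard library only). The data are hypothetical and noise-free, and the helper names solve_linear_system and ols are assumptions introduced here, not part of any library. City A is the excluded city, so its effect is absorbed into the intercept.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [v - factor * p for v, p in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols(design, ys):
    """OLS coefficients from the normal equations (X'X) beta = X'y."""
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical noise-free panel: outcome = B0 + B1 * regressor + a_i.
B0, B1 = 1.0, 2.0
city_effect = {"A": 0.0, "B": 5.0, "C": 10.0}  # a_i; city A is the excluded city
regressor = {
    "A": [1.0, 2.0, 2.5, 4.0],
    "B": [3.0, 3.5, 5.0, 6.0],
    "C": [6.0, 7.0, 7.5, 9.0],
}

design, ys = [], []
for city, xs in regressor.items():
    for x in xs:
        d_b = 1.0 if city == "B" else 0.0   # dummy for city B
        d_c = 1.0 if city == "C" else 0.0   # dummy for city C
        design.append([1.0, x, d_b, d_c])   # columns: intercept, regressor, d_B, d_C
        ys.append(B0 + B1 * x + city_effect[city])

b0_hat, b1_hat, a_b_hat, a_c_hat = ols(design, ys)
print(b0_hat, b1_hat, a_b_hat, a_c_hat)
```

In this sketch the coefficients on the dummies are the differences between each city and the excluded city A, holding the regressor fixed.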
What is the general form of a first difference regression equation?
The regression we get is:
(Change in outcome)it = b1(change in regressor)it + (change in u)it
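Derivation sketch: writing y for the outcome and x for the regressor (this shorthand is an assumption for compactness), the first-difference equation follows from subtracting the period t-1 equation from the period t equation.

```latex
\begin{align*}
y_{it}     &= b_0 + b_1 x_{it} + a_i + u_{it} \\
y_{i,t-1}  &= b_0 + b_1 x_{i,t-1} + a_i + u_{i,t-1} \\
y_{it} - y_{i,t-1} &= b_1 (x_{it} - x_{i,t-1}) + (u_{it} - u_{i,t-1}) \\
\Delta y_{it} &= b_1 \Delta x_{it} + \Delta u_{it}
\end{align*}
```

b0 and ai appear in both lines with the same value, so they cancel in the subtraction.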
What is the general form of a fixed effects regression (not controlling for time)?
(Outcome)it = b0 + b1(regressor)it + sum over i = 2, ..., n of (ai x di) + uit, where n is the number of units
What is the general form of a fixed effects regression controlling also for time?
(Outcome)it = b0 + b1(regressor)it + sum over i = 2, ..., n of (ai x di) + sum over t = 2, ..., T of (ct x dt) + uit, where n is the number of units and T is the number of time periods
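Worked sketch of fixed effects with both unit and time dummies (Python, standard library only). The data are hypothetical and noise-free, and the helpers solve_linear_system and ols are assumptions introduced for illustration. City A and period 1 are the excluded categories.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [v - factor * p for v, p in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols(design, ys):
    """OLS coefficients from the normal equations (X'X) beta = X'y."""
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical noise-free panel: outcome = B0 + B1 * regressor + a_i + c_t.
B0, B1 = 1.0, 2.0
city_effect = {"A": 0.0, "B": 5.0, "C": 10.0}  # a_i; city A is excluded
time_effect = [0.0, 1.5, -0.5]                  # c_t; period 1 (index 0) is excluded
regressor = {"A": [1.0, 2.0, 4.0], "B": [3.0, 3.5, 5.0], "C": [6.0, 7.0, 7.5]}

design, ys = [], []
for city, xs in regressor.items():
    for t, x in enumerate(xs):
        city_dummies = [float(city == "B"), float(city == "C")]
        time_dummies = [float(t == 1), float(t == 2)]   # periods 2 and 3
        design.append([1.0, x] + city_dummies + time_dummies)
        ys.append(B0 + B1 * x + city_effect[city] + time_effect[t])

b0_hat, b1_hat, a_b_hat, a_c_hat, c_2_hat, c_3_hat = ols(design, ys)
print(b0_hat, b1_hat, a_b_hat, a_c_hat, c_2_hat, c_3_hat)
```

Here the intercept is the outcome for city A in period 1 when the regressor is 0, and each time coefficient is the shift in that period relative to period 1, common to all cities.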
How to interpret b0 in fixed effects regression (not controlling for time)?
The expected outcome in the excluded unit (e.g. the city whose dummy is left out to avoid the dummy variable trap) when regressor = 0
How to interpret b0 in fixed effects regression when controlling also for time?
The expected outcome in the excluded unit (left out to avoid the dummy variable trap) in the excluded time period (e.g. day) when regressor = 0
How to interpret the coefficient ai, e.g. a(new york) where the units i are cities (not controlling for time)?
The average difference in the outcome between that unit (New York) and the excluded unit, holding fixed the regressor
How to interpret the coefficient ct in fixed effects regression when you are controlling for time, e.g. c(jan 3 2022)?
The average difference in outcome, common to all units (e.g. all cities), between that day (Jan 3 2022) and the excluded time period (the day left out to avoid the dummy variable trap), holding fixed the regressor and the unit
How to interpret b1 in fixed effects regression (for both controlling for time and not)?
The change in outcome associated with an increase of 1 in the regressor on average, holding fixed the unit (e.g. the city), i.e. comparing the same unit with itself at different values of the regressor
If also controlled for time, then also holding fixed the time period
Pros and Cons of First Differences
Pros: We remove the confounder ai, as well as all other unobserved confounders that stay constant over time
Cons: Any regressor that stays constant over time (e.g. a binary control variable that never changes within a unit) is also differenced out, so its effect cannot be estimated
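Small illustration of the con (Python, standard library only). The variables is_coastal and policy_on are hypothetical: the first never changes within a city, the second switches on partway through.

```python
def first_differences(series):
    """Change between each period and the one before it."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


# Hypothetical binary control that is constant over time within each city.
is_coastal = {"A": [1, 1, 1, 1], "B": [0, 0, 0, 0], "C": [1, 1, 1, 1]}
coastal_diffs = {city: first_differences(v) for city, v in is_coastal.items()}

# Every differenced value is 0, so the variable drops out of the regression.
all_zero = all(d == 0 for ds in coastal_diffs.values() for d in ds)

# A binary variable that changes over time keeps some variation after differencing.
policy_on = [0, 0, 1, 1]
policy_diffs = first_differences(policy_on)

print(coastal_diffs)
print(policy_diffs)
```

Only the time-constant variable is lost; a binary variable that changes within a unit still has variation after differencing.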
Pros and Cons of Fixed Effects
Pros: We control for the confounder ai and any other unobservable time-invariant confounder
Cons: Any control variable that is constant over time within each unit (e.g. a binary variable that never changes) is perfectly collinear with the unit dummies, violating no perfect collinearity, so its coefficient cannot be estimated
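Small illustration of the con (Python, standard library only). The control is_coastal is hypothetical and constant over time within each city; city A is the excluded city. The sketch shows that the control can be rebuilt exactly from the intercept and the city dummies, which is what perfect collinearity means.

```python
cities = ["A", "B", "C"]
periods = 4
is_coastal = {"A": 1, "B": 0, "C": 1}  # hypothetical time-invariant binary control

# One row per city-period: (intercept, d_B, d_C, control). City A is excluded.
rows = []
for city in cities:
    for _ in range(periods):
        d_b = 1 if city == "B" else 0
        d_c = 1 if city == "C" else 0
        rows.append((1, d_b, d_c, is_coastal[city]))

# Weights on (intercept, d_B, d_C): the control's value in the excluded city,
# plus each other city's difference from that value.
weights = (
    is_coastal["A"],
    is_coastal["B"] - is_coastal["A"],
    is_coastal["C"] - is_coastal["A"],
)

rebuilt = [
    weights[0] * intercept + weights[1] * d_b + weights[2] * d_c
    for intercept, d_b, d_c, _ in rows
]
is_collinear = all(r == row[3] for r, row in zip(rebuilt, rows))

print(weights)
print(is_collinear)
```

Since the control is an exact linear combination of columns already in the model, OLS cannot separate its coefficient from the unit effects.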