Lecture 10 (Panel data) Flashcards
What is the benefit of using panel data?
We can address omitted variable bias (OVB) from unobservables that either differ across units but are constant over time, or are constant across units but change over time.
What do we need for consistent estimates in panel data when we assume that the error term is $v_{it} = a_i + u_{it}$?
Using OLS, for a consistent estimate of $\beta$ we need that OLS.1, $E[x_{it}' v_{it}] = 0$, holds. This can be divided into two parts.
$$
E[x_{it}'u_{it}] = 0 \\
E[x_{it}'a_{i}] = 0
$$
where the second part is our main concern.
What is the random effect estimator?
See notion.
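As a quick reminder (a hedged sketch in standard notation, not taken from notion): the RE estimator can be computed as pooled OLS on quasi-demeaned data,

$$
y_{it} - \theta \bar y_i = (x_{it} - \theta \bar x_i)'\beta + (v_{it} - \theta \bar v_i),
\qquad
\theta = 1 - \sqrt{\frac{\sigma_u^2}{\sigma_u^2 + T\sigma_a^2}},
$$

so pooled OLS ($\theta = 0$) and FE ($\theta = 1$) are the limiting cases.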
Explain the assumptions of RE and when one should use it instead of FE/FD.
The two key assumptions for the RE model are:
$$
E[u_{it}|x_i,a_i] = 0 \\
E[a_i|x_i] = 0
$$
That is, we need the independence of $x$ in all periods (strict exogeneity) and that $E[a_i|x_i]= E[a_i] = 0$.
If we do believe these assumptions, we will have an efficiency gain by using this RE-estimator compared to pooled OLS or FE/FD.
If we don’t believe the second assumption (that $a_i$ is not a problem or that we can condition away the problem), we will have a biased estimate and should go for a FE or FD estimator.
This is thus an efficiency/consistency trade-off. In fact, the RE estimator can be expressed as an “efficiently” weighted average of within (FE) and between estimators.
The full rank condition also applies.
Explain what a FE model is and what the assumptions are.
This refers to running a “demeaned” regression. FE is a within estimator as it exploits within-unit variation for identification.
We should then think of $a_i$ as an individual-specific intercept. Fixed effects (FE) is a way of controlling for this confounder by eliminating it: we average over time and then estimate a transformed model where we subtract the average.
We can of course do this demeaning for both unit and time fixed effects. Doing both is referred to as a two-way fixed effects (TWFE) regression.
Assumptions:
- Linearity (? see the beginning of slides)
- Strict exogeneity
- $E[u_{it}|x_{it},a_{i}]=E[u_{it}|x_{i1},\dots,x_{iT},a_{i}]=0$
- Implying that $E[\tilde x_{it}'\tilde u_{it}]=0$
- Full rank - $k$
- $\text{rank} \ E[\tilde X_i’ \tilde X_i]=k$
Under these assumptions, $\hat \beta_{FE}$ is an unbiased estimator of $\beta$.
Derive the FE estimator. Also show how the estimator looks in matrix notation.
See notion.
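For reference, the within estimator in matrix notation (using the demeaned $\tilde X_i$, $\tilde y_i$ from the assumptions above) is

$$
\hat\beta_{FE} = \left(\sum_{i=1}^N \tilde X_i' \tilde X_i\right)^{-1} \sum_{i=1}^N \tilde X_i' \tilde y_i,
\qquad \tilde x_{it} = x_{it} - \bar x_i, \quad \tilde y_{it} = y_{it} - \bar y_i.
$$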
What are the caveats with FE and FD?
- We get rid of all time-invariant observables (mechanically by the transformation)
- We drop observations without within-variation in $x_{it}$
- We will have power and measurement error issues
- Demeaning an interaction term is not the same as interacting demeaned variables
- We will have age-time-cohort collinearity problems
What do we mean with strict exogeneity?
Strictly exogenous means the error term $u$ is unrelated to any instance of the variable $x$: past, present, and future. $x$ is completely unaffected by $y$.
$$
E[u_{it}|x_{i1},\dots,x_{iT},a_{i}]=0
$$
What is the difference between the FE approach and the dummy approach for $a_i$?
The dummy approach is in fact equivalent to the FE-estimator! However, it is rarely computationally feasible since we might end up with a lot of dummy variables to estimate.
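A minimal numerical sketch of this equivalence, using simulated data (all names and numbers below are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 50, 5                       # units and time periods
unit = np.repeat(np.arange(N), T)  # panel-unit identifier

a = rng.normal(size=N)[unit]              # unit fixed effect a_i
x = 0.5 * a + rng.normal(size=N * T)      # regressor correlated with a_i
y = 1.0 + 2.0 * x + a + rng.normal(size=N * T)

# (1) Within/FE estimator: demean y and x within each unit, then run OLS
def demean(z):
    means = np.bincount(unit, z) / np.bincount(unit)
    return z - means[unit]

x_w, y_w = demean(x), demean(y)
beta_fe = (x_w @ y_w) / (x_w @ x_w)

# (2) Dummy (LSDV) approach: OLS of y on x plus a full set of unit dummies
D = np.zeros((N * T, N))
D[np.arange(N * T), unit] = 1.0
X = np.column_stack([x, D])
beta_lsdv = np.linalg.lstsq(X, y, rcond=None)[0][0]

print(beta_fe, beta_lsdv)  # identical point estimates (up to floating point)
```

Both give the same $\hat\beta$; the LSDV design matrix just carries $N$ extra dummy columns, which is what makes it impractical for large panels.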
Explain and derive the FD estimator. What are the assumptions?
See notion
Assumptions:
- Linearity (? see the beginning of slides)
- Strict exogeneity is sufficient but not necessary! Instead:
- $E[\Delta u_{it}|\Delta x_{i,t-1}, \Delta x_{i,t}, \Delta x_{i,t+1}] = 0$
- Full rank - $k$
- $\text{rank} \ E[\Delta X_i’ \Delta X_i]=k$
Under these assumptions, $\hat \beta_{FD}$ is an unbiased estimator of $\beta$.
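A brief sketch of the transformation behind the estimator (standard notation, not taken from notion): differencing adjacent periods removes $a_i$,

$$
\Delta y_{it} = y_{it} - y_{i,t-1} = \Delta x_{it}'\beta + \Delta u_{it},
$$

and $\hat\beta_{FD}$ is simply pooled OLS on the differenced observations.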
What are the differences between FE and FD?
When $T = 2$, FE = FD. This is not true when $T > 2$.
Differences
- Identifying assumptions are less strict for FD
- We mechanically lose more observations with FD than FE.
How should we think about standard errors if we use FE?
We need to cluster SEs on the panel unit (e.g. firm level)!
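A hedged sketch of how this can be done with statsmodels (the data are simulated and the variable names illustrative; a dedicated panel package would also handle the within transformation for you):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
N, T = 50, 5
unit = np.repeat(np.arange(N), T)        # panel-unit id (e.g. firm)
x = rng.normal(size=N * T)
y = 2.0 * x + rng.normal(size=N)[unit] + rng.normal(size=N * T)

# Pooled OLS just to illustrate the clustering option; the same cov_type
# argument applies to a regression on within-demeaned data.
fit = sm.OLS(y, sm.add_constant(x)).fit(
    cov_type="cluster", cov_kwds={"groups": unit}
)
print(fit.bse)  # standard errors clustered on the panel unit
```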
What can we do if we have violations of strict exogeneity with panel data?
If we have a violation of strict exogeneity in the form of an Ashenfelter’s dip, one solution can be to include a lagged dependent variable as a control
$$
y_{it} = \lambda_t + \rho y_{i,t-h} + \beta D_{it} + u_{it}
$$
Hence, we compare workers with similar earning histories.
The LDV estimator relies on qualitatively different identifying assumptions:
$$
E[u_{it}|y_{i,t-h}, \lambda_t, D_{it}]=0
$$
We can however NOT use FE and LDV together. Using a lagged dependent variable in an FE model, or a unit fixed effect in an LDV model, will mechanically create a violation of the strict exogeneity assumption.
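A minimal sketch of setting up the LDV specification with pandas/statsmodels (the toy panel, column names, and numbers are hypothetical):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format panel: one row per worker-year
df = pd.DataFrame({
    "worker":  [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "year":    [2000, 2001, 2002] * 3,
    "earn":    [20.0, 21.0, 25.0, 30.0, 28.0, 33.0, 15.0, 16.0, 16.5],
    "treated": [0, 0, 1, 0, 1, 1, 0, 0, 0],
})

df = df.sort_values(["worker", "year"])
df["earn_lag"] = df.groupby("worker")["earn"].shift(1)   # y_{i,t-h} with h = 1

# LDV specification: year effects (lambda_t), the lagged outcome, and treatment
ldv = smf.ols("earn ~ C(year) + earn_lag + treated", data=df.dropna()).fit()
print(ldv.params)
```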
What are the panel data validity checks?
Individual-specific time trend
A general threat to identification is if the unobserved effects evolve over time. A useful check is thus to estimate the random trend model
$$
y_{it} = a_i + g_i t + x_{it}'\beta + v_{it}
$$
where $g_i t$ denotes the individual-specific time trend. Including this trend should not change our estimate of $\beta$.
Lead of treatment
The effect of $x_{it}$ on a one-year lead $w_{i,t+1}$ should not be significant. If it is, strict exogeneity is violated since $E[u_{it}|w_{i,t+1}]\neq0$.
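A hedged sketch of the lead check (simulated data; variable names illustrative, and in practice the regression would include the usual controls and fixed effects):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
N, T = 100, 6
df = pd.DataFrame({
    "unit": np.repeat(np.arange(N), T),
    "t":    np.tile(np.arange(T), N),
    "x":    rng.normal(size=N * T),
    "w":    rng.normal(size=N * T),
})

df = df.sort_values(["unit", "t"])
df["w_lead"] = df.groupby("unit")["w"].shift(-1)   # w_{i,t+1}

# Regress the one-period lead on x_it; a significant coefficient on x would
# point to a violation of strict exogeneity.
check = smf.ols("w_lead ~ x + C(t)", data=df.dropna()).fit()
print(check.params["x"], check.pvalues["x"])
```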
Qualitatively, how should we think about measurement errors in FE/FD models?
With measurement error we will get a version of the classical attenuation bias. Here the bias depends on the persistence of the measurement error and on a corresponding term for the independent variable.
If we have a very persistent measurement error, that is, individuals report with the same error (underestimate or overestimate) over time, then the correlation will be close to one; the noise component then cancels in the transformation and we get consistent estimates. However, if the persistence in the independent variable is large, differencing removes signal and the attenuation bias increases.
The difference here from cross-sectional data is that we have two counteracting effects.
In microdata people often argue that measurement errors are persistent.
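A hedged sketch of the formula behind this intuition, for a single regressor with classical measurement error in the FD case (notation mine, not from the slides): with $x_{it} = x^*_{it} + e_{it}$,

$$
\text{plim}\ \hat\beta_{FD} = \beta \cdot \frac{\sigma_{x^*}^2(1-\rho_{x^*})}{\sigma_{x^*}^2(1-\rho_{x^*}) + \sigma_{e}^2(1-\rho_{e})},
$$

where $\rho_{x^*}$ and $\rho_e$ are the first-order autocorrelations of the true regressor and of the measurement error. A persistent error ($\rho_e \to 1$) makes the bias vanish; a persistent true regressor ($\rho_{x^*} \to 1$) makes it worse: the two counteracting effects mentioned above.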