Lecture 5 (Instrumental variables (IV)) Flashcards

1
Q

Explain qualitatively the basic IV-setup. What are we doing to achieve a causal effect?

A

IV isolates only the part of the variation in the endogenous variables that is due to exogenous factors (i.e., the variation in the instrument). We then use only this part of the variation to identify the effect of the endogenous variable on the outcome.

2
Q

What are the basic IV stages?

A

The basic stages:

Z → X = First stage

Z → Y = Reduced form = ITT

X → Y = Structural equation (second stage)
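
The three relationships above can be illustrated with a small simulation. This is only a sketch: the data-generating process, the coefficient values and the helper `slope` are assumptions made up for illustration, not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process (all numbers are assumptions)
z = rng.normal(size=n)                       # instrument
c = rng.normal(size=n)                       # unobserved confounder
x = 0.5 * z + c + rng.normal(size=n)         # endogenous regressor
y = 2.0 * x + 3.0 * c + rng.normal(size=n)   # outcome, true effect of x is 2

def slope(a, b):
    """OLS slope from regressing b on a (with an intercept)."""
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)

first_stage = slope(z, x)         # Z -> X
reduced_form = slope(z, y)        # Z -> Y (ITT)
ols = slope(x, y)                 # naive X -> Y, contaminated by c
iv = reduced_form / first_stage   # structural effect recovered by IV

print(first_stage, reduced_form, ols, iv)
```

In this setup the naive OLS slope overstates the effect because of the confounder, while the ratio of reduced form to first stage lands close to the true value of 2.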

3
Q

What identification assumptions do we need to think about in IV?

A

Identification:

  • Relevance: $cov(X_i,Z_i)\neq0$
  • Validity: $cov(u_i, Z_i )=0$
    • Randomization
    • Exclusion
4
Q

How do we test for randomization in IV?

A
  • Show a balance table for pre-treatment covariates across different values of our instrument.
  • Z is uncorrelated with the other covariates in our study (this is the same thing as above but from PPL).
  • Add controls to show that the estimates do not change (as above but from PPL)
5
Q

How do we test for exclusion in IV?

A
  • Cannot be tested directly!
  • Comes from economic reasoning and carefully thinking about violations!
  • Specific to each instrument and outcome variable.
  • Run a refutability test (PPL)
    • If we find a sample where there is no first stage, we should have no reduced form effect. If we do, there is a violation of the exclusion restriction.
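  • Show that no other studies use the same instrument (PPL)
    • If the same instrument is used in several studies, each for a different endogenous variable, every one of those uses points to another channel from Z to the outcome, which violates the exclusion restriction.
    • Ideally, the instrument has not been used to identify any other channel.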
6
Q

How can we test for instrument relevance in IV?

A
  • See if there exists a first stage!
    • Run first-stage regression and look at F-value
      • Should be > 10 or > 105 depending on the literature.
        Can also look at the tF statistic and the Anderson-Rubin (AR) test as alternatives to the F-stat.
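
A minimal sketch of the first-stage F-statistic for a single instrument, assuming homoskedastic errors (the function name `first_stage_f` and the simulated data are hypothetical). With one instrument, the F-statistic equals the squared t-statistic on the instrument.

```python
import numpy as np

def first_stage_f(x, z):
    """F-statistic for H0: coefficient on z is zero in x = a + b*z + error.
    Assumes a single instrument and homoskedastic errors."""
    n = len(x)
    Z = np.column_stack([np.ones(n), z])
    coef, *_ = np.linalg.lstsq(Z, x, rcond=None)
    resid = x - Z @ coef
    sigma2 = resid @ resid / (n - 2)
    var_b = sigma2 * np.linalg.inv(Z.T @ Z)[1, 1]
    return coef[1] ** 2 / var_b   # with one instrument, F = t^2

rng = np.random.default_rng(1)
n = 1_000
z = rng.normal(size=n)
noise = rng.normal(size=n)

f_strong = first_stage_f(0.5 * z + noise, z)    # strong first stage
f_weak = first_stage_f(0.02 * z + noise, z)     # nearly irrelevant instrument
print(f_strong, f_weak)
```

In applied work one would normally report a heteroskedasticity-robust or clustered version of this statistic; the homoskedastic formula is used here only to keep the sketch short.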
7
Q

Use method of moments to derive the IV estimator

A

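We can derive $\hat \beta$ using the method of moments. We start with the assumption that the instrument is uncorrelated with the error term (treating $x$ and $z$ as row vectors, so that $y = x\beta + e$):

$$
E[z'e] =0
$$

$$
E[z'(y-x\beta)] =0
$$

$$
E[z'y]=E[z'x]\beta
$$

$$
\beta =E[z'x]^{-1}E[z'y]
$$

Replacing the population moments with their sample counterparts gives the estimator:

$$
\hat\beta =(Z'X)^{-1}Z'Y
$$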
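
The final formula can be sketched in a few lines of NumPy. The function name `iv_estimator` and the simulated data are assumptions for illustration; the constant acts as its own instrument, so the model is just identified.

```python
import numpy as np

def iv_estimator(Z, X, y):
    """Just-identified IV: solve (Z'X) beta = Z'y."""
    return np.linalg.solve(Z.T @ X, Z.T @ y)

rng = np.random.default_rng(2)
n = 50_000
z = rng.normal(size=n)
c = rng.normal(size=n)                       # unobserved confounder
x = 0.8 * z + c + rng.normal(size=n)
y = 1.0 + 2.0 * x + c + rng.normal(size=n)   # true coefficients: (1, 2)

X = np.column_stack([np.ones(n), x])   # regressors
Z = np.column_stack([np.ones(n), z])   # instruments (constant instruments itself)

beta_iv = iv_estimator(Z, X, y)
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_iv, beta_ols)
```

Using `np.linalg.solve` instead of forming the inverse explicitly is a numerical choice; it computes the same estimator.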

8
Q

Explain what is meant by “exclusion restriction”?

A

The only way our instrument affects the outcome is through our variable of interest. That is, Z has no direct effect on the outcome variable or takes no other route to the outcome variable.

9
Q

What is 2SLS and how do we estimate it?

A

2SLS works by regressing the endogenous variables on the instruments (and exogenous variables) to isolate the exogenous part of the variation. There are in fact many ways to “isolate the variation” due to exogenous factors; 2SLS is just the most popular.

The two-step procedure in 2SLS is the following:

  1. Project (regress) each of the exogenous and endogenous variables on
    the set of instruments and exogenous variables (the “first stage”)
  2. Regress the outcome Y on the fitted values obtained in the first step
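
The two steps can be sketched as follows. This is an illustration under assumed names and simulated data (`two_sls`, two excluded instruments, one exogenous control); it returns point estimates only and says nothing about standard errors.

```python
import numpy as np

def two_sls(y, X, Z):
    """2SLS point estimates. X holds all regressors (exogenous and endogenous),
    Z holds all instruments, including the exogenous regressors themselves."""
    # Step 1: fitted values from regressing every column of X on Z
    first_stage_coef, *_ = np.linalg.lstsq(Z, X, rcond=None)
    X_hat = Z @ first_stage_coef
    # Step 2: regress y on the fitted values
    beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return beta, X_hat

rng = np.random.default_rng(3)
n = 50_000
w = rng.normal(size=n)               # exogenous control
z1, z2 = rng.normal(size=(2, n))     # two excluded instruments
c = rng.normal(size=n)               # unobserved confounder
x = 0.5 * z1 + 0.5 * z2 + 0.3 * w + c + rng.normal(size=n)
y = 1.0 + 2.0 * x - 1.0 * w + c + rng.normal(size=n)

X = np.column_stack([np.ones(n), w, x])
Z = np.column_stack([np.ones(n), w, z1, z2])
beta_2sls, X_hat = two_sls(y, X, Z)

# Exogenous columns are reproduced exactly by the projection
exog_unchanged = np.allclose(X_hat[:, :2], X[:, :2])
print(beta_2sls, exog_unchanged)
```

Projecting the exogenous variables on a set that contains them returns them unchanged, which is why only the endogenous column is actually replaced by fitted values.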
10
Q

Use method of moments and the projection matrix to derive the 2SLS estimator

A

See notes
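
One possible sketch of the derivation, not necessarily the version in the notes. It assumes the model $Y = X\beta + e$ in matrix form, an instrument matrix $Z$ with at least as many columns as $X$, and the projection matrix $P_Z = Z(Z'Z)^{-1}Z'$.

$$
\hat X = P_Z X
$$

The fitted values are linear combinations of the instruments, so they are uncorrelated with the error term. The sample moment condition is therefore

$$
\hat X'(Y - X\hat\beta) = 0
$$

$$
\hat\beta_{2SLS} = (\hat X'X)^{-1}\hat X'Y = (X'P_Z X)^{-1}X'P_Z Y
$$

Since $P_Z$ is symmetric and idempotent, $\hat X'X = X'P_Z'P_Z X = \hat X'\hat X$, so this is the same as OLS of $Y$ on $\hat X$:

$$
\hat\beta_{2SLS} = (\hat X'\hat X)^{-1}\hat X'Y
$$

In the just-identified case $Z'X$ is square and invertible, and the expression collapses to $(Z'X)^{-1}Z'Y$.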

11
Q

What are mistakes that we need to avoid with 2SLS?

A
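  • Estimating 2SLS manually (running the two stages by hand) gives incorrect standard errors, because the second-stage residuals are then computed from the fitted values instead of the actual X!
  • We should not use probit or logit in the first stage (the “forbidden regression”)!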
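
The standard-error problem can be made concrete with a small simulation. This sketch assumes homoskedastic errors and a made-up data-generating process; the point estimate is the same in both cases, only the residuals used for the variance differ.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20_000
z = rng.normal(size=n)
c = rng.normal(size=n)                       # unobserved confounder
x = 0.5 * z + c + rng.normal(size=n)
y = 2.0 * x + 3.0 * c + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z])

# Manual two-step estimation
X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
beta = np.linalg.lstsq(X_hat, y, rcond=None)[0]
bread = np.linalg.inv(X_hat.T @ X_hat)

# Wrong: residuals from the manual second stage use the fitted values
resid_manual = y - X_hat @ beta
se_manual = np.sqrt(resid_manual @ resid_manual / (n - 2) * bread[1, 1])

# Right: residuals use the actual regressors X
resid_2sls = y - X @ beta
se_2sls = np.sqrt(resid_2sls @ resid_2sls / (n - 2) * bread[1, 1])

print(se_manual, se_2sls)
```

Dedicated IV routines in statistical packages apply the second formula (or a robust variant) automatically, which is the practical reason to avoid the manual two-step approach.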
12
Q

Show that 2SLS is consistent

A

See notes
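
A sketch of one standard argument, which may differ from the version in the notes. It assumes $Y = X\beta + e$, the 2SLS formula $\hat\beta_{2SLS} = (X'P_Z X)^{-1}X'P_Z Y$ with $P_Z = Z(Z'Z)^{-1}Z'$, and that a law of large numbers applies to the sample moments.

$$
\hat\beta_{2SLS} = \beta + (X'P_Z X)^{-1}X'P_Z e
$$

$$
X'P_Z e = X'Z\,(Z'Z)^{-1}Z'e
$$

Dividing each term by $n$ and taking probability limits:

$$
\operatorname{plim} \frac{Z'e}{n} = E[z'e] = 0, \qquad \operatorname{plim} \frac{X'Z}{n} = E[x'z], \qquad \operatorname{plim} \frac{Z'Z}{n} = E[z'z]
$$

Validity gives $E[z'e]=0$, and relevance ensures that $E[x'z]$ has full rank so the inverse exists in the limit. Hence

$$
\operatorname{plim} \hat\beta_{2SLS} = \beta + \left(E[x'z]E[z'z]^{-1}E[z'x]\right)^{-1}E[x'z]E[z'z]^{-1}\cdot 0 = \beta
$$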

13
Q

Can we test whether our instrument is really needed? What is the consequence of using an instrument that is not needed?

A

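We employ a “Hausman test for endogeneity”, which compares the OLS and IV estimates.

If we have an exogenous independent variable, it is better to just run an OLS: IV is still consistent in that case, but it is less efficient (larger standard errors) than OLS.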
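
One common way to carry out such a test is the regression-based (Durbin-Wu-Hausman) variant: add the first-stage residuals to the OLS regression and test whether their coefficient is zero. The sketch below uses that variant, which may not be the exact form used in the lecture; the function name `dwh_t_stat`, the simulated data and the homoskedastic standard errors are assumptions.

```python
import numpy as np

def dwh_t_stat(y, x, z):
    """t-statistic on the first-stage residual in y = a + b*x + d*v_hat + error.
    A large |t| is evidence that x is endogenous."""
    n = len(y)
    Z = np.column_stack([np.ones(n), z])
    v_hat = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]   # first-stage residuals
    W = np.column_stack([np.ones(n), x, v_hat])
    coef = np.linalg.lstsq(W, y, rcond=None)[0]
    resid = y - W @ coef
    sigma2 = resid @ resid / (n - W.shape[1])
    se = np.sqrt(sigma2 * np.linalg.inv(W.T @ W)[2, 2])
    return coef[2] / se

rng = np.random.default_rng(7)
n = 10_000
z = rng.normal(size=n)
c = rng.normal(size=n)

# Case 1: x is endogenous (shares the confounder c with y)
x_endog = 0.5 * z + c + rng.normal(size=n)
y_endog = 2.0 * x_endog + c + rng.normal(size=n)
t_endog = dwh_t_stat(y_endog, x_endog, z)

# Case 2: x is exogenous (no shared confounder)
x_exog = 0.5 * z + rng.normal(size=n)
y_exog = 2.0 * x_exog + rng.normal(size=n)
t_exog = dwh_t_stat(y_exog, x_exog, z)

print(t_endog, t_exog)
```

The test is only informative if the instrument itself is valid, since it takes the IV estimate as the consistent benchmark.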

14
Q

What do we mean by “indirect least squares”?

When can we use this?

How does it connect with the Wald estimator?

A

This is basically just dividing the reduced-form coefficient by the first-stage coefficient. This is only valid in the just-identified case.

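The reduced form yields the intention-to-treat (ITT) effect. Scaling the reduced form by the first stage gives the local average treatment effect (LATE), i.e. the effect for compliers, provided monotonicity holds in addition to relevance and validity. With one-sided non-compliance (no always-takers), LATE coincides with the average treatment effect on the treated (ATT). If the first stage = 1 (full compliance), the reduced form itself is the treatment effect.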

Using indirect least squares in the case with a binary instrument yields the Wald estimator
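
A minimal sketch of the Wald estimator with a binary instrument. The function name `wald_estimator` and the eight-observation toy data set are made up for illustration; the outcome is constructed as y = 3 + 2d, so the true effect is 2.

```python
import numpy as np

def wald_estimator(y, d, z):
    """Difference in mean outcomes divided by difference in mean treatment
    take-up, comparing z = 1 with z = 0."""
    z = np.asarray(z, dtype=bool)
    reduced_form = y[z].mean() - y[~z].mean()   # ITT
    first_stage = d[z].mean() - d[~z].mean()    # share induced into treatment
    return reduced_form / first_stage

# Toy data: imperfect compliance in both arms
z = np.array([1, 1, 1, 1, 0, 0, 0, 0])
d = np.array([1, 1, 1, 0, 0, 0, 0, 1], dtype=float)
y = 3.0 + 2.0 * d

wald = wald_estimator(y, d, z)
print(wald)
```

Here the reduced form is 4.5 - 3.5 = 1.0 and the first stage is 0.75 - 0.25 = 0.5, so the ratio recovers the effect of 2.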

15
Q

What is two-sample IV?

A

We can still use IV if the variables are spread over different data sets. That is, if we have data on the instrument and the endogenous variable in a big dataset 1, and data on the instrument and the outcome of interest in a smaller dataset 2, we can use “two-sample IV” (TSIV). The important thing is that the instrument exists in both datasets. We estimate the first stage with dataset 1 and the reduced form with dataset 2. This also decreases the risk of a weak instrument, since the first stage is estimated with the big dataset.

16
Q

What is the two-sample 2SLS (TS2SLS) procedure?

A

The procedure for TS2SLS is as follows:

  1. Estimate the 1st stage coefficients in “sample 1”
  2. Retain these and predict the endogenous regressor in “sample 2”
  3. Run the 2nd stage regression using the predicted values
  4. Adjust the standard errors, since you are using a generated regressor
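
Steps 1 to 3 can be sketched as below. The helper `simulate`, the sample sizes and the coefficients are assumptions for illustration, and step 4 (the standard-error adjustment) is deliberately left out.

```python
import numpy as np

rng = np.random.default_rng(5)

def simulate(n):
    """Hypothetical population: true effect of x on y is 2."""
    z = rng.normal(size=n)
    c = rng.normal(size=n)                   # unobserved confounder
    x = 0.5 * z + c + rng.normal(size=n)
    y = 2.0 * x + c + rng.normal(size=n)
    return z, x, y

z1, x1, _ = simulate(200_000)   # sample 1: instrument and endogenous regressor
z2, _, y2 = simulate(20_000)    # sample 2: instrument and outcome only

Z1 = np.column_stack([np.ones(len(z1)), z1])
Z2 = np.column_stack([np.ones(len(z2)), z2])

# 1. First-stage coefficients from sample 1
pi_hat = np.linalg.lstsq(Z1, x1, rcond=None)[0]
# 2. Predicted endogenous regressor in sample 2
x2_hat = Z2 @ pi_hat
# 3. Second stage in sample 2 using the predicted values
X2_hat = np.column_stack([np.ones(len(z2)), x2_hat])
beta_ts2sls = np.linalg.lstsq(X2_hat, y2, rcond=None)[0]

print(beta_ts2sls)
```

The naive second-stage standard errors from this regression should not be reported as they stand, for the reason given in step 4.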
17
Q

What is Jack-knife IV?

A

Jackknife IV (JIVE) is an IV estimator designed to reduce the finite-sample bias of 2SLS when the instruments are weak or numerous. The first-stage fitted value for observation i is computed from a first-stage regression that leaves out observation i, so the fitted value is not mechanically correlated with that observation's own error term. These leave-one-out fitted values are then used as the instrument for the endogenous variable. It should not be confused with the general jackknife, where the whole regression is re-estimated leaving out one observation at a time to see how sensitive the results are to individual observations.
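
A sketch of a jackknife IV estimator built on leave-one-out first-stage fitted values. This is one common formulation and is an assumption here, as are the function names and the simulated data. The leave-one-out prediction uses the leverage shortcut, so the first stage does not have to be re-run n times.

```python
import numpy as np

def leave_one_out_fitted(X, Z):
    """First-stage fitted values for each row, from a regression that omits that row."""
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    h = np.einsum("ij,jk,ik->i", Z, ZtZ_inv, Z)   # leverage of each observation
    fitted = Z @ (ZtZ_inv @ (Z.T @ X))            # ordinary first-stage fit
    return (fitted - h[:, None] * X) / (1.0 - h[:, None])

def jive(y, X, Z):
    """Use the leave-one-out fitted values as instruments for X."""
    X_loo = leave_one_out_fitted(X, Z)
    return np.linalg.solve(X_loo.T @ X, X_loo.T @ y)

rng = np.random.default_rng(8)
n, k = 5_000, 5
instruments = rng.normal(size=(n, k))
c = rng.normal(size=n)                            # unobserved confounder
x = instruments @ np.full(k, 0.3) + c + rng.normal(size=n)
y = 1.0 + 2.0 * x + c + rng.normal(size=n)        # true slope is 2

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), instruments])
X_loo = leave_one_out_fitted(X, Z)
beta_jive = jive(y, X, Z)
print(beta_jive)
```

The shortcut relies on the identity that the leave-one-out prediction equals the full-sample fit minus the leverage-weighted own value, divided by one minus the leverage.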

18
Q

What assumptions need to hold when we use an instrument to correct for measurement error?

A

What must hold is that $cov(z,u) = cov(z,e)= 0$.

It does not seem to be necessary that the instrument is “as good as randomly assigned”. What is needed is only that the measurement error in the instrument ($v$) is uncorrelated with the measurement error in our endogenous variable ($u$) and uncorrelated with the error term in our main equation ($e$). For example, we can have two measures of ability, IQ and KWW. We can then use IQ as a proxy for ability and KWW as an instrument for IQ, since the measurement errors in IQ and KWW are not correlated.

19
Q

Show how we can solve measurement errors using IV.

Let’s say that what we observe for x is

x = x* +u

where x* is the true value of x and u the measurement error.

Say we have an alternative measurement of x* that is

z=x*+v

A

See notes.
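
A sketch of one way to show this, which may differ from the notes. It assumes the structural equation $y = \beta x^* + e$ and classical measurement errors: $u$ and $v$ are uncorrelated with $x^*$, with $e$ and with each other.

Substituting $x^* = x - u$ gives an equation in the observed variable:

$$
y = \beta x + (e - \beta u)
$$

OLS is inconsistent because $x$ is correlated with the composite error through $u$ (attenuation bias):

$$
\operatorname{plim} \hat\beta_{OLS} = \frac{cov(x,y)}{var(x)} = \beta\,\frac{var(x^*)}{var(x^*)+var(u)}
$$

Using $z = x^* + v$ as an instrument for $x$:

$$
\operatorname{plim} \hat\beta_{IV} = \frac{cov(z,y)}{cov(z,x)} = \beta + \frac{cov(z, e-\beta u)}{cov(z,x)}
$$

$$
cov(z,x) = cov(x^*+v,\ x^*+u) = var(x^*) \neq 0, \qquad cov(z, e-\beta u) = cov(z,e) - \beta\, cov(z,u) = 0
$$

$$
\operatorname{plim} \hat\beta_{IV} = \beta
$$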

20
Q

What is a weak instrument and why is it problematic?

A

Weak instrument $\iff$ weak first stage. This refers to (near) violations of the relevance assumption. 2SLS is then biased towards OLS.
The asymptotic theory does not hold if we have weak instruments. We will make type-1 errors: we will find significant effects when there are none.

What is a weak instrument?
An instrument with a weak first stage, i.e. one that is only weakly correlated with the endogenous variable.

What is the effect of weak instruments?
Weak instruments create biased estimates. In small samples, 2SLS is always biased to some degree. Increasing the sample size fixes this if the relationship between Z and X is strong. If the instrument is weak, the bias persists even in larger samples, and 2SLS is biased towards OLS. Adding more bad instruments worsens the problem.
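
The bias towards OLS can be illustrated with a small Monte Carlo sketch. Everything here is an assumption for illustration: 20 instruments, 200 observations, a true effect of 0, and an error term that is correlated with the first-stage error so that OLS is biased upwards.

```python
import numpy as np

rng = np.random.default_rng(6)

def one_draw(pi, n=200, k=20, beta=0.0):
    """One simulated sample; returns the (2SLS, OLS) slope estimates."""
    Z = rng.normal(size=(n, k))
    v = rng.normal(size=n)
    e = 0.8 * v + 0.6 * rng.normal(size=n)   # e correlated with v, so x is endogenous
    x = Z @ np.full(k, pi) + v               # pi controls first-stage strength
    y = beta * x + e
    x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    b_2sls = (x_hat @ y) / (x_hat @ x)
    b_ols = (x @ y) / (x @ x)
    return b_2sls, b_ols

def median_estimates(pi, reps=500):
    draws = np.array([one_draw(pi) for _ in range(reps)])
    return np.median(draws, axis=0)

weak_2sls, weak_ols = median_estimates(pi=0.02)      # weak first stage
strong_2sls, strong_ols = median_estimates(pi=0.5)   # strong first stage

print(weak_2sls, weak_ols)
print(strong_2sls, strong_ols)
```

With the weak first stage, the median 2SLS estimate sits close to the biased OLS estimate rather than to the true value of 0; with the strong first stage it is close to 0.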

21
Q

What can we do if we have a weak instrument?

A
  • Drop all the irrelevant instruments (if we have more than one). This will increase the first-stage F-stat
  • Report just-identified IV estimates. These are less subject to bias caused by weak instruments.
  • Use more robust IV-estimators such as LIML or JIVE.
22
Q

What will happen if we have violations of the validity assumption in IV?

A

That is, we have $E[z'e]\neq 0$.

Hence, we have violations of the exclusion restriction and/or randomization. To evaluate this, we study

$$
\operatorname{plim} \hat \beta_{iv}=\beta+E[z'x]^{-1}E[z'e]
$$

Our estimate will clearly be biased, even asymptotically, if our instrument isn’t valid, $E[z'e]\neq 0$. Combining this with a weak instrument, $E[z'x] \to 0$, will exacerbate the bias. Thus, with this combination, we could get a bias that is potentially worse than just running an OLS.
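
A worked comparison for the scalar case, as a sketch. The notation $\rho_{ab}$ for the correlation between $a$ and $b$, $\sigma_a$ for standard deviations, and the numbers in the example are assumptions for illustration.

$$
\operatorname{plim} \hat\beta_{iv} - \beta = \frac{cov(z,e)}{cov(z,x)} = \frac{\rho_{ze}}{\rho_{zx}}\cdot\frac{\sigma_e}{\sigma_x}
$$

$$
\operatorname{plim} \hat\beta_{ols} - \beta = \frac{cov(x,e)}{var(x)} = \rho_{xe}\cdot\frac{\sigma_e}{\sigma_x}
$$

So the asymptotic bias of IV is larger than that of OLS whenever

$$
\left|\frac{\rho_{ze}}{\rho_{zx}}\right| > \left|\rho_{xe}\right|
$$

For example, with a slightly invalid instrument, $\rho_{ze}=0.05$, and a weak first stage, $\rho_{zx}=0.1$, the ratio is $0.5$. This exceeds an OLS endogeneity of $\rho_{xe}=0.3$, so IV would be worse than OLS in this case.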

If the instruments violate the exclusion restriction, we cannot use them to estimate the structural equation. But if the instrument is as good as randomly assigned, we can still run the reduced form (total derivative) and get the causal effect of our instrument on the outcome variable.