Instrumental Variables Flashcards

1
Q

What are instrumental variables?

A

Instrumental variables are used when X is correlated with the error term. Think of X as having two parts: one part that is correlated with the error term and one part that is not. We isolate the part that is not correlated with the error term by using instruments.

One study from the USA looked at how serving in the military affected future income. There is good reason to believe that people who volunteer for the military tend to come from poor neighborhoods, which is itself correlated with lower future income. There is also good reason to believe that people from wealthy neighborhoods have more money and influence and are therefore less likely to join. To fix this bias, the study used being drafted into the military as an instrument. Being drafted was effectively random. Using it as an instrument made it possible to split X into two parts and eliminate the effect that people in the military might come from poor neighborhoods.

2
Q

Endogeneity and exogeneity

A

Endogenous: variables that are correlated with the error term.
Exogenous: variables that are not correlated with the error term (they are determined outside of the model).

3
Q

What are the conditions for a valid instrument?

A

RELEVANT and EXOGENOUS: The two conditions for a valid instrument:

  1. Instrument relevance: if an instrument is relevant, then variation in the instrument is related to variation in X.
  2. Instrument exogeneity: Z is uncorrelated with the error term, so Z is correlated with Y solely through its correlation with X.

Relevant: the instrument actually affects X.

Exogenous: the instrument only affects Y through X.
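
The two conditions can also be written compactly. This is only a sketch in notation these cards do not use themselves: it assumes a single regressor X, a single instrument Z, an error term u, and writes the card's B1 as \beta_1.

```latex
\begin{align*}
Y_i &= \beta_0 + \beta_1 X_i + u_i, & \operatorname{corr}(X_i, u_i) &\neq 0 \quad \text{(X is endogenous)} \\
\text{Relevance:} &\quad \operatorname{corr}(Z_i, X_i) \neq 0 \\
\text{Exogeneity:} &\quad \operatorname{corr}(Z_i, u_i) = 0
\end{align*}
```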

4
Q

How do we use IV?

A
  • Z is correlated with X, but not with the error term. It has to satisfy the conditions of relevance and exogeneity.

Use Two-Stage Least Squares:

  1. Regress X on the instrumental variable (X as the dependent variable)
  2. Use the fitted values of X from this regression (X-hat) in place of X in the original regression

Let's call the instrument Z. If it satisfies the two conditions of relevance and exogeneity, we can estimate B1 by using an IV estimator called two-stage least squares (TSLS). TSLS is calculated in two stages. The first stage splits X into two parts: one part that is problematic and might be correlated with the error term, and another part that is problem-free. The second stage uses the problem-free part to estimate B1.

In the first stage you regress X on its instrument, which gives the fitted values X-hat.

You then use X-hat instead of X in the original regression of Y.

In our example, the intuition is that we only use the variation in military service that comes from being drafted, which eliminates the bias that arises because people who volunteer tend to earn less anyway.
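
The two stages can be sketched in a few lines of plain Python. This is a minimal illustration, not the study's method: it assumes one endogenous X and one instrument Z, and the simulated data, the helper names `ols` and `tsls`, and the true slope of 2 are all made up for the example.

```python
import random

def ols(x, y):
    """Return (intercept, slope) of a simple OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def tsls(z, x, y):
    """Two-stage least squares with one endogenous regressor and one instrument."""
    # Stage 1: regress X on Z and keep the fitted values X-hat.
    a0, a1 = ols(z, x)
    x_hat = [a0 + a1 * zi for zi in z]
    # Stage 2: regress Y on X-hat; the slope is the TSLS estimate of B1.
    return ols(x_hat, y)

# Simulated (hypothetical) data: the true B1 is 2, and X is endogenous
# because it shares the unobserved component c with the error term of Y.
rng = random.Random(0)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
c = [rng.gauss(0, 1) for _ in range(n)]  # unobserved confounder
x = [zi + ci + rng.gauss(0, 1) for zi, ci in zip(z, c)]
y = [1 + 2 * xi - 3 * ci + rng.gauss(0, 1) for xi, ci in zip(x, c)]

b0_ols, b1_ols = ols(x, y)
b0_iv, b1_iv = tsls(z, x, y)
print("OLS slope: ", round(b1_ols, 3))  # biased away from the true value 2
print("TSLS slope:", round(b1_iv, 3))   # should land close to 2
```

Running the second stage by hand like this gives the right coefficient, but the standard errors from that second regression are not correct, so in practice a dedicated TSLS routine is preferable.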

5
Q

How many instruments can we have? What do we call these models?

A

It is UNDERIDENTIFIED if it has fewer instruments than endogenous variables. The coefficients cannot be estimated.

It is EXACTLY IDENTIFIED if it has the same number of instruments as endogenous variables. It can now be estimated, but the exogeneity of the instruments cannot be tested. Hence, one needs a convincing argument based on economic knowledge to be confident that the instrument is the right one to use.

It is OVER-IDENTIFIED if it has more instruments than endogenous variables. It is now possible to test whether the instruments are EXOGENOUS (with the J-test), in addition to testing whether they are RELEVANT.
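
The three cases depend only on comparing two counts. The helper below is a hypothetical illustration (the function name and return strings are not from any library), with m the number of instruments and k the number of endogenous variables.

```python
def identification_status(m, k):
    """Classify an IV model by m instruments and k endogenous regressors."""
    if m < k:
        return "underidentified"     # cannot be estimated
    if m == k:
        return "exactly identified"  # can be estimated, exogeneity not testable
    return "overidentified"          # can be estimated, J-test is possible

print(identification_status(1, 2))
print(identification_status(1, 1))
print(identification_status(3, 1))
```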

6
Q

How do we test the instrumental variables?

A

  1. First step is to test for relevance:

First we regress X on all of its instruments (the first-stage regression).

H0: The instruments do not have any effect on X. If H0 is rejected, the instruments are relevant. A general rule of thumb is to look at the first-stage F-statistic. If it is greater than 10, we conclude that the instruments are relevant. If it is lower than 10, the instruments are weak.

  2. Second step is to test for exogeneity:
    We use a J-test on the TSLS residuals to see if the instruments are exogenous. If they are exogenous, the error term has a conditional mean of 0 given the instruments, so the instruments should not explain the residuals.

The null hypothesis of the J-test is that all of our instruments are exogenous, i.e. that none of them is related to the error term. We regress the TSLS residuals on all instruments (and any included exogenous variables), compute the F-statistic for the hypothesis that the coefficients on all instruments are zero, and form J = mF. Under H0, J follows a chi-squared distribution with (m - k) degrees of freedom, where m is the number of instruments and k is the number of endogenous variables. This is why the test requires an over-identified model (m > k).

We want to fail to reject H0. A p-value higher than 0.05 (or 0.10) means we find no evidence against exogeneity. It does not prove that the instruments are exogenous, and a rejection does not tell us which instrument is the problem.

7
Q

What are the two assumptions for using an instrumental variable?

A

It should be relevant (there should be a strong correlation with the endogenous explanatory variable) and exogenous (there should be no correlation with the error term).

8
Q

Instrumental variables: What test is used to check the second assumption, that an instrument is exogenous?

A

The J-statistic. It can only be used when the regression is over-identified, and it tests whether the instruments explain (are correlated with) the TSLS residuals. The null hypothesis is that the instruments and the error term are uncorrelated.

9
Q

What are some of the drawbacks of using IVs?

A
  1. It is difficult to find good instruments: they must capture enough of the variation in the endogenous variable while being uncorrelated with the error.
  2. The instrument is often only weakly correlated with the endogenous variable, which is a problem (weak instrument/low relevance).
  3. The OLS standard errors from a manually run second-stage regression are not correct, because they ignore that X-hat was estimated in the first stage (use the ones provided by the software's TSLS routine).
10
Q

Underidentified

A

The model has fewer instruments than endogenous variables. The coefficients cannot be estimated.

11
Q

EXACTLY IDENTIFIED

A

The model has the same number of instruments as endogenous variables. It can now be estimated, but the exogeneity of the instruments cannot be tested. Hence, one needs a convincing argument based on economic knowledge to be confident that the instrument is the right one to use.

12
Q

OVER-IDENTIFIED

A

The model has more instruments than endogenous variables. It is now possible to test whether the instruments are EXOGENOUS (with the J-test), in addition to testing whether they are RELEVANT.

13
Q

Which assumptions does an instrument need to satisfy?

A
  • Correlate with X, but not with the error term.
  • Relevance
  • Exogeneity
14
Q

How do we test instruments for relevance?

A
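  1. Regress X on all of its instruments (first stage, X = b0 + b1*Z + v)

We now have the first-stage regression.

H0: None of the instruments has an effect on X

  • If rejected, they are relevant
  • Look at the F-statistic; the rule of thumb is a first-stage F-statistic over 10

This test works even when the model is exactly identified. It is the exogeneity test (J-test) that cannot be run unless there are more instruments than endogenous variables.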
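
As a rough sketch of the rule of thumb, the first-stage F-statistic can be computed by comparing the first-stage regression with a regression on a constant only. The example assumes a single instrument and homoskedastic errors. The function name `first_stage_f` and the simulated data are made up for illustration.

```python
import random

def first_stage_f(z, x):
    """F-statistic for H0: b1 = 0 in the first stage X = b0 + b1*Z + v."""
    n = len(x)
    mean_z = sum(z) / n
    mean_x = sum(x) / n
    szz = sum((zi - mean_z) ** 2 for zi in z)
    szx = sum((zi - mean_z) * (xi - mean_x) for zi, xi in zip(z, x))
    b1 = szx / szz
    b0 = mean_x - b1 * mean_z
    # Restricted model (b1 = 0): X is explained by its mean only.
    ssr_restricted = sum((xi - mean_x) ** 2 for xi in x)
    # Unrestricted model: the first-stage regression itself.
    ssr_unrestricted = sum((xi - b0 - b1 * zi) ** 2 for zi, xi in zip(z, x))
    q = 1  # number of restrictions = number of instruments
    return ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / (n - 2))

# Hypothetical data: one strongly relevant and one nearly irrelevant instrument.
rng = random.Random(1)
n = 500
z = [rng.gauss(0, 1) for _ in range(n)]
x_strong = [0.8 * zi + rng.gauss(0, 1) for zi in z]
x_weak = [0.02 * zi + rng.gauss(0, 1) for zi in z]

for label, x in (("strong", x_strong), ("weak", x_weak)):
    f = first_stage_f(z, x)
    verdict = "relevant" if f > 10 else "weak"
    print(label, "instrument: F =", round(f, 2), "->", verdict)
```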

15
Q

Consequences of weak instruments

A

If the instruments are weak, the TSLS estimator will be

  • biased (in the direction of the OLS estimate), and
  • statistical inferences (standard errors, hypothesis tests, confidence intervals) can be misleading
16
Q

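Testing for weak instruments with a single X

A

First-stage F-test

H0: The coefficients on all instruments in the first-stage regression are zero (the instruments are irrelevant)

  • Rule of thumb: a first-stage F-statistic over 10 means the instruments are not weak; below 10 they are weak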
17
Q

How do we test for IV Exogeneity?

A
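  1. J-test (only possible when over-identified)

H0: ALL of our instruments are exogenous

  • Under H0, J follows a chi-squared distribution with (m - k) degrees of freedom (m = number of instruments, k = number of endogenous variables)
  2. Regress the TSLS residuals on all instruments and compute the F-statistic for the hypothesis that all instrument coefficients are zero
  3. Compute the J-test statistic

J = mF (number of instruments * F-statistic)

J less than the critical value: do not reject H0
J more than the critical value: reject H0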
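
The J-test steps can be sketched in plain Python for the smallest over-identified case. Everything here is an illustrative assumption: two instruments, one endogenous X, no other regressors, homoskedastic errors, simulated data, and made-up helper names `ols_fit` and `j_statistic`. With m - k = 1 degree of freedom, the 5% chi-squared critical value is 3.84.

```python
import random

def ols_fit(columns, y):
    """OLS of y on an intercept plus the given columns. Returns (beta, residuals)."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    a = [[sum(r[j] * r[k] for r in rows) for k in range(p)]
         + [sum(r[j] * yi for r, yi in zip(rows, y))] for j in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(p):
        pivot = max(range(i, p), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(p):
            if r != i:
                factor = a[r][i] / a[i][i]
                a[r] = [v - factor * w for v, w in zip(a[r], a[i])]
    beta = [a[i][p] / a[i][i] for i in range(p)]
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    return beta, resid

def j_statistic(instruments, x, y):
    """J = m * F, with F from regressing the TSLS residuals on the instruments."""
    n, m = len(y), len(instruments)
    # Stage 1: fitted values X-hat = X minus the first-stage residuals.
    _, v = ols_fit(instruments, x)
    x_hat = [xi - vi for xi, vi in zip(x, v)]
    # Stage 2: coefficients from regressing Y on X-hat.
    (b0, b1), _ = ols_fit([x_hat], y)
    # TSLS residuals are computed with the actual X, not X-hat.
    u_hat = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    # F-statistic for H0: all instrument coefficients are zero.
    _, e = ols_fit(instruments, u_hat)
    mean_u = sum(u_hat) / n
    ssr_restricted = sum((ui - mean_u) ** 2 for ui in u_hat)
    ssr_unrestricted = sum(ei ** 2 for ei in e)
    f = ((ssr_restricted - ssr_unrestricted) / m) / (ssr_unrestricted / (n - m - 1))
    return m * f

# Hypothetical data: z1 and z2 are both valid instruments for y_valid,
# but z2 affects y_invalid directly, which violates exogeneity.
rng = random.Random(2)
n = 2000
z1 = [rng.gauss(0, 1) for _ in range(n)]
z2 = [rng.gauss(0, 1) for _ in range(n)]
c = [rng.gauss(0, 1) for _ in range(n)]  # unobserved confounder
x = [a + b + ci + rng.gauss(0, 1) for a, b, ci in zip(z1, z2, c)]
y_valid = [1 + 2 * xi - 3 * ci + rng.gauss(0, 1) for xi, ci in zip(x, c)]
y_invalid = [yi + b for yi, b in zip(y_valid, z2)]

j_valid = j_statistic([z1, z2], x, y_valid)
j_invalid = j_statistic([z1, z2], x, y_invalid)
critical_value = 3.84  # chi-squared, 1 degree of freedom, 5% level
print("valid instruments:   J =", round(j_valid, 2))
print("invalid instrument:  J =", round(j_invalid, 2))
print("reject H0 in the invalid case:", j_invalid > critical_value)
```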

18
Q

Relevance means that

A

The variation in the instrument is related to the variation in X