Regression discontinuity Flashcards

1
Q

Internal and external validity of RD?

A

The RD design is generally regarded as having the greatest internal validity of all quasi-experimental methods. Its external validity may be quite limited, since the estimated treatment effect is local to the discontinuity.

2
Q

What 4 assumptions must hold for RD estimates to be causal?

A
  1. The assignment variable, x_i (often also called the “running” or “forcing” variable), cannot be caused by or influenced by the treatment. In other words, the assignment variable is measured prior to the start of treatment or is a variable that can never change.
  2. The cut-point is determined independently of the assignment variable (i.e. exogenous), and assignment to treatment is entirely based on the assignment variable and the cut-point.
  3. Nothing other than treatment status is discontinuous in the analysis interval. That is, there are no other relevant ways in which observations on one side of the cut-point are treated differently from those on the other side.
  4. The functional form representing the relationship between the assignment variable (also called the rating variable) and the outcome, which is included in the estimation model and can be represented by f(x_i), is continuous throughout the analysis interval absent the treatment and is specified correctly.
3
Q

Why is it important that individuals cannot manipulate the assignment variable in the vicinity of the threshold?

A

So that being just below or just above the cutoff is (almost) as good as random.

4
Q

Suppose that, when selecting students for a scholarship, the selection committee can look at which students received high scores and set the cut-point to ensure that certain students are included in the scholarship pool, or can give scholarships to students who did not meet the threshold. Is this OK for RD?

A

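No: this violates assumption 2 (the cut-point must be set independently of the assignment variable, and treatment must be assigned solely by the assignment variable and the cut-point).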

5
Q

Write out using potential outcomes framework: ATE at cutoff point for RD?

A

=E[Y_1i-Y_0i |x_i=c]
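
As a sketch of how this estimand is identified in a sharp design: assuming units with x_i at or above c are treated (an assumption made here for illustration) and that E[Y_0i | x_i] and E[Y_1i | x_i] are continuous at c, the effect at the cutoff equals the difference between the one-sided limits of the observed outcome. The symbol \tau_c is introduced here only as a label.

```latex
\[
\begin{aligned}
\tau_c &= E[\,Y_{1i} - Y_{0i} \mid x_i = c\,] \\
       &= \lim_{x \downarrow c} E[\,Y_i \mid x_i = x\,] \;-\; \lim_{x \uparrow c} E[\,Y_i \mid x_i = x\,]
\end{aligned}
\]
```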

6
Q

How is RD different from an RCT?

A

First, the functional form of the curves E[Y_0i |x_i] and E[Y_1i |x_i] need not be flat (or linear or monotonic)

Second, it may be possible for units (individuals, etc.) to alter their assignment to treatment by manipulating the forcing variable in a way that is not possible when it is assigned at random by the investigator.

7
Q

One plausible approach to finding differences for those right above and right below the cutoff would be to just take a difference in means-what’s one advantage of this approach? One challenge?

A

Advantage: you don’t need to know the functional form

Challenge: you need a lot of data for those people just above and just below the cutoff
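
A minimal sketch of the difference-in-means idea, using hypothetical noise-free data with a known jump of 2.0 at a cutoff of 50. The function name `diff_in_means`, the data, and the window widths are all assumptions made for illustration.

```python
from statistics import mean

def diff_in_means(x, y, cutoff, bandwidth):
    """Mean outcome just above the cutoff minus mean outcome just below it."""
    above = [yi for xi, yi in zip(x, y) if cutoff <= xi < cutoff + bandwidth]
    below = [yi for xi, yi in zip(x, y) if cutoff - bandwidth <= xi < cutoff]
    if not above or not below:
        raise ValueError("no observations on one side of the cutoff")
    return mean(above) - mean(below)

# Hypothetical data: slope of 0.5 in x plus a jump of 2.0 for x >= 50.
x = [float(v) for v in range(0, 101)]
y = [0.5 * xi + (2.0 if xi >= 50 else 0.0) for xi in x]

narrow = diff_in_means(x, y, cutoff=50, bandwidth=2)
wide = diff_in_means(x, y, cutoff=50, bandwidth=20)
print(narrow, wide)  # 3.0 12.0
```

Neither window recovers 2.0 exactly, because the slope in x leaks into the comparison; the error shrinks as the window narrows, but so does the number of observations available.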

8
Q

In this OLS formula, what is the causal effect of interest? (Ti is treatment and xi is the assignment variable.)

Yi = B0 + B1 Ti + B2 xi + ei

A

B1

9
Q

True or false: in RD, you need to recenter x so that it equals 0 at the cutoff

A

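True: centering x at the cutoff ensures that the coefficient on T is the treatment effect at the cutoff. This matters most once x is interacted with T, since the coefficient on T is then the jump at x = 0.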

10
Q

Is it possible to allow for different functions on each side of the cutoff? If so, how?

A

Yes-include interactions between x and T
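
One way to see what the interaction buys: a model with T, centered x, and T times x is equivalent to fitting a separate line on each side of the cutoff and comparing the two intercepts at the cutoff. The sketch below does exactly that on hypothetical noise-free data; `fit_line`, `sharp_rd`, and the numbers are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((xi - mx) ** 2 for xi in xs)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def sharp_rd(x, y, cutoff, bandwidth):
    """Jump at the cutoff from separate linear fits on each side."""
    xc = [xi - cutoff for xi in x]  # recenter so the cutoff sits at 0
    left = [(u, yi) for u, yi in zip(xc, y) if -bandwidth <= u < 0]
    right = [(u, yi) for u, yi in zip(xc, y) if 0 <= u <= bandwidth]
    a_left, _ = fit_line(*zip(*left))
    a_right, _ = fit_line(*zip(*right))
    return a_right - a_left  # difference in intercepts at x = cutoff

# Hypothetical data: slope 0.5 below the cutoff, slope 0.2 above, jump of 2.0 at x = 50.
x = [float(v) for v in range(30, 71)]
y = [1.0 + 0.5 * (xi - 50) if xi < 50 else 3.0 + 0.2 * (xi - 50) for xi in x]

estimate = sharp_rd(x, y, cutoff=50, bandwidth=20)
print(round(estimate, 6))  # 2.0
```

Because x is recentered, each intercept is the fitted outcome at the cutoff itself, so their difference plays the role of the coefficient on T.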

11
Q

What are two drawbacks of using OLS to estimate the discontinuity?

A

Global estimates of the regression function use data far from cutoff
Might look like a discontinuity when really you just have the wrong functional form

12
Q

What is one way to reduce the likelihood of detecting a jump when there isn’t one?

A

Look at data only in neighborhood around discontinuity

13
Q

What are implications for bias and variance of reducing bandwidth?

A

Decreased bias but increased variance

14
Q

What happens to bias and standard errors if you use extra regressors?

A

Likely decrease in bias (if you have a large bandwidth) and reduction in standard errors (greater precision)

15
Q

If the probability of treatment changes discontinuously at the threshold, but not from 0 to 1, can you still use RD?

A

Yes–fuzzy RD

16
Q

What does the fuzziness refer to in fuzzy RD?

A

change in probability of T

17
Q

In sharp RD, how do we obtain causal estimate?

A

estimate jump in outcome at cutoff

18
Q

In fuzzy RD, how do we obtain Wald estimate?

A

jump in outcome/jump in probability of T
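
A rough sketch of this ratio on hypothetical data in which about 80% of units above the cutoff and 20% below it are treated, and treatment raises the outcome by 5. The helper names (`fit_intercept`, `jump_at_zero`), the take-up pattern, and the use of local linear fits on each side are all assumptions made for illustration.

```python
def fit_intercept(xs, ys):
    """Intercept of the OLS line y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((xi - mx) ** 2 for xi in xs)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(xs, ys))
    return my - (sxy / sxx) * mx

def jump_at_zero(xc, v, bandwidth):
    """Right-minus-left intercept of linear fits; xc is centered at the cutoff."""
    left = [(u, vi) for u, vi in zip(xc, v) if -bandwidth <= u < 0]
    right = [(u, vi) for u, vi in zip(xc, v) if 0 <= u <= bandwidth]
    return fit_intercept(*zip(*right)) - fit_intercept(*zip(*left))

# Hypothetical centered assignment variable and imperfect take-up on both sides.
xc = [float(u) for u in range(-50, 51)]
treated = [1.0 if (u >= 0 and i % 5 != 0) or (u < 0 and i % 5 == 0) else 0.0
           for i, u in enumerate(xc)]
# Hypothetical outcome: smooth in xc, plus an effect of 5 for treated units.
y = [1.0 + 0.1 * u + 5.0 * t for u, t in zip(xc, treated)]

jump_y = jump_at_zero(xc, y, bandwidth=30)        # jump in the outcome
jump_t = jump_at_zero(xc, treated, bandwidth=30)  # jump in probability of treatment
wald = jump_y / jump_t
print(round(jump_t, 3), round(wald, 3))
```

The jump in the outcome alone understates the effect of 5 built into the data; dividing by the jump in treatment probability rescales it back.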

19
Q

What is a Type I fuzzy RD?

A

some T members don’t get T (no-shows)

20
Q

What is a Type II fuzzy RD?

A

Some T members don’t get T, and some C members do get T (cross-overs)

21
Q

If Pr(T|xi=c)=.8, what kind of RD is this?

A

Fuzzy, Type I

22
Q

If we accidentally estimated a fuzzy RD using a sharp RD design, what kind of estimates do we get?

A

Intent to treat (ITT) = average impact for those offered treatment, whether or not they participated (includes never-takers and always-takers)

23
Q

How do we recover the treatment effect using fuzzy rd?

A

jump in the outcome-assignment relationship / jump in the treatment status-assignment relationship

24
Q

What kind of effect are we able to estimate using RD?

A

LATE = impact of the program, at the cutoff, on individuals who were assigned and participated (compliers)

25
Q

If Ti=1 but Di=0, what do we call this?

A

never-taker (a no-show: assigned to treatment, Ti=1, but does not receive it, Di=0)

26
Q

If Ti=0 but Di=1, what do we call this?

A

always-taker (a crossover: not assigned to treatment, Ti=0, but receives it anyway, Di=1)

27
Q

What are the steps in fuzzy RD?

A

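Estimate the first stage using OLS (treatment received regressed on the assignment indicator and the function of the assignment variable)

Use the predicted values of treatment from the first stage in place of actual treatment in the second stage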

28
Q

What is the parametric approach to ensuring that functional form is specified correctly?

A

Try out a number of polynomial functions and choose the one that fits the data best

29
Q

What is the nonparametric approach to ensuring that functional form is specified correctly?

A

Choose optimal bandwidth

30
Q

When you are choosing optimal bandwidth, do you need to use the same bandwidth for 1st and second stage?

A

yes–need to be using the same sample

31
Q

what are 3 concerns to keep in mind when evaluating RD designs?

A
  1. Is this sharp or fuzzy?
  2. Are there any other changes at cutoff?
  3. Is there evidence of manipulation?
32
Q

How can I check if I should use a sharp or fuzzy design?

A

Plot it graphically: does Pr(T | xi) jump from 0 to 1 at the cutoff (sharp), or by less than that (fuzzy)?

33
Q

Why would manipulation be bad?

A

Because then E(y0i|xi) and E(y1i|xi) are not continuous at the cutpoint: the people just above and just below the cutoff are different (e.g. mercy passing, retakers)

34
Q

How can I check for evidence of manipulation?

A

Is there clumping in densities of assignment var just above and just below cutoff?
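
A crude sketch of this visual check: count observations in equal-width bins on either side of the cutoff and compare. This is only an eyeball-style comparison on hypothetical scores, not a formal density test; the function name, the scores, and the passing mark of 60 are assumptions.

```python
def counts_near_cutoff(x, cutoff, width):
    """Number of observations in the bins just below and just above the cutoff."""
    below = sum(1 for xi in x if cutoff - width <= xi < cutoff)
    above = sum(1 for xi in x if cutoff <= xi < cutoff + width)
    return below, above

# Hypothetical test scores: three students at each integer score from 40 to 79.
scores = [s for s in range(40, 80) for _ in range(3)]
# Hypothetical manipulation: scores of 58 and 59 are bumped up to the passing mark of 60.
bumped = [60 if s in (58, 59) else s for s in scores]

print(counts_near_cutoff(scores, cutoff=60, width=2))  # (6, 6)
print(counts_near_cutoff(bumped, cutoff=60, width=2))  # (0, 12)
```

A hole just below the cutoff paired with a pile-up just above it is the pattern to look for in a histogram of the assignment variable.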

35
Q

How can I formally test for evidence of manipulation?

A

McCrary test

36
Q

How can I check for whether there are discontinuities in observables just above and just below cutpoint?

A

estimate RD model with covariate as DV

you shouldn’t see a jump

37
Q

How can I test for jumps at non-discontinuity points?

A

Pick other points and test them
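
A minimal sketch of such a placebo check, assuming hypothetical noise-free data with a true jump of 2.0 at x = 0 and nowhere else. The helper names, the placebo points (-20 and 20), and the bandwidth are illustrative assumptions.

```python
def fit_intercept(xs, ys):
    """Intercept of the OLS line y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((xi - mx) ** 2 for xi in xs)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(xs, ys))
    return my - (sxy / sxx) * mx

def jump_at(x, y, cutoff, bandwidth):
    """Estimated jump at an arbitrary candidate cutoff, using linear fits on each side."""
    left = [(xi - cutoff, yi) for xi, yi in zip(x, y)
            if cutoff - bandwidth <= xi < cutoff]
    right = [(xi - cutoff, yi) for xi, yi in zip(x, y)
             if cutoff <= xi <= cutoff + bandwidth]
    return fit_intercept(*zip(*right)) - fit_intercept(*zip(*left))

# Hypothetical data: linear in x with a single jump of 2.0 at x = 0.
x = [float(v) for v in range(-50, 51)]
y = [0.5 * xi + (2.0 if xi >= 0 else 0.0) for xi in x]

for c in (-20.0, 0.0, 20.0):
    print(c, round(abs(jump_at(x, y, c, bandwidth=10)), 6))
```

Only the real cutoff should show a jump; a sizeable estimated jump at a placebo point would suggest the model is picking up something other than treatment.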

38
Q

When is it appropriate to use RD?

A

RD can be used when a precise rule based on a continuous characteristic determines treatment assignment. Examples:

39
Q

what is a sharp RD?

A

probability(T assignment) goes from 0 to 1 at threshold c

40
Q

How come we can say that having controlled for Xi, Di is exogenous?

A

There are no omitted variables correlated with Di and ui .

41
Q

True or false: In RD, there is no common support between groups

A

True–We are not comparing outcomes of units with the same X. Rather, we are extrapolating.

42
Q

What are consequences of not having common support?

A

Because of extrapolation, we have to rely heavily on functional form assumptions. The linear model may not be realistic, especially as we move away from the cutpoint.

43
Q

What is a nonparametric approach?

A

Estimating a regression for observations near the cutoff is a nonparametric approach, e.g. local linear regression or local polynomial

44
Q

What is a parametric approach?

A

A parametric approach expands the bandwidth and explicitly tries to model the relationship between Yi and Xi (e.g. with polynomial terms).

45
Q

What are tradeoffs of parametric v nonparametric?

A

The parametric approach uses a wider bandwidth and therefore a larger N, so it gains precision at the cost of more potential bias; the nonparametric approach gives up precision in exchange for less bias.

46
Q

What are two ways of addressing validity of RD?

A

Can estimate the same model for outcomes not expected to be affected by the discontinuous shift in treatment.

Can estimate the same model using pre-treatment characteristics. Would not expect to see discontinuous shifts in these at the cutpoint.

47
Q

What effect does difference in means around an extremely narrow bandwidth near c represent?

A

LATE
but note that in practice researchers still use local linear regression to model relationship between Y and X around the cutoff.

48
Q

What are implications of choosing larger bandwidth for bias and precision?

A

Larger bandwidths provide more observations (can improve precision) but also introduce bias if observations away from the cutpoint are systematically different (and/or the model you fit is imperfect).

49
Q

3 RD assumptions?

A

  1. The relationship between Yi and Xi is continuous in the neighborhood of c. There is no reason to expect a sharp break in Yi in the absence of treatment.
  2. Xi has not been manipulated to affect who receives treatment.
  3. There are no other programs or services with the same eligibility rule (to avoid confounding with some other treatment).

50
Q

How can you test for other discontinuities in f(xi) (other than at c)?

A

Regress Y on a high-order polynomial in X and include dummy
variables for values of X above various quantiles of X (e.g., deciles). Conduct an F-test for significance of these dummy variables.
If the relationship between Y and X is generally smooth, there should not be significance.

51
Q

what is manipulation?

A

occurs whenever cases have their value of X altered in order to affect their treatment status. For example, a teacher might adjust a test score in order to help a student pass or become eligible for a program.

52
Q

How can you visually see manipulation?

A

This may be visible in a histogram, or not if some with Xi < c have their Xi increased but others with Xi ≥ c have their Xi reduced.

53
Q

When is manipulation a problem?

A

when it’s not random!
If manipulation is random or uninformed, such that the expected value of Yi in the absence of treatment is no different for those whose X has been manipulated, then manipulation will not pose a problem. Manipulation is not usually random, however.

If the “best” of those on the margin are nudged into the treatment by manipulation of X, this will alter the equivalence of those below and above c. By removing these cases from the control group we may estimate an effect where there is none.

54
Q

What does the McCrary test do?

A

It looks for manipulation by testing for a discontinuity (clustering) in the density of the assignment variable at the cutoff.

55
Q

what characterizes fuzzy RD?

A

Treatment status (e.g., program participation) increases sharply at c but there is non-compliance.

56
Q

T and D: which is treatment assignment and which is status?

A

D is status, T is assignment

57
Q

what two variables are in the first stage of fuzzy RD?

A

The first equation shows the relationship between treatment status Di and treatment assignment Ti (whether or not i is below or above the threshold).

58
Q

what does the reduced form equation in second stage of fuzzy RD show?

A

The second equation shows the relationship between the outcome Yi and Ti (whether or not i is below or above the threshold).

59
Q

Why is the jump diluted in fuzzy RD?

A

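Because of non-compliance: treatment status does not switch for everyone at c, so the jump in the outcome reflects only a partial change in treatment and is “diluted”.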

60
Q

What is the local Wald estimate?

A

the reduced form coefficient on Ti divided by the first stage coefficient on Ti .
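
Written out as a sketch, with Di as treatment status and Ti as assignment, the two equations and their ratio look as follows. The coefficient names (\alpha, \gamma), the smooth functions f and g of the centered assignment variable, and the error terms are notation assumed here for illustration.

```latex
\[
\begin{aligned}
\text{First stage:}  \quad & D_i = \alpha_0 + \alpha_1 T_i + f(x_i - c) + v_i \\
\text{Reduced form:} \quad & Y_i = \gamma_0 + \gamma_1 T_i + g(x_i - c) + \varepsilon_i \\
\text{Local Wald estimate:} \quad & \hat{\tau}_{\text{Wald}} = \frac{\hat{\gamma}_1}{\hat{\alpha}_1}
\end{aligned}
\]
```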

61
Q

Formula for TOT for Type I fuzzy RD (there are no always-takers)

A

$\mathrm{TOT}_c = \mathrm{ITT}_c / \bar{T}^{+}$, where $\bar{T}^{+}$ is the proportion treated above c.

62
Q

What is the issue with Type II fuzzy RDs?

A

With “crossovers” (individuals below c who are treated), it is not as easy to partition the sample into “treated” and “no-shows”, and it is harder to rationalize the equivalence of those above and below c.

63
Q

what are compliers? Can we estimate their T effect?

A

Compliers receive treatment if and only if they are assigned to it. Their treatment status changes at c, so they contribute to a treatment contrast and their treatment effect can be estimated.

64
Q

what are always-takers? Can we estimate their T effect?

A

receive treatment regardless of assignment. Their treatment status is unchanged, so it is impossible to estimate a treatment effect for them.

65
Q

what are never-takers? Can we estimate their T effect?

A

do not receive treatment regardless of assignment. Their treatment status is unchanged, so it is impossible to estimate a treatment effect for them.

66
Q

Can we determine which group people are in? (ie always-takers, never-takers, compliers)

A

no

67
Q

what do we assume about the relationship between proportion of always takers and proportion of never takers?

A

The proportions of always-takers and never-takers are assumed to be the same just above and just below the cutoff. (The proportions may vary with the running variable X, but are continuous through c.)

68
Q

Which group contributes to treatment contrast?

A

Because compliers are the only ones whose treatment status is affected by the discontinuity in T at c, they are the only subgroup that contributes to the treatment contrast.

69
Q

What kind of effect do we get from fuzzy RD?

A

A local average treatment effect (LATE) at the cutpoint: the average effect for compliers near c.

70
Q

Formula for LATE for Type II fuzzy RD?

A

$\mathrm{ITT}_c / (\bar{T}^{+} - \bar{T}^{-})$
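
A worked example with made-up numbers may help fix the arithmetic. Here $\bar{T}^{+}$ and $\bar{T}^{-}$ denote the proportions treated just above and just below c, and the values 3, 0.8, and 0.2 are purely hypothetical.

```latex
\[
\mathrm{LATE}_c
  = \frac{\mathrm{ITT}_c}{\bar{T}^{+} - \bar{T}^{-}}
  = \frac{3}{0.8 - 0.2}
  = 5
\]
% With no crossovers (Type I), \bar{T}^{-} = 0 and the same formula reduces to
\[
\mathrm{TOT}_c = \frac{\mathrm{ITT}_c}{\bar{T}^{+}} = \frac{3}{0.8} = 3.75
\]
```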

71
Q

One drawback of RD re heterogeneous effects?

A

If treatment effects are heterogeneous the RD may tell us little about impact away from c. The population near c may not be the one of greatest interest.

72
Q

what do you use to determine optimal bandwidth?

A

cross validation