Assumptions Flashcards

1
Q

What is the SUTVA assumption?

A

The potential outcomes of an individual i do not depend on the treatments received by other individuals.

2
Q

ATT definition

A

Mean difference between the observed outcomes and the counterfactual outcomes for the treatment group: E[Y1 - Y0|D = 1]

3
Q

ATC definition

A

Mean difference between the observed outcomes and the counterfactual outcomes for the control group: E[Y1 - Y0|D = 0]

4
Q

ATE definition

A

Mean treatment effect across the entire population, whether or not individuals actually participate: E[Y1 - Y0]

5
Q

Mean Independence Assumption (MIA): what is it, and what does it mean if it holds?

A

E[Y0|D = 0] = E[Y0|D = 1] and E[Y1|D = 0] = E[Y1|D = 1].

If it holds, average potential outcomes do not depend on treatment status, as when the data are generated by a randomised experiment.

The treated and untreated are balanced on average.

6
Q

What happens to the treatment effects under randomisation? And why?

A

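DIM = ATT (no BB) = ATE (no BB or DTE) = ATC

Randomisation makes potential outcomes independent of treatment, so the difference in means (DIM) carries no baseline bias (BB) and no differential treatment effect (DTE) bias.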
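
One way to see why, sketched in the potential-outcomes notation used in these cards. The derivation assumes that randomisation makes Y0 and Y1 mean-independent of D; the underbrace labels are added here for illustration.

```latex
\begin{align*}
\text{DIM} &= E[Y \mid D=1] - E[Y \mid D=0] \\
           &= E[Y_1 \mid D=1] - E[Y_0 \mid D=0] \\
           &= \underbrace{E[Y_1 - Y_0 \mid D=1]}_{\text{ATT}}
            + \underbrace{E[Y_0 \mid D=1] - E[Y_0 \mid D=0]}_{\text{BB}} \\[4pt]
\text{Under randomisation: } \text{BB} &= 0, \qquad
E[Y_1 - Y_0 \mid D=1] = E[Y_1 - Y_0 \mid D=0] = E[Y_1 - Y_0] \\
\Rightarrow\ \text{DIM} &= \text{ATT} = \text{ATC} = \text{ATE}
\end{align*}
```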

7
Q

What are some issues with experiments, and threats to internal validity?

A

They can be costly, impractical and sometimes impossible.

Threats to internal validity:
Hawthorne effect - People react to being observed
John Henry effect - People react to being in the control group

8
Q

What is the CIA?

A

The Conditional Independence Assumption (CIA) holds when the potential outcomes are independent of treatment, conditional on the observed characteristics X.

9
Q

What is the CMIA, and what does it mean if it holds?

A

If it holds, the selection bias disappears after conditioning on the observed characteristics X, because treatment is then as good as random. Treated and untreated individuals with the same X will, on average, have the same potential outcomes.

E[Y0|D = 1, X] = E[Y0|D = 0,X]

10
Q

What are the 2 assumptions to calculate the treatment effect from observational data?

A

E[Y1|D = 1, X] = E[Y1|D = 0, X]
E[Y0|D = 1, X] = E[Y0|D = 0, X]

11
Q

Different types of individuals based on the potential treatments

A

Always Takers: D(1) = 1 and D(0) = 1
Never Takers: D(1) = 0 and D(0) = 0
Compliers: D(1) = 1 and D(0) = 0
Defiers: D(1) = 0 and D(0) = 1

12
Q

What are 2 assumptions for making calculations on ‘Always takers’ etc?

A

No defiers (monotonicity)
The instrument is as good as randomly assigned, so it is independent of potential outcomes and potential treatments

13
Q

What makes a valid instrument?

A
  1. Has a causal effect on treatment (First Stage)
  2. It’s as good as randomly assigned
  3. It affects outcomes only through treatment (Exclusion Restriction)
14
Q

Benefits of the 2SLS

A

1) Allows use of multiple instruments
2) Controls for exogenous variables
3) Controls for observable characteristics X

15
Q

What do the elements in this regression mean?

M_bar[a] = alpha + rho D[a] + gamma a + e[a]

A

M_bar[a] - Average mortality at value a of the running variable
rho - Estimate of the jump exactly at the threshold
gamma - Slope coefficient on the running variable
D[a] - Treatment dummy
a - Running variable
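
A minimal sketch of how rho could be estimated from a regression of this form. Everything here is an assumption for illustration: the data are made up and noise-free (intercept 90, jump 8, slope -1.5, threshold at age 21), and `ols` is a small hand-written helper, not a library function.

```python
# Illustrative sharp RD: M_bar[a] = alpha + rho * D[a] + gamma * a + e[a].
# All numbers are made up for this sketch; the error term e[a] is set to zero.

def ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y, Gauss-Jordan."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

THRESHOLD_MONTHS = 21 * 12                  # treatment switches on at age 21
months = range(19 * 12, 23 * 12)            # ages 19 up to (not including) 23
age = [m / 12 for m in months]              # running variable a
treated = [1.0 if m >= THRESHOLD_MONTHS else 0.0 for m in months]  # D[a]
mortality = [90 + 8 * d - 1.5 * a for d, a in zip(treated, age)]   # M_bar[a]

X = [[1.0, d, a] for d, a in zip(treated, age)]
alpha_hat, rho_hat, gamma_hat = ols(X, mortality)
print(f"rho_hat = {rho_hat:.3f}, gamma_hat = {gamma_hat:.3f}")
```

Because the illustrative data contain no noise, the fitted rho and gamma match the values used to build them; with real data the same regression gives estimates with sampling error.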

16
Q

Why might a simple linear model produce misleading estimates? Regression discontinuity

A

If the relationship between the outcome and the running variable is not linear, the curvature can be mistaken for a jump.

If the relationship between the variables is not the same on each side of the discontinuity.

17
Q

What identification assumptions need to be made for an estimate to be causal? RD

A
  1. Independence of potential outcomes either side of the discontinuity
  2. No OVB in the estimating equation, which implies rho is causal
  3. No other ‘jumps’ at D (threshold)
18
Q

How do you test the causal estimate assumptions? RD

A

2 possible tests

  1. See if other observable characteristics are balanced either side of the discontinuity
  2. Ensure that there is no manipulation of the running variable. (If there is no manipulation, the density of a is smooth around the discontinuity)
19
Q

What is manipulation?

A

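Manipulation is when individuals can choose or influence their value of the running variable, and so sort themselves to one side of the threshold -> ideally the running variable is something that cannot be chosen, like age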

20
Q

Pros and cons of a narrower bandwidth

A

Pros - Less likely to be misspecified, so the estimate of rho is closer to the true value

Cons - Uses less data, so the estimate is less precise

21
Q

What is the difference between standard error and standard deviation?

A

Standard error is a measure of how much the sample mean would vary if it were computed from lots of different samples.

Standard deviation is a measure of how much observations vary from one another.
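
A small sketch of the distinction, assuming made-up observations and the usual formula SE = SD / sqrt(n) for the standard error of a mean. The helper name `standard_error` is chosen here for illustration.

```python
import math
import statistics

sample = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up observations
bigger_sample = sample * 4          # same spread, four times as many observations

def standard_error(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

sd_small, sd_big = statistics.stdev(sample), statistics.stdev(bigger_sample)
se_small, se_big = standard_error(sample), standard_error(bigger_sample)

print(f"SD: {sd_small:.3f} -> {sd_big:.3f}")   # spread of observations, roughly unchanged
print(f"SE: {se_small:.3f} -> {se_big:.3f}")   # uncertainty about the mean, shrinks with n
```

The standard deviation describes the data and stays about the same as the sample grows, while the standard error describes the estimate of the mean and falls as n increases.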

22
Q

What is the relationship between regression and the CEF, and what does it mean for a model to be saturated?

A

Regression is an approximation to the CEF. If the regression model is saturated, it has as many parameters as the CEF has values, so it reproduces the CEF exactly. (This is another way of estimating a naive comparison of means.)

CEF: E[Y|D]
The CEF is just an average -> this does not mean it's causal

23
Q

Baseline Bias

A

Difference in average outcome, in absence of treatment, between the treated and untreated.

E[Y0|D=1] - E[Y0|D=0]

24
Q

DTE Bias

A

The benefit of the treatment (causal effect) is not the same for those who are treated and those who are untreated.

If positive, the treated gain more.

(1-pi){E[Y1-Y0|D=1] - E[Y1-Y0|D=0]}

Where pi is the proportion of the sample who are treated
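
A worked check that the difference in means equals ATE + baseline bias + DTE bias. This is only a sketch: the potential outcomes below are hypothetical, and in real data only one of Y0 or Y1 is ever observed for each individual.

```python
# Each tuple is (Y0, Y1, D) for one hypothetical individual.
people = [(4, 9, 1), (6, 11, 1), (2, 4, 0), (3, 5, 0), (4, 6, 0)]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

treated = [p for p in people if p[2] == 1]
untreated = [p for p in people if p[2] == 0]

pi = len(treated) / len(people)                    # share of the sample treated
att = mean(y1 - y0 for y0, y1, _ in treated)       # E[Y1-Y0|D=1]
atc = mean(y1 - y0 for y0, y1, _ in untreated)     # E[Y1-Y0|D=0]
ate = mean(y1 - y0 for y0, y1, _ in people)        # E[Y1-Y0]

# Baseline bias: E[Y0|D=1] - E[Y0|D=0]
bb = mean(y0 for y0, _, _ in treated) - mean(y0 for y0, _, _ in untreated)
# DTE bias: (1 - pi) * (ATT - ATC)
dte = (1 - pi) * (att - atc)
# Naive difference in observed means: E[Y1|D=1] - E[Y0|D=0]
dim = mean(y1 for _, y1, _ in treated) - mean(y0 for y0, _, _ in untreated)

print(f"DIM = {dim:.1f}, ATE + BB + DTE = {ate + bb + dte:.1f}")
```

In this example ATE = 3.2, BB = 2 and DTE = 1.8, which add up to the difference in means of 7.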

25
Q

Why do matching and regression produce different results?

A

Matching groups (‘matches’) individuals who have the same observable characteristics and computes the ATE/ATT for each group.

OLS is simply a weighted average of the treatment effects of these groups, with more weight on groups where treatment status varies most.

OLS uses all observations, even those off common support. OLS is much quicker and easier.

26
Q

What happens if there is OVB?

A

If there is OVB, the CIA doesn’t hold and therefore regression estimates will be biased.

27
Q

What is the difference between Long and Short Regression?

A

The long regression controls for selection by including a control variable (for example, a dummy), whereas the short regression does not. This means the short regression gives a biased estimate, and the difference between the 2 is the OVB, which reflects the baseline and DTE bias left uncontrolled.

28
Q

OVB formula and explain each element?

A

Effect of D in short (biased) =

Effect of D in long (unbiased) +

Relationship between omitted and included (pi1 in aux regression: X = pi0 + pi1 D)

x effect of omitted in long (gamma in long regression)

So OVB: beta(short) - beta(long) = pi1 x gamma
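
A numerical check of the OVB formula, as a sketch on made-up data. The helper `ols` is hand-written for this example (hypothetical, not a library call); the identity beta(short) - beta(long) = pi1 x gamma holds exactly in any sample when all three regressions are estimated by OLS.

```python
def ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y, Gauss-Jordan."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

D = [0, 0, 0, 0, 1, 1, 1, 1]       # treatment (made-up data)
X = [1, 2, 1, 3, 3, 4, 2, 5]       # the variable omitted from the short regression
Y = [3, 5, 4, 7, 9, 12, 7, 13]     # outcome

_, beta_short = ols([[1, d] for d in D], Y)                       # short: Y on D
_, beta_long, gamma = ols([[1, d, x] for d, x in zip(D, X)], Y)   # long: Y on D and X
_, pi1 = ols([[1, d] for d in D], X)                              # aux: X on D

ovb = beta_short - beta_long
print(f"OVB = {ovb:.4f}, pi1 x gamma = {pi1 * gamma:.4f}")
```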

29
Q

Advantages of Regression and what does OLS give?

A

We can add observable characteristics (X) and use them as control variables and if CIA holds, OLS gives estimator of the Average Treatment Effect (ATE).

30
Q

What are potential omitted variables? In the aux regression?

A

Potential omitted variables are anything that's correlated with the treatment and also affects the outcome.

Then in X = pi0 + pi1 D, if pi1 is significant, X is correlated with the treatment.

31
Q

What is a bad control?

A

A bad control is a variable which is itself an outcome variable: something which might be affected by the treatment.

Be careful with these; otherwise, more controls are usually better.

32
Q

Residuals properties

A

Variance - How well the regression fits the data (R squared)
Regression residuals average to 0 and are uncorrelated with the regressors.

e(i) = Y(i) - Y_hat(i)

33
Q

Regression standard errors

A

For a sample, we estimate Beta with Beta_hat:

SE(B_hat) = sigma(e)/sqrt(n) x 1/ sigma(x)

sigma(e) is the square root of the residual variance; sigma(x) is the standard deviation of the regressor

34
Q

How to calculate an IV estimate?

A

You have to calculate the Wald ratio (lambda = rho/phi), which is equal to the reduced form divided by the first stage.

Z->Y / Z->D

35
Q

How do you calculate the first stage in IV?

A

Z -> D

E[Di|Zi = 1] - E[Di|Zi = 0] = phi

36
Q

How do you calculate the reduced form in IV?

A

Z -> Y

E[Yi|Zi = 1] - E[Yi|Zi = 0] = rho
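
Putting the first stage, the reduced form and the Wald ratio together, as a sketch on a tiny made-up lottery dataset. The data and the helper names `mean` and `mean_by_z` are assumptions for illustration.

```python
# Each tuple is (Z, D, Y): lottery offer, treatment taken, outcome (made-up data).
data = [
    (1, 1, 10), (1, 1, 12), (1, 1, 11), (1, 0, 7),
    (0, 1, 11), (0, 0, 7), (0, 0, 8), (0, 0, 6),
]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def mean_by_z(index, z_value):
    """Average of column `index` among units with Z == z_value."""
    return mean(row[index] for row in data if row[0] == z_value)

phi = mean_by_z(1, 1) - mean_by_z(1, 0)   # first stage: effect of Z on D
rho = mean_by_z(2, 1) - mean_by_z(2, 0)   # reduced form: effect of Z on Y
lam = rho / phi                           # Wald ratio: reduced form / first stage

print(f"phi = {phi}, rho = {rho}, lambda = {lam}")
```

Here the offer raises take-up by 0.5 and the mean outcome by 2, so the IV estimate is 2 / 0.5 = 4.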

37
Q

What is one thing to remember about lambda (Wald Ratio)?

A

It is a Local Average Treatment Effect (LATE), meaning it’s only an average for a certain group, this group being the compliers - those who obey their lottery outcome.

38
Q

3 assumptions to identify a causal effect with DiD

A

1) Treatment and control group

2) These treatment and control groups are comparable

3) There is info on treatment and control group, before and after the treatment occurs.

39
Q

Common (Parallel) Trends Assumption

A

In the absence of treatment, the difference between the treatment and control group remains constant over time.
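
A minimal 2x2 DiD calculation that relies on this assumption. The group means are hypothetical numbers chosen for illustration.

```python
# Hypothetical group means of the outcome before and after the treatment.
means = {
    ("treatment", "before"): 10.0,
    ("treatment", "after"): 15.0,
    ("control", "before"): 8.0,
    ("control", "after"): 10.0,
}

change_treatment = means[("treatment", "after")] - means[("treatment", "before")]
change_control = means[("control", "after")] - means[("control", "before")]

# Under common trends, the control group's change stands in for the change the
# treatment group would have experienced without treatment.
counterfactual_after = means[("treatment", "before")] + change_control
did = change_treatment - change_control

print(f"Counterfactual treated mean = {counterfactual_after}, DiD = {did}")
```

The treatment group rose by 5 and the control group by 2, so the DiD estimate is 3; the initial gap of 2 between the groups is differenced out.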

40
Q

How would you violate the common (parallel) trends assumption?

A

An unobserved variable (X) that is correlated with the treatment and changes at the same time as the treatment violates it -> causes OVB.

41
Q

4 Pros for using DiD Regression

A

1) Can easily calculate standard errors for DiD

2) Treatment can be continuous, not just binary

3) Easily add control variables

4) Easily add additional time periods

42
Q

MLDA vs Real effect Vs Spurious Effect

A

MLDA - parallel trends assumption holds -> simple model

Real - Trends aren’t parallel; there is a DiD effect as well as a differential time effect

Spurious - Trends aren’t parallel; there is no DiD effect, only a differential time effect

43
Q

2 Problems with normal standard errors in DiD, how to fix

A

For panel data (repeated observations on the same units over time), normal standard errors can be a poor estimate of the uncertainty of our estimate.

1) Heteroskedasticity -> Use robust SE

2) Serial correlation -> Use clustered standard errors - they relax the assumption that observations are independent (need a reasonable number of clusters)

44
Q

Why do we use 2SLS and steps involved in the process?

A

Used when we have more than one instrument: rather than calculating a separate IV estimate for each, 2SLS combines them into a single estimate.

1) First stage: regress the treatment on the instrument(s)

2) Calculate the fitted values (no residual)

3) Estimate the second stage: regress the outcome on the fitted values

4) Finally, can add control variables simply by adding them to first and second stage.
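
The first three steps can be sketched by hand for the simplest case of one binary instrument and no controls. The data and the helper `simple_ols` are made up for illustration; in this special case the second-stage slope coincides with the Wald ratio.

```python
# Each tuple is (Z, D, Y): instrument, treatment, outcome (made-up data).
data = [
    (1, 1, 10), (1, 1, 12), (1, 1, 11), (1, 0, 7),
    (0, 1, 11), (0, 0, 7), (0, 0, 8), (0, 0, 6),
]
z = [row[0] for row in data]
d = [row[1] for row in data]
y = [row[2] for row in data]

def mean(values):
    return sum(values) / len(values)

def simple_ols(x, y):
    """Return (intercept, slope) from regressing y on a single regressor x."""
    x_bar, y_bar = mean(x), mean(y)
    slope = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
             / sum((a - x_bar) ** 2 for a in x))
    return y_bar - slope * x_bar, slope

# Step 1: first stage, regress treatment D on instrument Z.
fs_intercept, fs_slope = simple_ols(z, d)
# Step 2: fitted values of D (the residual is discarded).
d_hat = [fs_intercept + fs_slope * zi for zi in z]
# Step 3: second stage, regress outcome Y on the fitted values.
_, lam_2sls = simple_ols(d_hat, y)

print(f"first stage = {fs_slope}, 2SLS estimate = {lam_2sls}")
```

Running the two stages by hand like this reproduces the point estimate, but the standard errors reported by an ordinary second-stage regression are not valid, which is one reason dedicated 2SLS routines are normally used.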

45
Q

How can you test whether randomisation was successful?

A

Can use a t-test (for continuous variables) to test whether there is a statistically significant difference in average baseline characteristics between the treatment and control groups. A p-value < 0.05 (|t statistic| > 2) rejects the null of no difference, which suggests randomisation did not balance the groups.
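
A sketch of such a balance check on one baseline characteristic. Age, the data values and the helper `se_of_mean` are assumptions chosen for illustration; the t statistic uses the unpooled standard error of the difference in means.

```python
import math
import statistics

# Made-up baseline ages for each arm of a hypothetical experiment.
age_treatment = [30, 32, 34, 36]
age_control = [31, 33, 35, 37]

def se_of_mean(values):
    """Standard error of a sample mean: SD / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

diff = statistics.mean(age_treatment) - statistics.mean(age_control)
se_diff = math.sqrt(se_of_mean(age_treatment) ** 2 + se_of_mean(age_control) ** 2)
t_stat = diff / se_diff

# Rule of thumb from the card: |t| > 2 signals a significant imbalance.
balanced = abs(t_stat) < 2
print(f"difference = {diff}, t = {t_stat:.3f}, balanced = {balanced}")
```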

46
Q

Confidence Intervals

A

[Y_bar - 2 x SE(Y_bar), Y_bar + 2 x SE(Y_bar)]

47
Q

T - test

A

Want to test whether E[Y] = mu

t(mu) = (Y_bar - mu) / SE(Y_bar)

Or t(0) = Y_bar / SE(Y_bar)
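
A short worked example of the t statistic, together with the interval of two standard errors either side of the mean. The sample and the hypothesised mean mu = 4 are made up for illustration.

```python
import math
import statistics

sample = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up observations
mu = 4                              # hypothesised mean under the null

y_bar = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(len(sample))   # SE(Y_bar)

t_stat = (y_bar - mu) / se                  # t(mu)
ci = (y_bar - 2 * se, y_bar + 2 * se)       # Y_bar plus or minus 2 x SE
reject = abs(t_stat) > 2                    # rule of thumb for rejecting the null

print(f"t = {t_stat:.3f}, CI = ({ci[0]:.3f}, {ci[1]:.3f}), reject = {reject}")
```

Here the sample mean is 5 and the t statistic is below 2, so the null E[Y] = 4 is not rejected; equivalently, 4 lies inside the interval.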

48
Q

What is non-compliance, and what does it do to randomisation?

A

Non-compliance occurs when participants do not adhere to their assigned treatment or control group, so the treatment actually received deviates from the intended random assignment.

This causes selection bias, which affects results.

49
Q

What can we do to combat non-compliance?

A

We could use Intention to Treat (ITT) analysis, which estimates the impact of offering the treatment, as opposed to the impact of the treatment itself.

ITT maintains the benefits of random assignment, so it provides an unbiased estimate of the causal effect of being offered the treatment.