Causal inference Flashcards
Explain the consistency assumption
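That the outcome you observe for a unit that actually received treatment a equals its potential outcome under a: Y = Y^a whenever A = a. This links observed data to potential outcomes: E(Y|A=a, X=x) = E(Y^a|A=a, X=x)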
Explain the ignorability assumption
That there are no unmeasured confounders: we have controlled for all relevant confounders X, so that within levels of X the treatment is as good as randomized (Y^0 and Y^1 are independent of A given X). For example (in an extremely simplified model where the only confounder X is age), older people are more likely both to get a medicine and to die within the next 5 years (outcome Y). But if you control for age, who gets the medicine is effectively random within each age group, and you can draw causal conclusions about its effect.
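A minimal Python sketch of the age example above. All probabilities, names and the sample size are made-up assumptions for illustration; the only point is that the naive comparison and the age-adjusted comparison can disagree in sign.

```python
import random

random.seed(0)

def simulate_patient():
    # Hypothetical data-generating process: age is the only confounder.
    old = random.random() < 0.5
    # Older patients are more likely to receive the medicine.
    treated = random.random() < (0.8 if old else 0.2)
    # Baseline death risk depends on age; the medicine lowers it by 0.10.
    p_death = (0.5 if old else 0.2) - (0.10 if treated else 0.0)
    died = random.random() < p_death
    return old, treated, died

patients = [simulate_patient() for _ in range(100_000)]

def death_rate(rows):
    return sum(died for _, _, died in rows) / len(rows)

def risk_difference(rows):
    """Death rate among treated minus death rate among untreated."""
    treated = [r for r in rows if r[1]]
    untreated = [r for r in rows if not r[1]]
    return death_rate(treated) - death_rate(untreated)

# Naive comparison: ignores age, so the medicine looks harmful.
naive = risk_difference(patients)

# Control for age: compare within each age group, then average the
# groups weighted by their share of the population.
adjusted = 0.0
for age_group in (True, False):
    stratum = [r for r in patients if r[0] == age_group]
    adjusted += risk_difference(stratum) * len(stratum) / len(patients)

print(f"naive: {naive:+.3f}, adjusted for age: {adjusted:+.3f}")
```

Under these assumed numbers the naive difference is positive while the adjusted one is close to the simulated effect of -0.10.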
Explain the positivity assumption
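That every treatment level has a nonzero probability of occurring at every covariate value, so that every treatment-covariate combination can show up in the observed dataset: P(A=a|X=x) > 0 for all a and x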
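A rough empirical check for this, sketched in Python. The function name and the toy records are hypothetical. An empty cell in a sample only hints at a violation; it does not prove that the true probability is zero.

```python
from collections import Counter
from itertools import product

def positivity_violations(records, treatments):
    """Return the (x, a) cells with no observations.

    records: list of (x, a) pairs, x = covariate stratum, a = treatment.
    """
    counts = Counter(records)
    strata = {x for x, _ in records}
    return sorted(
        (x, a) for x, a in product(strata, treatments) if counts[(x, a)] == 0
    )

# Hypothetical data: nobody in the "young" stratum received treatment 1.
records = [("old", 1), ("old", 0), ("old", 1), ("young", 0), ("young", 0)]
violations = positivity_violations(records, treatments=(0, 1))
print(violations)  # [('young', 1)]
```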
What is incident user design?
Focusing on “new initiators”, that is, users who just started a treatment. The effect is then easier to interpret than in a case where, e.g., someone did yoga 70% of the time over the last two years.
What is a great benefit of active comparator design?
You reduce the amount of confounding. If you, for example, compare a group that started doing yoga with a group that started doing zumba, you can expect the two groups to be relatively similar. However, you then do not answer how well yoga works compared to doing nothing; you only compare yoga with zumba.
This is almost like in DFT physics where you use relative energies because you have no clue what’s going on.
Define confounding variables.
Confounding variables are variables that affect both the treatment (A) and the outcome (Y).
What are chains, forks and colliders (in paths terminology)?
Chain: A -> G -> B
Fork: A <- G -> B
Collider: A -> G <- B (G is the collider here)
Explain how you can block the following path by conditioning (i.e. controlling, or “fixing the variable” in more intuitive terms):
A -> G -> B, where
A is the weather,
G is the “iciness” of the sidewalks,
B is whether people fall on the sidewalk
With no conditioning, there will obviously be an association between A and B: people will fall more on the sidewalks in the winter.
However, if you condition on G, you condition on the reason why people fall, and hence you block the association between A and B.
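The weather example can be sketched as a small Python simulation. The probabilities are invented for illustration; what matters is that B depends on A only through G.

```python
import random

random.seed(1)

def simulate_day():
    cold = random.random() < 0.5                      # A: weather
    icy = random.random() < (0.7 if cold else 0.1)    # G: depends only on A
    fall = random.random() < (0.3 if icy else 0.05)   # B: depends only on G
    return cold, icy, fall

days = [simulate_day() for _ in range(200_000)]

def fall_rate(rows):
    return sum(fall for _, _, fall in rows) / len(rows)

def fall_gap(rows):
    """Fall rate on cold days minus fall rate on warm days."""
    cold_days = [r for r in rows if r[0]]
    warm_days = [r for r in rows if not r[0]]
    return fall_rate(cold_days) - fall_rate(warm_days)

# No conditioning: weather and falls are clearly associated.
marginal_gap = fall_gap(days)

# Conditioning on G: within icy days (or within clear days) the
# weather no longer tells you anything about falls.
gap_given_icy = fall_gap([r for r in days if r[1]])
gap_given_clear = fall_gap([r for r in days if not r[1]])

print(f"{marginal_gap:.3f} {gap_given_icy:.3f} {gap_given_clear:.3f}")
```

With these assumed numbers the marginal gap is about 0.15, while both conditional gaps are close to zero.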
Explain what happens if you condition on G in the following DAG:
A -> G <- B
Without conditioning, the collider G blocks the path, so A and B are independent. When you condition on G, you open the path and induce an association between A and B. For example, let A and B be light switches, each set by an independent coin flip, and let G be a light that shines only if both A and B are on. Once you control for G, A and B become correlated: if the light is off and A is on, then B must be off.
One lesson here is that you have to be careful with what you control for.
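The light switch example can be simulated in a few lines of Python (a sketch; the sample size and variable names are arbitrary choices):

```python
import random

random.seed(2)

rows = []
for _ in range(100_000):
    a = random.random() < 0.5   # switch A: fair coin flip
    b = random.random() < 0.5   # switch B: independent fair coin flip
    g = a and b                 # light G shines only if both are on
    rows.append((a, b, g))

def p_b_on(rows, a_value):
    """Share of rows with B on, among rows where A == a_value."""
    subset = [r for r in rows if r[0] == a_value]
    return sum(b for _, b, _ in subset) / len(subset)

# Without conditioning: knowing A tells you nothing about B.
gap_all = p_b_on(rows, True) - p_b_on(rows, False)

# Conditioning on the collider (light is off): if A is on, B must be off.
dark = [r for r in rows if not r[2]]
gap_dark = p_b_on(dark, True) - p_b_on(dark, False)

print(f"gap without conditioning: {gap_all:+.3f}")
print(f"gap given light off:      {gap_dark:+.3f}")
```

The first gap is near zero; the second is near -0.5, because among dark rooms B is never on when A is on.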
Define d-separation between nodes A and B.
A and B are d-separated by a set of nodes C if conditioning on C blocks every path between A and B.
Define a backdoor path
Paths between treatment (A) and outcome (Y) that start with an arrow going into A, e.g. A <- X -> Y.
Here, it is important to control for X.
What are the two parts of the backdoor path criterion?
- All backdoor paths from A to Y are blocked
- No descendants of the treatment are controlled for (i.e. you do not open up new paths by controlling).
The set of variables that satisfies these conditions is not necessarily unique, i.e. several different sets can work.
What is the disjunctive cause criterion?
Control for all (observed) variables that cause the treatment (A), the outcome (Y), or both.
Explain matching.
One tries to make observational data look like a randomized study.
Let’s say age is the only covariate. Many more old than young people get the treatment. By matching, you discard a lot of the data to make sure there is a 50-50 split of A=1 and A=0 within each age group.
After this, outcome analysis becomes very simple; it feels a lot like a randomized trial.
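A minimal sketch of this idea in Python: 1:1 exact matching on one categorical covariate. The function name, the tuple layout and the toy counts are assumptions for illustration; real matching usually involves more covariates and some distance measure.

```python
import random
from collections import defaultdict

def exact_match(units, seed=0):
    """1:1 exact matching on a single categorical covariate.

    units: list of (age_group, treated) tuples with treated in {0, 1}.
    Returns a subset with equally many treated and untreated units
    in every age group; the surplus is discarded at random.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(lambda: {0: [], 1: []})
    for unit in units:
        age_group, treated = unit
        by_stratum[age_group][treated].append(unit)

    matched = []
    for arms in by_stratum.values():
        k = min(len(arms[0]), len(arms[1]))
        matched += rng.sample(arms[0], k) + rng.sample(arms[1], k)
    return matched

# Hypothetical data: the old are mostly treated, the young mostly untreated.
units = (
    [("old", 1)] * 8 + [("old", 0)] * 2
    + [("young", 1)] * 3 + [("young", 0)] * 7
)
matched = exact_match(units)

for group in ("old", "young"):
    n_treated = sum(1 for g, a in matched if g == group and a == 1)
    n_control = sum(1 for g, a in matched if g == group and a == 0)
    print(group, n_treated, n_control)
```

Of the 20 units, 10 survive: 2 treated and 2 untreated old, 3 treated and 3 untreated young. An outcome column could ride along in each tuple and be compared between arms afterwards.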
Explain stochastic balance and fine balance in terms of matching.
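Stochastic balance means that matching on the covariates makes the covariate distributions of the treated group and the matched control group similar on average, much as randomization would.
Fine balance means that the two groups have the same marginal distribution of the covariates (e.g. the same averages), even when the individual matched pairs are not close matches.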