Stats II Flashcards
Pre Mid Term
Pearl - Intro
Language of queries
The description of an intervention or treatment, written symbolically.
eg. Effect of a drug (D) on lifespan (L)
P(L | do(D))
Pearl - Intro
What is a counterfactual?
“What would have happened if we had acted differently (not taken the treatment)?”
Counterfactuals let us reason in retrospect: we can reflect on past actions and envision alternative scenarios.
Session 2
Individual Treatment Effect
The difference between the outcome a single unit i would have under treatment (Y1i) and the outcome the same unit would have under control (Y0i).
ITE = Y1i − Y0i
Session 2
Fundamental Problem of Causal Inference
“It is impossible to observe two states of the world for the same person at the same time (e.g. vaccine effectiveness).”
Session 2
Average Treatment Effect
The average of the individual-level treatment effects.
ATE = Avg[Y1i − Y0i]
The ATE is our estimand: the quantity we want to estimate.
Session 2
True ATE
Avg[ITE1, ITE2, …, ITEn] = (ITE1 + ITE2 + … + ITEn) / n
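A minimal sketch of the ITE and true ATE definitions, using a made-up table of potential outcomes. The numbers are hypothetical, and in real data only one of Y1i or Y0i is ever observed for each unit.

```python
# Hypothetical potential outcomes for four units: (Y1i, Y0i).
# In practice we never observe both values for the same unit.
potential_outcomes = [(7, 3), (5, 1), (6, 4), (4, 2)]

# Individual treatment effect for each unit: ITE_i = Y1i - Y0i
ite = [y1 - y0 for y1, y0 in potential_outcomes]

# True ATE: the average of the individual effects
true_ate = sum(ite) / len(ite)

print(ite)       # [4, 4, 2, 2]
print(true_ate)  # 3.0
```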
Session 2
Difference in Means
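**difference in means = ATE + selection bias** (exact under constant treatment effects)
naive difference in means = E[Yi | Di = 1] − E[Yi | Di = 0]
selection bias = E[Y0i | Di = 1] − E[Y0i | Di = 0]
We can interpret the difference in means as the ATE when there is no selection bias.
Experiments ensure that the selection bias term is 0 in expectation.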
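The decomposition can be checked by hand on a toy table. This is a sketch with invented numbers: the treatment effect is a constant 2 for every unit, but the treated units would have had higher outcomes even without treatment, which is the selection bias.

```python
# Hypothetical units: (Y1i, Y0i, Di). The treatment effect is a constant 2,
# but treated units (Di = 1) have higher untreated outcomes than control units.
units = [(12, 10, 1), (11, 9, 1), (6, 4, 0), (5, 3, 0)]

def mean(values):
    return sum(values) / len(values)

treated = [u for u in units if u[2] == 1]
control = [u for u in units if u[2] == 0]

# What we can actually observe: Y1i for treated units, Y0i for control units
naive_diff = mean([y1 for y1, _, _ in treated]) - mean([y0 for _, y0, _ in control])

# Quantities that need both potential outcomes (unobservable in real data)
ate = mean([y1 - y0 for y1, y0, _ in units])
selection_bias = mean([y0 for _, y0, _ in treated]) - mean([y0 for _, y0, _ in control])

print(naive_diff, ate, selection_bias)  # 8.0 2.0 6.0
```

Here the naive difference (8) overstates the true effect (2) by exactly the selection bias (6).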
Session 2
Randomization
Random assignment guarantees that:
**E[Y0i | Di = 1] = E[Y0i | Di = 0]**
Which means the selection bias term equals zero.
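A small simulation can illustrate why random assignment removes selection bias. This is a sketch under assumed values (normally distributed untreated outcomes, a 50/50 coin flip for assignment); because assignment ignores Y0i, the two groups have nearly the same average Y0i.

```python
import random

rng = random.Random(0)  # fixed seed so the run is reproducible

n = 100_000
y0 = [rng.gauss(50, 10) for _ in range(n)]   # untreated potential outcomes (assumed)
d = [rng.random() < 0.5 for _ in range(n)]   # coin-flip assignment, independent of y0

def mean(values):
    return sum(values) / len(values)

y0_treated = [y for y, di in zip(y0, d) if di]
y0_control = [y for y, di in zip(y0, d) if not di]

# E[Y0i | Di = 1] - E[Y0i | Di = 0]: close to 0, though not exactly 0 in a finite sample
selection_bias = mean(y0_treated) - mean(y0_control)
print(round(selection_bias, 3))
```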
Session 2
Law of Large Numbers - LLN
The sample average can be brought as close as we want to the population average by increasing the sample size.
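A quick way to see the LLN at work, sketched with a simulated fair coin (coded 0/1, so the population average is 0.5). The sample sizes and seed are arbitrary choices.

```python
import random

rng = random.Random(1)   # fixed seed for reproducibility
population_mean = 0.5    # mean of a fair coin coded 0/1

def sample_mean(n):
    """Average of n simulated coin flips."""
    return sum(rng.random() < 0.5 for _ in range(n)) / n

# Absolute gap between sample average and population average at each sample size
errors = {n: abs(sample_mean(n) - population_mean) for n in (10, 1_000, 100_000)}

for n, err in errors.items():
    print(n, round(err, 4))
```

The gap tends to shrink as n grows, although any single small sample can land close to 0.5 by luck.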
Session 2
Constant Treatment Effects
When the treatment affects every unit in the same way, so the ITE is the same for all units.
In a hypothetical example, an ITE of 6 in the treatment group and an ITE of 4 in the control group means the effects differ. If both were 4, there would be a constant treatment effect.
Session 2
Randomization considerations
- Feasibility: Not everything that counts can be randomized, and not everything that can be randomized counts.
- Ethics: Concerning the denial of services we know to be good, the effect of bad treatment, the fair allocation of incentives, the repercussions of using human subjects.
Session 3
DAG
Directed Acyclic Graphs (DAGs) represent the beliefs, relationships and assumptions of a causal model.
Session 3
Confounder
A variable whose effect gets mixed into the association along our causal path. Occurs when treatment and outcome share a common cause that is not controlled for.
May be observable or unobservable.
Can lead to omitted variable bias (OVB) and selection bias.
https://jamanetwork.com/journals/jama/fullarticle/2790247
Session 3
Backdoor Paths
Backdoor paths are non-causal, open paths that create associations even in the absence of a causal effect.
D <- X -> Y
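To make the backdoor path D <- X -> Y concrete, here is a simulation sketch in which D has no effect on Y at all. All probabilities and effect sizes are invented for illustration. The naive comparison still shows a large difference, which disappears once X is held fixed.

```python
import random

rng = random.Random(2)  # fixed seed for reproducibility
rows = []
for _ in range(100_000):
    x = rng.random() < 0.5                      # confounder X
    d = rng.random() < (0.8 if x else 0.2)      # X raises the chance of treatment
    y = (5.0 if x else 0.0) + rng.gauss(0, 1)   # X raises the outcome; D has no effect
    rows.append((x, d, y))

def mean(values):
    return sum(values) / len(values)

def diff_in_means(data):
    """Mean outcome of treated rows minus mean outcome of control rows."""
    treated = [y for _, d, y in data if d]
    control = [y for _, d, y in data if not d]
    return mean(treated) - mean(control)

naive = diff_in_means(rows)                               # open backdoor path
within_x1 = diff_in_means([r for r in rows if r[0]])      # X fixed at 1
within_x0 = diff_in_means([r for r in rows if not r[0]])  # X fixed at 0

print(round(naive, 2), round(within_x1, 2), round(within_x0, 2))
```

With these assumed values the naive difference is about 3, while both within-X differences are close to 0.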
Session 3
Mediator
A mechanism through which the treatment affects the outcome. The mediated effect must be added to the direct effect if one wants the total effect of the treatment.
D -> X -> Y
Session 3
Collider
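A third variable that is influenced by two other variables on a path, for example by both the treatment and the outcome (a common effect).
A path through a collider is closed unless the collider is controlled for; controlling for it opens the path and induces a statistical association between the variables (collider bias).
D -> O <- A -> Y with O being the collider.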
https://jamanetwork.com/journals/jama/fullarticle/2790247
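Collider bias can be reproduced with the path D -> O <- A -> Y. In this sketch (all values assumed) D has no effect on Y and the two are unrelated overall, yet restricting the sample to O = 1 creates an association between them.

```python
import random

rng = random.Random(3)  # fixed seed for reproducibility
rows = []
for _ in range(100_000):
    d = rng.random() < 0.5                      # treatment, assigned at random
    a = rng.random() < 0.5                      # separate cause of both O and Y
    o = d or a                                  # collider: caused by D and by A
    y = (5.0 if a else 0.0) + rng.gauss(0, 1)   # Y depends on A only, never on D
    rows.append((d, o, y))

def mean(values):
    return sum(values) / len(values)

def diff_in_means(data):
    """Mean outcome of treated rows minus mean outcome of control rows."""
    treated = [y for d, _, y in data if d]
    control = [y for d, _, y in data if not d]
    return mean(treated) - mean(control)

unconditional = diff_in_means(rows)                          # path closed at O
conditional = diff_in_means([r for r in rows if r[1]])       # conditioning on O = 1

print(round(unconditional, 2), round(conditional, 2))
```

Among units with O = 1, an untreated unit must have A = 1, so untreated units have systematically higher Y. With these assumed values the conditional difference is about -2.5 even though the true effect is 0.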
Session 3
Collider Bias
Threatens the internal validity of a study and the accurate estimation of causal relationships.
Session 3
Conditioning
Means holding the confounder (X) fixed at some value.
Also called “adjusting for” or “controlling for”.
Session 3
Backdoor Criterion
- Conditioning on a confounder closes an open backdoor path (eliminates selection bias).
- Conditioning on a collider opens a closed path (and we get collider bias).
- Conditioning on a mediator may or may not be appropriate, depending on which effect we want to measure: it blocks the mediated part of the effect, so to get the total effect we need the direct + mediated treatment effects and should leave the mediator uncontrolled.
Session 3
Overcontrolling
“Sometimes you end up controlling for the thing you are trying to measure” (Pearl on Ezra Klein’s example).
Session 4
Regression
Predicts the value of an outcome variable based on one or more input explanatory variables.
What is our best guess of y given an observed x?
Bivariate regression: yi = α + βxi + ei
Models variable y as a function of one variable x.
Session 4
Loss function
Sum of the squared residuals
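The bivariate regression and its loss function can be sketched together. Assuming ordinary least squares, the α and β that minimize the sum of squared residuals have a closed form: β = Sxy / Sxx and α = mean(y) − β · mean(x). The data points below are made up.

```python
# Made-up data for illustration
x = [1, 2, 3, 4]
y = [2, 4, 5, 8]

def mean(values):
    return sum(values) / len(values)

def ssr(alpha, beta):
    """Loss function: sum of squared residuals for the line alpha + beta * x."""
    return sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))

x_bar, y_bar = mean(x), mean(y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)

beta_hat = s_xy / s_xx                 # OLS slope
alpha_hat = y_bar - beta_hat * x_bar   # OLS intercept

print(round(beta_hat, 3), round(alpha_hat, 3), round(ssr(alpha_hat, beta_hat), 3))
```

Moving either coefficient away from these values, for example `ssr(alpha_hat, beta_hat + 0.1)`, gives a larger loss.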
Session 4
Conditional expectation function
E[Yi | Xi = x] = α + βx
Given a value of x (e.g. years of schooling), it returns the expected value of y (e.g. expected income).
α+βxi + vi + s1 * 1500 + s2 * log(100,000)
- v is associated with the dummies, eg. SAT score
- we can change Pi to compare treatment / control
**On average and holding all else constant, a one-unit change in P (e.g. from 0 to 1) is associated with a β-unit change in log(y).**
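To turn a change in log(y) into a change in y itself: a β-unit change in log(y) multiplies y by exp(β), which for small β is roughly a 100·β percent change. A minimal sketch with a hypothetical coefficient of 0.05:

```python
import math

beta = 0.05  # hypothetical coefficient on P in a regression of log(y)

exact_pct = (math.exp(beta) - 1) * 100   # exact percent change in y
approx_pct = beta * 100                  # rule-of-thumb approximation

print(round(exact_pct, 3), round(approx_pct, 3))  # 5.127 5.0
```

The approximation gets worse as β grows, so for large coefficients the exact version is the safer one to report.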
Session 4
Bias-variance tradeoff
The tradeoff between being systematically off but consistent (high bias, low variance) and being right on average but imprecise (low bias, high variance).
Session 4
Regression of conditional means
When we know there is selection bias, the naive difference does not equal the ATE. We have to:
- look at the naive difference within each group of comparable units
- take the mean or weighted average of those within-group differences
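The two steps above can be sketched on a toy table. The group labels, sizes and means are invented, and weighting by group size is one common choice rather than the only one.

```python
# Hypothetical groups of comparable units: (n_units, mean_y_treated, mean_y_control)
groups = {
    "A": (150, 110.0, 100.0),
    "B": (50, 65.0, 60.0),
}

# Step 1: treated-minus-control difference inside each group
within_diffs = {name: t - c for name, (n, t, c) in groups.items()}

# Step 2: combine the within-group differences
total_n = sum(n for n, _, _ in groups.values())
simple_avg = sum(within_diffs.values()) / len(within_diffs)
weighted_avg = sum(n * within_diffs[name] for name, (n, _, _) in groups.items()) / total_n

print(within_diffs)              # {'A': 10.0, 'B': 5.0}
print(simple_avg, weighted_avg)  # 7.5 8.75
```

The weighted average sits closer to group A's difference because group A contains three times as many units.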
Session 4
Overall average effect
Taking log() can bring skewed variables (e.g. income, education) closer to a normal distribution.
Yi = α + βxi + vAi + ei
Pearl - Intro
Why did the causal revolution arise?
It became possible, unlike in earlier historical periods, thanks to a vocabulary that lets us capture:
1- causal diagrams (what we know)
2- symbolic language (what we want to know).
Session 4
Regression for causal effects
We choose which controls to include when estimating β by following our DAG.
To control for a confounder, given a treatment D, we include all controls in X and hold them constant:
E[Yi | Di = 1, X] - E[Yi | Di = 0, X]
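One way to see what holding the controls constant does in a regression, sketched on made-up data where Y = 1 + 2D + 3X exactly, so the coefficient on D should be 2. Instead of a full multiple-regression solver, this uses partialling out (the Frisch-Waugh-Lovell approach), which gives the same coefficient on D as regressing Y on D and X together. The data and helper names are illustrative assumptions.

```python
def mean(values):
    return sum(values) / len(values)

def slope(x, y):
    """Bivariate OLS slope of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx

def residuals(x, y):
    """Residuals from the bivariate OLS regression of y on x."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Made-up data: X is a confounder that is more common among the treated.
d = [1, 1, 1, 0, 0, 0]
x = [1, 1, 0, 1, 0, 0]
y = [1 + 2 * di + 3 * xi for di, xi in zip(d, x)]  # coefficient on D is 2 by construction

naive_beta = slope(d, y)  # ignores X, so it absorbs part of X's effect

# Remove the part of D and of Y explained by X, then regress residual on residual.
adjusted_beta = slope(residuals(x, d), residuals(x, y))

print(round(naive_beta, 3), round(adjusted_beta, 3))  # 3.0 2.0
```

The naive slope (3) is inflated by the open backdoor path through X, while the adjusted slope recovers the 2 built into the data.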