Observational Causal Design Flashcards
What is an Observational Causal Design?
An observational causal design studies conditions (treatments) that the world assigns naturally, rather than the researcher, and analyzes their effects. Because assignment is not controlled, it cannot definitively prove cause and effect.
What is Process Tracing?
Process tracing is a qualitative method used to understand how a treatment (X) might cause an outcome (Y) within a single case or across multiple cases.
It relies on a pre-existing theoretical model that outlines the potential causal pathway from the treatment to the outcome.
What techniques are used to infer conclusions from process tracing?
- Hoop test: Evidence the causal claim must pass to stay viable, such as observing a mediator (M). Passing is a positive sign but doesn’t definitively prove X causes Y; failing counts strongly against it. (Like a hoop the hypothesis has to jump through)
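- Smoking gun test: Observing evidence that would be very unlikely unless the causal pathway operated, such as a moderator (W) that influences both M and Y, provides stronger evidence for the pathway. Not observing it does not rule the pathway out. (Like a puff of smoke directly linking a gun to the shooter)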
- DAG (Directed Acyclic Graph): Used to illustrate the causal model and specify restrictions on causal relations.
What is Multivariate Regression (or selection on observables)?
Multivariate regression is a technique used in observational causal design that addresses selection bias by including observed confounding variables as controls in the regression model.
What does the validity of your causal claims depend on?
The validity of your causal claims rests on how accurately your model captures the backdoor pathways between treatment and outcome.
Why do you adjust for a confounder?
Adjusting for a confounder (X) in a model with cause (D) and outcome (Y) removes confounding bias (YES).
Why do you adjust for a moderator?
Adjusting for a moderator (X) does not introduce bias and may improve estimate precision (YES).
Should you adjust for a collider?
Adjusting for a collider (X), a variable affected by both D and Y, introduces collider bias (NO).
Should you adjust for a mediator?
Controlling for a mediator (X) estimates the “direct effect” of D on Y, while not controlling gives the “total effect”, so it depends on which effect you want (MAYBE).
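The confounder, collider, and mediator rules above can be checked with a small simulation. This is a minimal sketch, not a definitive implementation: the data-generating processes, the coefficients, and the helper names (`cov`, `slope`, `slope_adjusted`) are all assumptions made for illustration. The direct effect of D on Y is set to 1.0 in every scenario.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)

def slope(d, y):
    """Coefficient on D when regressing Y on D alone."""
    return cov(d, y) / cov(d, d)

def slope_adjusted(d, x, y):
    """Coefficient on D when regressing Y on D and X (two-regressor OLS closed form)."""
    num = cov(d, y) * cov(x, x) - cov(x, y) * cov(d, x)
    den = cov(d, d) * cov(x, x) - cov(d, x) ** 2
    return num / den

rng = random.Random(0)  # fixed seed so the run is reproducible
n = 20000

def noise():
    return rng.gauss(0, 1)

# Confounder: X -> D and X -> Y. Adjusting for X removes the bias.
x = [noise() for _ in range(n)]
d = [xi + noise() for xi in x]
y = [1.0 * di + 2.0 * xi + noise() for di, xi in zip(d, x)]
confounder_naive = slope(d, y)              # biased away from 1.0
confounder_adj = slope_adjusted(d, x, y)    # close to 1.0

# Collider: D -> X <- Y. Adjusting for X introduces bias.
d = [noise() for _ in range(n)]
y = [1.0 * di + noise() for di in d]
x = [di + yi + noise() for di, yi in zip(d, y)]
collider_naive = slope(d, y)                # close to 1.0
collider_adj = slope_adjusted(d, x, y)      # biased away from 1.0

# Mediator: D -> X -> Y plus a direct path D -> Y.
d = [noise() for _ in range(n)]
x = [di + noise() for di in d]
y = [1.0 * di + 2.0 * xi + noise() for di, xi in zip(d, x)]
mediator_total = slope(d, y)                # total effect (direct + via X)
mediator_direct = slope_adjusted(d, x, y)   # direct effect, close to 1.0

print("confounder:", round(confounder_naive, 2), "->", round(confounder_adj, 2))
print("collider:  ", round(collider_naive, 2), "->", round(collider_adj, 2))
print("mediator:  ", round(mediator_total, 2), "->", round(mediator_direct, 2))
```

Each printed pair shows the unadjusted estimate followed by the adjusted one, so the direction of the change illustrates the YES, NO, and MAYBE answers.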
What can be used to adjust for a confounder if it biases the treatment-outcome relationship?
Selection on Observables Matching: This creates comparable “treatment” and “control” groups based on the confounder (X). It balances the covariate distributions of these groups to reduce selection bias.
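One simple form of matching is exact matching on a discrete confounder: compare treated and control units only within the same value of X, then average those within-group differences. The sketch below assumes a hypothetical three-level confounder, made-up treatment probabilities, and a true treatment effect of 2.0; none of these come from the cards.

```python
import random
from collections import defaultdict

def mean(values):
    return sum(values) / len(values)

rng = random.Random(1)  # fixed seed for reproducibility
units = []
for _ in range(20000):
    x = rng.choice([0, 1, 2])                 # observed confounder (hypothetical)
    p_treat = 0.2 + 0.3 * x                   # higher X makes treatment more likely
    d = 1 if rng.random() < p_treat else 0
    y = 2.0 * d + 3.0 * x + rng.gauss(0, 1)   # true effect of D is 2.0
    units.append((x, d, y))

# Naive comparison: treated vs. control, ignoring X.
naive = mean([y for _, d, y in units if d == 1]) - mean([y for _, d, y in units if d == 0])

# Exact matching: group outcomes by X and treatment status.
strata = defaultdict(lambda: {0: [], 1: []})
for x, d, y in units:
    strata[x][d].append(y)

# Weight each within-stratum difference by its number of treated units.
n_treated = sum(len(s[1]) for s in strata.values())
matched = sum(len(s[1]) * (mean(s[1]) - mean(s[0])) for s in strata.values()) / n_treated

print("naive difference:  ", round(naive, 2))
print("matched difference:", round(matched, 2))
```

The naive difference mixes the treatment effect with the effect of X, while the matched difference should land near the assumed true effect.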
What is Difference in Differences (DiD)?
Difference-in-differences methods compare changes over time between a group that receives a treatment and a group that does not. Differencing within units subtracts out time-invariant unit-specific characteristics, and differencing across groups subtracts out trends affecting all units, which isolates the treatment effect.
What steps are there in the DiD method?
- Take “pre-treatment” and “post-treatment” measurements of the outcome variable (Y) for both the treated unit and a comparator unit.
- Calculate the difference in the outcome variable for each unit between pre- and post-treatment periods.
- Subtract the comparator unit’s change from the treated unit’s change. This difference in differences is the estimated treatment effect.
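The steps above reduce to simple arithmetic. Here is a minimal sketch with made-up numbers; the function name `did_estimate` and all four outcome values are hypothetical.

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences from four pre/post outcome measurements."""
    change_treated = treated_post - treated_pre   # change in the treated unit
    change_control = control_post - control_pre   # change in the comparator unit
    return change_treated - change_control        # treated change net of the shared trend

# Hypothetical outcome measurements.
effect = did_estimate(treated_pre=10.0, treated_post=18.0,
                      control_pre=8.0, control_post=11.0)
print(effect)  # (18 - 10) - (11 - 8) = 5.0
```

The two units start at different levels (10 vs. 8), which is fine: DiD only requires that their trends would have matched without treatment.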
What assumptions does DiD make?
- No Time-Variant Unobservables: No unit-specific factors changing over time, besides the treatment, influence the outcome.
- No Heterogeneous Temporal Effects: The impact of time is the same for both the treated and comparator units.
- Parallel Trends: In the absence of treatment, both groups would have followed the same outcome trend. (This concerns an unobservable counterfactual, so it cannot be tested directly.)
What is Propensity Score Matching?
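Propensity score matching pairs treated and control units on their estimated probability of receiving treatment given the observed confounders, rather than on each confounder separately. It is an alternative to matching directly on covariates when there are multiple confounders. However, it doesn’t address unobserved confounders.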
What are some limitations to DiD?
The parallel trends assumption can be violated, which biases the estimate. Diagnostics examine pre-treatment trends to assess how plausible the assumption is.
What is the Instrumental Variables Technique?
Instrumental Variables (IV) is a technique for estimating causal effects when confounding cannot be fully controlled or the assumptions of other methods are unlikely to hold. It works in two stages and can address omitted variable bias, measurement error, and simultaneity.
What do IVs address?
IVs address confounding concerns, especially when controlling for all confounding variables is difficult or the assumptions of other designs (like parallel trends) are unlikely to hold.
What are Instrumental Variables and what is their notation?
Z: Variables that influence the treatment variable (D) but do not affect the outcome variable (Y) except through D (“as-if randomly assigned”). Valid instruments:
- Are not correlated with unobserved confounders (exogeneity).
- Have a non-zero effect on the treatment (relevance).
What is the advantage to the IV method?
IVs offer a way to estimate causal effects without having to observe and adjust for every confounder of the D-Y relationship, provided the instrument itself is valid.
IV is a two-stage process. Outline the stages:
- Stage 1 (effect of Z on D): Regress D on Z to estimate the exogenous part of D that is driven by Z. (Only that part of D is kept, so some information about D is lost.)
- Stage 2 (effect of D on Y): Use the estimated D from stage 1 as the independent variable in a separate regression on Y to estimate the causal effect of D on Y.
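The two stages can be written out by hand for a single instrument. This is a sketch under stated assumptions: the simulated data, the unobserved confounder `u`, the coefficients, and the helper `ols` are all hypothetical, and the true effect of D on Y is set to 1.5.

```python
import random

def mean(values):
    return sum(values) / len(values)

def ols(x, y):
    """Return (intercept, slope) from a simple regression of y on x."""
    mx, my = mean(x), mean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

rng = random.Random(2)  # fixed seed for reproducibility
n = 20000
z = [rng.gauss(0, 1) for _ in range(n)]                                # instrument
u = [rng.gauss(0, 1) for _ in range(n)]                                # unobserved confounder
d = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]                # treatment
y = [1.5 * di + 2.0 * ui + rng.gauss(0, 1) for di, ui in zip(d, u)]    # outcome

# Naive regression of Y on D is biased because U affects both.
_, naive = ols(d, y)

# Stage 1: regress D on Z and keep the fitted values (the part of D driven by Z).
a1, b1 = ols(z, d)
d_hat = [a1 + b1 * zi for zi in z]

# Stage 2: regress Y on the fitted D.
_, iv_estimate = ols(d_hat, y)

print("naive OLS:", round(naive, 2))
print("first-stage slope:", round(b1, 2))
print("2SLS estimate:", round(iv_estimate, 2))
```

Running the second stage by hand gives the right point estimate but not valid standard errors, so in practice a dedicated IV routine is the safer choice for inference.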
What are some conditions for Validity for the IV method?
- Relevance: Z must affect D.
- Exogeneity: There must be no confounding between Z and Y.
- Exclusion Restriction / No Direct Effect: Z should only affect Y through D.
- Monotonicity: The effect of Z on D should be 0 or positive for all units, which rules out defiers.
What is LATE in relation to IV?
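When treatment effects vary across individuals, IV estimates the Local Average Treatment Effect (LATE): the average effect among compliers, the units whose treatment status is changed by the instrument.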
What are the four populations (compliance types) when treatment effects are heterogeneous?
- Compliers: Treatment is affected by the instrument in the “right” direction (e.g., influenced by a lottery to participate in a program).
- Defiers: Treatment is affected by the instrument in the “wrong” direction (e.g., avoiding a program because a lottery prompted participation).
- Never-Takers: Never take treatment, regardless of the instrument.
- Always-Takers: Always take treatment, regardless of the instrument.
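To see why IV recovers an effect for compliers only, the sketch below simulates three of the four types (no defiers, so monotonicity holds) and computes the Wald ratio, which is the IV estimate for a binary instrument. The type shares and effect sizes are assumptions chosen for illustration.

```python
import random

def mean(values):
    return sum(values) / len(values)

# Hypothetical treatment effects by compliance type.
EFFECTS = {"complier": 2.0, "always": 5.0, "never": 0.5}

rng = random.Random(3)  # fixed seed for reproducibility
rows = []
for _ in range(50000):
    r = rng.random()
    kind = "complier" if r < 0.5 else ("always" if r < 0.7 else "never")
    z = rng.randint(0, 1)                       # randomly assigned instrument (e.g., a lottery)
    if kind == "complier":
        d = z                                   # takes treatment only when encouraged
    elif kind == "always":
        d = 1                                   # takes treatment regardless of Z
    else:
        d = 0                                   # never takes treatment
    y = EFFECTS[kind] * d + rng.gauss(0, 1)
    rows.append((z, d, y))

# Effect of Z on D (share of compliers) and effect of Z on Y (reduced form).
first_stage = mean([d for z, d, _ in rows if z == 1]) - mean([d for z, d, _ in rows if z == 0])
reduced_form = mean([y for z, _, y in rows if z == 1]) - mean([y for z, _, y in rows if z == 0])

# Wald estimator: reduced form divided by first stage.
late = reduced_form / first_stage

print("share of compliers (first stage):", round(first_stage, 2))
print("LATE:", round(late, 2))
```

The estimate should sit near the compliers' effect rather than the always-takers' or never-takers' effects, because only compliers change their treatment status when Z changes.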
What is RDD?
Regression Discontinuity Design is a quasi-experimental design that leverages a sharp cutoff point in a variable (running variable) to create comparable treatment and control groups.
This helps overcome selection bias, a common issue in observational studies.
What characteristic defines treatment assignment in an RDD?
In RDD, treatment is assigned based on whether a unit’s characteristic (running variable) crosses a specific threshold value.
What is the main limitation of RDD?
As we get closer to the cutoff point, there’s less data available, making analysis challenging. Limited generalizability can also be a concern.
What is the LATE estimated by RDD?
RDD estimates the average treatment effect for units at or very near the cutoff, where treated and untreated units are most comparable.
What is the role of running variable in RDD?
The running variable is the characteristic that determines treatment assignment based on the cutoff point.
What are some considerations for RDD designs?
- Bandwidth Selection: Choosing the appropriate window size around the cutoff point to balance bias and variance.
- Covariate Balance: Checking for imbalances in observable characteristics between the two groups.
- Diagnosis: Assessing potential threats to the validity of the RDD design.
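A sharp RDD estimate can be sketched as two local linear fits, one on each side of the cutoff within a chosen bandwidth, compared at the cutoff itself. Everything specific here is an assumption for illustration: the simulated data, the cutoff of 0, the bandwidth of 0.25, the true jump of 3.0, and the helper names.

```python
import random

def mean(values):
    return sum(values) / len(values)

def ols(x, y):
    """Return (intercept, slope) from a simple regression of y on x."""
    mx, my = mean(x), mean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

rng = random.Random(4)  # fixed seed for reproducibility
cutoff, bandwidth = 0.0, 0.25
data = []
for _ in range(20000):
    r = rng.uniform(-1, 1)                               # running variable
    d = 1 if r >= cutoff else 0                          # treatment assigned by the cutoff
    y = 1.0 + 2.0 * r + 3.0 * d + rng.gauss(0, 0.5)      # true jump at the cutoff is 3.0
    data.append((r, d, y))

def fit_at_cutoff(points):
    """Fit a line to (running variable, outcome) pairs and predict at the cutoff."""
    x = [r - cutoff for r, _ in points]                  # center so the intercept is the value at the cutoff
    y = [outcome for _, outcome in points]
    intercept, _ = ols(x, y)
    return intercept

# Keep only observations inside the bandwidth on each side.
below = [(r, y) for r, d, y in data if cutoff - bandwidth <= r < cutoff]
above = [(r, y) for r, d, y in data if cutoff <= r <= cutoff + bandwidth]

rdd_estimate = fit_at_cutoff(above) - fit_at_cutoff(below)
print("observations used:", len(below) + len(above))
print("RDD estimate:", round(rdd_estimate, 2))
```

Re-running with a smaller `bandwidth` uses fewer observations, which shows the trade-off between bias and variance that bandwidth selection has to manage.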