10. Differences-in-Differences Flashcards
What is the differences-in-differences (DiD) design?
The Differences-in-Differences (DiD) design is a statistical technique used to estimate the causal effect of a treatment or intervention by comparing changes in outcomes over time between a treated group and a control group. It leverages natural experiments, where the treatment affects only some units over time, to create a credible comparison.
How it works: DiD combines before-and-after comparisons within groups and between-group comparisons.
1. First difference: For each group, measure the difference in the outcome from before to after the treatment period.
2. Second difference: Subtract the change in outcomes for the control group from the change in outcomes for the treated group.
What are fixed effects and time trends, and what is their relevance in the context of DiD?
Fixed effects are time-invariant characteristics of the treatment and control groups that could influence the outcome, i.e., constant characteristics of each group that do not change over time.
* Fixed effects are removed by the first difference (within-group before-and-after comparison), ensuring that differences in outcomes are not driven by these unchanging group-specific factors.
Time trends are time-varying factors that affect all groups in the same way over time (e.g., economic growth).
* Time trends are removed by the second difference (between-group comparison of changes), because they impact the treated and control groups similarly, leaving only the treatment effect.
By accounting for both fixed effects and time trends, DiD ensures that the observed differences in outcomes are driven by the treatment and not confounded by time-invariant fixed effects or common time trends.
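The cancellation described above can be traced with a small arithmetic sketch. All numbers are hypothetical, and the additive data-generating process (group fixed effect + common time trend + treatment effect) is an assumption made only for illustration.

```python
# Hypothetical data-generating process: outcome = group fixed effect
# + common time trend + treatment effect (treated group, post period only).
fixed_effect = {"treated": 10.0, "control": 4.0}   # time-invariant, differs by group
time_trend = {"pre": 0.0, "post": 3.0}             # identical for both groups
true_effect = 2.0

def outcome(group, period):
    treated_now = group == "treated" and period == "post"
    return fixed_effect[group] + time_trend[period] + (true_effect if treated_now else 0.0)

# First difference: the within-group change removes each group's fixed effect.
change_treated = outcome("treated", "post") - outcome("treated", "pre")  # trend + effect
change_control = outcome("control", "post") - outcome("control", "pre")  # trend only

# Second difference: subtracting the control change removes the common trend.
did = change_treated - change_control

print(change_treated, change_control, did)  # 5.0 3.0 2.0
```

The fixed effects (10 and 4) never appear in the result, and the common trend (3) drops out in the second step, leaving the assumed effect of 2.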
What are the main assumptions for DiD?
- Parallel trends assumption: In the absence of treatment, the treated and control groups would have followed the same trajectory in outcomes over time. This ensures that observed post-treatment differences are attributable to the treatment.
- No anticipation effect: The treatment does not affect outcomes before it is implemented. Units should not change their behavior in anticipation of receiving treatment.
- Stable unit treatment value assumption (SUTVA): The treatment status of one group does not affect the outcomes of the other group. Each unit’s outcome depends only on its own treatment status.
- Common time trends: External time-varying factors (e.g., economic or seasonal effects) impact the treated and control groups equally, ensuring no bias from shared trends.
- Consistency in group assignment: Units remain consistently assigned to the treatment or control group throughout the study period (no switching between groups).
What is the simple 2x2 DiD?
The simple 2x2 DiD is a basic version of the method that involves two groups (treated and control) and two time periods (pre- and post-treatment). It calculates the treatment effect as the difference between the before-and-after change in the treated group and the before-and-after change in the control group. This approach controls for fixed effects and common time trends, isolating the causal effect of the treatment.
* Treatment group (k): The group that receives the treatment.
* Control group (u): The group that does not receive the treatment.
* Pre-treatment-period for both groups (k and u)
* Post-treatment-period for both groups (k and u)
How is the treatment effect (mathematically) estimated in a Differences-in-Differences (DiD) design?
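The DiD estimate ($\delta^{2\times 2}$) is calculated using the following formula: $\delta^{2\times 2} = (\bar{y}_k^{\text{post}} - \bar{y}_k^{\text{pre}}) - (\bar{y}_u^{\text{post}} - \bar{y}_u^{\text{pre}})$
* $\bar{y}_k^{\text{post}}$ and $\bar{y}_k^{\text{pre}}$: sample means for the treatment group (after and before treatment).
* $\bar{y}_u^{\text{post}}$ and $\bar{y}_u^{\text{pre}}$: sample means for the control group (after and before treatment).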
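As a minimal sketch, the formula above can be computed directly from the four sample means. The unit-level values and the helper name `did_2x2` are hypothetical; only the group labels k and u come from the notes.

```python
from statistics import mean

# Hypothetical unit-level outcomes, keyed by (group, period).
outcomes = {
    ("k", "pre"):  [10.0, 12.0, 11.0],
    ("k", "post"): [16.0, 18.0, 17.0],
    ("u", "pre"):  [5.0, 7.0, 6.0],
    ("u", "post"): [8.0, 10.0, 9.0],
}

def did_2x2(data):
    """Return (post - pre change for k) minus (post - pre change for u), using sample means."""
    ybar = {cell: mean(values) for cell, values in data.items()}
    change_k = ybar[("k", "post")] - ybar[("k", "pre")]
    change_u = ybar[("u", "post")] - ybar[("u", "pre")]
    return change_k - change_u

delta = did_2x2(outcomes)
print(delta)  # 3.0
```

Here the treated group's mean rises by 6 and the control group's by 3, so the estimate is 3.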
DiD typically estimates the ATT (Average Treatment Effect on the Treated) because it isolates the effect of the treatment on the treated group. The control group serves to estimate the counterfactual (what would have happened to the treated group without treatment).
DiD recovers the ATE (Average Treatment Effect) only under stronger conditions: either treatment effects are homogeneous, or treatment is randomly assigned across the entire population so that the treated group is representative of it. In both cases the ATT equals the ATE.
How can the DiD estimate be explained using expected outcomes and potential outcomes?
The DiD estimate can be expressed in terms of expected outcomes and potential outcomes:
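$\delta_{DiD} = (E[Y_k \mid \text{Post}] - E[Y_k \mid \text{Pre}]) - (E[Y_u \mid \text{Post}] - E[Y_u \mid \text{Pre}])$
Using potential outcomes and the switching equation, this becomes:
$\delta_{DiD} = (E[Y_k^1 \mid \text{Post}] - E[Y_k^0 \mid \text{Post}]) + (E[Y_k^0 \mid \text{Post}] - E[Y_k^0 \mid \text{Pre}]) - (E[Y_u^0 \mid \text{Post}] - E[Y_u^0 \mid \text{Pre}])$
* $E[Y_k^1 \mid \text{Post}] - E[Y_k^0 \mid \text{Post}]$: The ATT (causal effect of the treatment for the treated group).
* $E[Y_k^0 \mid \text{Post}] - E[Y_k^0 \mid \text{Pre}]$: Change over time for the treatment group if it had not been treated.
* $E[Y_u^0 \mid \text{Post}] - E[Y_u^0 \mid \text{Pre}]$: Change over time for the control group (not treated).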
The difference between the second and third terms is the non-parallel trends bias. If the parallel trends assumption holds, the two terms are equal and cancel out, isolating the ATT.
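The decomposition can be checked numerically. This is only a sketch: the expected values are hypothetical, and the counterfactual mean for the treated group in the post period is never observed in real data; it is set by hand here to make the bias term visible.

```python
def decompose(yk1_post, yk0_post, yk0_pre, yu0_post, yu0_pre):
    """Split the DiD estimand into ATT and non-parallel-trends bias.

    yk0_post is the unobservable counterfactual mean of the treated group.
    """
    att = yk1_post - yk0_post
    bias = (yk0_post - yk0_pre) - (yu0_post - yu0_pre)
    # What the data actually identify: observed change for k minus observed change for u.
    did = (yk1_post - yk0_pre) - (yu0_post - yu0_pre)
    return did, att, bias

# Non-parallel trends: treated counterfactual trend is 5, control trend is 3.
did_a, att_a, bias_a = decompose(20.0, 15.0, 10.0, 9.0, 6.0)
print(did_a, att_a, bias_a)   # 7.0 5.0 2.0

# Parallel trends: both counterfactual trends equal 3, so the bias vanishes.
did_b, att_b, bias_b = decompose(18.0, 13.0, 10.0, 9.0, 6.0)
print(did_b, att_b, bias_b)   # 5.0 5.0 0.0
```

In both cases DiD = ATT + bias; only when the bias term is zero does the DiD estimate equal the ATT.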
What is the problem with reliable inferences in DiD caused by serial correlation?
Inference in DiD refers to making reliable conclusions about the treatment effect, based on statistical estimates. Accurate inference requires reliable standard errors to measure the uncertainty around the estimated treatment effect.
In multi-period DiD designs (using multiple years of data), standard errors can be biased if they fail to account for serial correlation (when the outcome variable is correlated with itself over time). This dependency over time violates the assumption of independent observations across time.
If serial correlation is ignored, standard errors are often underestimated, making the results appear more statistically significant than they actually are. This leads to overrejection of the null hypothesis (falsely concluding that the treatment effect is significant) and creates misleading conclusions about the causal impact of the treatment.
What are common solutions to address serial correlation in DiD studies?
- Clustering standard errors: Standard errors are adjusted at the group level, allowing for correlation within groups over time. Instead of assuming independence across all observations, this approach acknowledges that observations within the same group are related.
- Block bootstrapping: Groups are resampled with replacement, preserving the correlation structure within each group.
- Data aggregation: Collapsing the data into a single pre-treatment and post-treatment period for each group. By aggregating the data, you eliminate the time-series structure and remove serial correlation from the analysis entirely.
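Two of the remedies above, data aggregation and block bootstrapping, can be sketched together with NumPy. The panel is simulated with AR(1) errors, and every parameter (20 groups, 10 periods, rho = 0.8, effect = 1.0) is a hypothetical choice. Resampling separately within the treated and control arms is a simplification made here so that no bootstrap draw ends up with an empty arm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical panel: 20 groups, 10 periods, the first 10 groups are treated from t = 5.
# Errors follow an AR(1) process within each group, so outcomes are serially correlated.
n_groups, n_periods, t0, effect, rho = 20, 10, 5, 1.0, 0.8
treated = np.arange(n_groups) < 10
post = np.arange(n_periods) >= t0
errors = np.zeros((n_groups, n_periods))
errors[:, 0] = rng.normal(size=n_groups)
for t in range(1, n_periods):
    errors[:, t] = rho * errors[:, t - 1] + rng.normal(size=n_groups)
y = errors + effect * np.outer(treated, post)

def collapsed_did(y, treated):
    """Aggregate each group to one pre mean and one post mean, then take the 2x2 DiD."""
    change = y[:, post].mean(axis=1) - y[:, ~post].mean(axis=1)
    return change[treated].mean() - change[~treated].mean()

def block_bootstrap_se(y, treated, n_boot=500):
    """Resample whole groups (all periods together) with replacement."""
    idx_t, idx_c = np.flatnonzero(treated), np.flatnonzero(~treated)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        sample = np.concatenate([rng.choice(idx_t, idx_t.size),
                                 rng.choice(idx_c, idx_c.size)])
        draws[b] = collapsed_did(y[sample], treated[sample])
    return draws.std(ddof=1)

estimate = collapsed_did(y, treated)
se = block_bootstrap_se(y, treated)
print(estimate, se)
```

Because each bootstrap draw keeps a group's full time series intact, the within-group correlation structure is preserved in every resample.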
What is the challenge for testing the parallel trends assumption in DiD?
The main challenge for testing the parallel trends assumption is that it relies on an unobservable counterfactual: what would have happened to the treated group in the absence of treatment after the treatment period. This counterfactual cannot be directly observed, making it impossible to definitively verify that trends would have been parallel post-treatment without the treatment.
Instead, researchers use pre-treatment trends as a proxy to evaluate the plausibility of parallel trends. They test whether the treated and control groups followed similar trends in the pre-treatment period. If pre-treatment differences are statistically insignificant, it provides indirect evidence supporting the parallel trends assumption. However, this approach has limitations:
1. Even if pre-treatment trends are parallel, it does not guarantee that trends would remain parallel post-treatment.
2. If treatment assignment depends on factors that also influence the outcome (treatment assignment is endogenous), the parallel trends assumption can be violated, even if pre-treatment trends appear similar.
What are the different approaches to testing pre-treatment balance in DiD, and how do they work?
Plotting raw data: Visually compare trends in the outcome variable for treated and control groups in the pre-treatment period. If the trends look similar, this supports parallel trends.
* Strengths: Transparent and shows unadjusted differences.
* Weaknesses: Becomes impractical with many treatment groups and only compares treated to never-treated groups, ignoring comparisons like early-treated vs. late-treated groups.
Recentered time paths: Assign random “treatment dates” to control groups and plot their trends relative to these dates alongside trends for treated groups.
* Strengths: Visualizes trends across groups with differential timing.
* Weaknesses: The constructed control group is artificial and not used in the final analysis.
Regression models with leads and lags: A regression model that includes leads (pre-treatment periods) and lags (post-treatment periods) as dummy variables to evaluate pre-treatment balance and post-treatment dynamics.
* Leads: Represent the time periods before treatment begins. These are used to test whether the treated and control groups had similar trends prior to the treatment. If the coefficients on the leads are statistically indistinguishable from zero, this indicates that the two groups followed parallel trends before treatment, which supports the parallel trends assumption.
* Lags: Represent the time periods after treatment occurs. These capture how the treatment effect evolves over time, showing whether the impact is immediate, delayed, or fades over time.
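A leads-and-lags regression can be sketched with NumPy least squares. The panel below is hypothetical and noise-free, so the coefficients come out exactly. Omitting the last pre-treatment period as the reference category is a common convention, but it is an assumption here, as are all variable names.

```python
import numpy as np

# Hypothetical panel: 6 units, 6 periods, treatment starts at t = 3 for treated units.
n_units, n_periods, t0 = 6, 6, 3
treated = np.array([1, 1, 1, 0, 0, 0])
unit_fe = np.array([5.0, 3.0, 4.0, 1.0, 2.0, 0.0])
time_fe = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
dynamic_effect = {3: 1.0, 4: 2.0, 5: 2.5}   # assumed effect by period after treatment

# Leads are periods 0 and 1, lags are periods 3 to 5; period t0 - 1 is the reference.
event_periods = [t for t in range(n_periods) if t != t0 - 1]

rows, y = [], []
for i in range(n_units):
    for t in range(n_periods):
        unit_d = [float(i == j) for j in range(n_units)]             # unit fixed effects
        time_d = [float(t == s) for s in range(1, n_periods)]        # time fixed effects
        event_d = [float(treated[i] == 1 and t == s) for s in event_periods]
        rows.append(unit_d + time_d + event_d)
        y.append(unit_fe[i] + time_fe[t] + treated[i] * dynamic_effect.get(t, 0.0))

coef, *_ = np.linalg.lstsq(np.array(rows), np.array(y), rcond=None)
event_coef = dict(zip(event_periods, coef[-len(event_periods):]))

for period, value in event_coef.items():
    label = "lead" if period < t0 else "lag"
    print(period, label, round(float(value), 6))
```

In this constructed example the lead coefficients are zero (no pre-treatment divergence) and the lag coefficients trace the assumed effect path of 1.0, 2.0, 2.5. With real data each coefficient would come with a standard error, which this sketch does not compute.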
What are placebo tests in DiD?
Placebo tests are a diagnostic tool in DiD analyses used to evaluate whether the observed treatment effects are credible. They work by testing the treatment effect on groups or outcomes that should not logically be affected by the treatment. The goal is to ensure that the observed effects are not driven by spurious factors.
How it works:
1. Identify a group that should not be impacted by the treatment and run the DiD analysis on this group.
2. If the treatment has no effect on the placebo group, the estimated coefficients for the placebo group should be statistically indistinguishable from zero. If a placebo group shows an effect, it suggests that the observed treatment effect might be due to confounding factors, rather than the treatment itself.
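The two steps above amount to re-running the same DiD calculation with a placebo group in place of the treated group. A minimal sketch with hypothetical group means:

```python
def did(pre_a, post_a, pre_b, post_b):
    """2x2 DiD: change in group a minus change in group b."""
    return (post_a - pre_a) - (post_b - pre_b)

# Hypothetical (pre, post) group means.
treated = (10.0, 16.0)
control = (6.0, 9.0)
placebo = (8.0, 11.25)   # a group the treatment could not plausibly have affected

main_estimate = did(*treated, *control)
placebo_estimate = did(*placebo, *control)

print(main_estimate, placebo_estimate)  # 3.0 0.25
```

A placebo estimate close to zero relative to the main estimate is reassuring. Judging whether it is statistically indistinguishable from zero requires standard errors, which this sketch leaves out.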
What is the purpose of the triple differences (DDD)?
The purpose of the triple differences (DDD) method is to improve the accuracy of causal estimates in DiD analyses by addressing biases that standard DiD cannot eliminate.
* While DiD accounts for time-invariant group-specific factors (fixed effects) and time-varying factors that affect all groups equally (time trends), it cannot control for time-varying factors that affect one group or location differently from another group or location over time.
* DDD adds a third layer of comparison by introducing an additional group unaffected by the treatment, helping to control for these remaining biases. By comparing changes across time, groups, and locations simultaneously, DDD isolates the treatment effect more reliably, ensuring the results are not influenced by unrelated trends or time-specific events.
How does triple differences (DDD) work?
Triple differences (DDD) refines the standard DiD design by adding an additional comparison group.
To estimate a DDD model, you need at least 8 cells, which result from combinations of:
* Time (before and after the intervention) → 2 categories
* Treatment group vs. control group → 2 categories
* Subgroup within each group (affected vs. unaffected by the treatment) → 2 categories
These 2 × 2 × 2 categories give you 8 distinct cells for comparison.
To estimate DDD empirically, we rely on a triple interaction term in a regression model that adds a third dimension to the analysis.
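As a sketch of the triple interaction, the snippet below computes DDD twice from the same 8 cell means: once by hand as the difference between two DiDs, and once as the triple-interaction coefficient of a saturated regression. The cell means and the dimension names (`treated_location`, `affected_subgroup`, `post`) are hypothetical.

```python
import itertools
import numpy as np

# Hypothetical cell means, keyed by (treated_location, affected_subgroup, post).
cell_mean = {
    (0, 0, 0): 10.0, (0, 0, 1): 11.0,
    (0, 1, 0): 12.0, (0, 1, 1): 13.5,
    (1, 0, 0): 9.0,  (1, 0, 1): 11.0,
    (1, 1, 0): 11.0, (1, 1, 1): 16.5,
}

def did(sub):
    """2x2 DiD across locations within one subgroup."""
    return ((cell_mean[(1, sub, 1)] - cell_mean[(1, sub, 0)])
            - (cell_mean[(0, sub, 1)] - cell_mean[(0, sub, 0)]))

# DDD by hand: DiD for the affected subgroup minus DiD for the unaffected subgroup.
ddd_by_hand = did(1) - did(0)

# Same quantity from a saturated regression: intercept, three main effects,
# three two-way interactions, and the triple interaction g * s * p.
cells = list(itertools.product([0, 1], repeat=3))
X = np.array([[1, g, s, p, g * s, g * p, s * p, g * s * p] for g, s, p in cells],
             dtype=float)
y = np.array([cell_mean[c] for c in cells])
coef = np.linalg.solve(X, y)
ddd_regression = coef[-1]

print(ddd_by_hand, round(float(ddd_regression), 6))  # 3.0 3.0
```

The DiD for the affected subgroup is 4.0 and for the unaffected subgroup 1.0, so DDD is 3.0, which matches the triple-interaction coefficient.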
What is staggered DiD? (!)
Staggered DiD (differential timing) occurs when treatment is implemented at different times for different groups, rather than all treated groups receiving treatment simultaneously.
Unlike the basic 2x2 DiD, which compares two groups (treated and control) across two time periods (pre-treatment and post-treatment), staggered DiD involves multiple groups and multiple time periods. This staggered structure complicates the analysis because groups treated earlier may act as controls for later-treated groups, and vice versa, potentially introducing bias if treatment effects vary over time (time-heterogeneous effects).
To handle these complexities, researchers often use the two-way fixed effects (TWFE) model, which explicitly controls for:
* Unit fixed effects: Account for time-invariant differences across groups.
* Time fixed effects: Account for shocks or trends that affect all groups at the same time.
$Y_{it} = \alpha_0 + \delta D_{it} + X_{it} + \alpha_i + \alpha_t + \varepsilon_{it}$
While TWFE helps to manage staggered treatment, it assumes constant treatment effects over time. If treatment effects vary across time, the TWFE estimate can become biased.
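A minimal TWFE sketch with staggered adoption, using unit and period dummies and NumPy least squares. The panel is hypothetical and noise-free, the covariates X_it are left out, and the treatment effect is held constant, which is exactly the case in which TWFE recovers δ. The sketch does not demonstrate the bias that arises under time-varying effects.

```python
import numpy as np

# Hypothetical staggered design: unit -> first treated period (None = never treated).
n_periods, delta = 6, 2.0
adoption = {0: 2, 1: 2, 2: 4, 3: 4, 4: None, 5: None}
unit_fe = [1.0, 3.0, 0.5, 2.0, 4.0, 1.5]
time_fe = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]

rows, y = [], []
for i, start in adoption.items():
    for t in range(n_periods):
        d_it = float(start is not None and t >= start)          # treatment indicator D_it
        unit_d = [float(i == j) for j in adoption]              # unit dummies (absorb alpha_0)
        time_d = [float(t == s) for s in range(1, n_periods)]   # time dummies, period 0 omitted
        rows.append([d_it] + unit_d + time_d)
        y.append(unit_fe[i] + time_fe[t] + delta * d_it)        # constant effect, no noise

coef, *_ = np.linalg.lstsq(np.array(rows), np.array(y), rcond=None)
delta_hat = coef[0]

print(round(float(delta_hat), 6))  # 2.0
```

With real data one would add noise, covariates, and clustered standard errors; the dummy-variable setup above is only meant to show what the unit and time fixed effects absorb.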