General information Flashcards
External validity
Diff in diff
Instrument
Regression discontinuity
Greater external validity because these designs can be applied in more contexts.
RCT
Randomized Controlled Trial (RCT)
Design and Implementation:
- Randomization: Participants are randomly assigned to treatment and control groups.
- Control: The control group does not receive the treatment, while the treatment group does.
- Blinding: Often involves blinding to prevent bias.
- Prospective: Typically prospective, following participants over time after the intervention.
Purpose:
- Causality: Establishes causality by attributing differences between groups to the treatment.
- Internal Validity: High internal validity due to randomization.
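As a rough illustration of why randomization lets a plain comparison of group means recover the treatment effect, the sketch below simulates a hypothetical trial. The data-generating process, the assumed true effect of 2.0, and the function names are all invented for this example.

```python
import random
import statistics


def simulate_rct(n=20000, true_effect=2.0, seed=0):
    """Simulate a hypothetical trial with coin-flip treatment assignment."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        baseline = rng.gauss(10.0, 3.0)  # participant trait we never observe
        assigned = rng.random() < 0.5    # randomization: independent of baseline
        outcome = baseline + (true_effect if assigned else 0.0) + rng.gauss(0.0, 1.0)
        (treated if assigned else control).append(outcome)
    return treated, control


def difference_in_means(treated, control):
    """Return the estimated average treatment effect and its standard error."""
    estimate = statistics.fmean(treated) - statistics.fmean(control)
    se = (statistics.variance(treated) / len(treated)
          + statistics.variance(control) / len(control)) ** 0.5
    return estimate, se


treated, control = simulate_rct()
ate_hat, ate_se = difference_in_means(treated, control)
print(f"estimated effect: {ate_hat:.2f} (SE {ate_se:.2f}); assumed true effect: 2.0")
```

Because assignment is independent of the unobserved baseline, the two groups differ on average only by the treatment, so no covariate adjustment is needed for the estimate to be unbiased in this simulated setting.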
Necessary Assumptions:
- Randomization: The randomization process is properly executed, ensuring that the treatment and control groups are equivalent on average.
- No Contamination: Participants in the control group are not exposed to the treatment.
- Compliance: Participants adhere to their assigned treatment or control condition.
- Stable Unit Treatment Value Assumption (SUTVA): The potential outcomes for any participant are unaffected by the treatment assignment of others.
Applications:
Strengths:
- Control of Confounders: Balances both known and unknown confounders.
- Robustness: Strong evidence for causal relationships.
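- Specificity: Answers a very specific question about one intervention.
- No Parallel Trends Needed: Randomization makes the parallel trends assumption unnecessary.
- Endogeneity: Endogeneity concerns are removed because randomization balances the groups on average.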
Limitations:
- Cost and Time: Expensive and time-consuming.
- Ethical/Practical Constraints: Not always feasible or ethical to randomize.
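- External Validity: Often low, since results may not generalize beyond the trial setting.
- Non-Compliance: Participants may not adhere to their assigned condition.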
Types of Data:
- Panel Data: Collected over multiple time periods for the same participants (longitudinal).
- Cross-Sectional Data: Collected at a single point in time from different participants.
- Time Series Data: Collected at multiple time points but typically involves a single unit (not common in RCTs).
Regression discontinuity
Regression Discontinuity (RD)
Design and Implementation:
- Assignment Variable: Participants are assigned to treatment or control based on a cutoff score on an assignment variable.
- Sharp vs. Fuzzy RD: In sharp RD, crossing the cutoff fully determines treatment; in fuzzy RD, crossing the cutoff only changes the probability of treatment, so some crossover occurs.
- Local Comparison: Compares individuals close to the cutoff point.
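The local comparison above can be sketched for the sharp case as follows. This is a minimal illustration, assuming a simulated assignment score, a cutoff at 0, a hand-picked bandwidth of 0.25, and a true jump of 1.5; the helper names `fit_line` and `sharp_rd_estimate` are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def sharp_rd_estimate(scores, outcomes, cutoff, bandwidth):
    """Jump in the fitted outcome at the cutoff, using only nearby observations."""
    # Center the score at the cutoff so each intercept is the fitted value there.
    left = [(s - cutoff, y) for s, y in zip(scores, outcomes)
            if cutoff - bandwidth <= s < cutoff]
    right = [(s - cutoff, y) for s, y in zip(scores, outcomes)
             if cutoff <= s <= cutoff + bandwidth]
    left_intercept, _ = fit_line(*zip(*left))
    right_intercept, _ = fit_line(*zip(*right))
    return right_intercept - left_intercept


rng = random.Random(1)
TRUE_JUMP = 1.5  # assumed treatment effect at the cutoff
scores = [rng.uniform(-1.0, 1.0) for _ in range(20000)]
outcomes = [1.0 + 0.5 * s + (TRUE_JUMP if s >= 0.0 else 0.0) + rng.gauss(0.0, 1.0)
            for s in scores]

rd_hat = sharp_rd_estimate(scores, outcomes, cutoff=0.0, bandwidth=0.25)
print(f"estimated jump at cutoff: {rd_hat:.2f}; assumed true jump: {TRUE_JUMP}")
```

Fitting a separate line on each side lets the outcome trend in the score without that trend being mistaken for a treatment effect. The bandwidth here is arbitrary; in applied work its choice is a trade-off between bias and variance.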
Purpose:
- Causality: Infers causal effects by exploiting the discontinuity at the cutoff point.
- Internal Validity: High internal validity around the cutoff point.
Necessary Assumptions:
- Continuity Assumption: Potential outcomes are continuous around the cutoff point. This means that any discontinuity in outcomes at the cutoff can be attributed to the treatment.
- No Manipulation: Participants cannot precisely manipulate the assignment variable to ensure they fall just above or below the cutoff.
- Local Randomization: Close to the cutoff, treatment assignment is as good as random.
Strengths:
- Natural Experiment: Mimics randomization near the cutoff.
- Clear Identification: Provides clear causal estimates for those near the cutoff.
Limitations:
- Local Effects: Results are local to the cutoff and may not generalize.
- Manipulation Risk: Participants might manipulate their score to be just above/below the cutoff.
Types of Data:
- Cross-Sectional Data: Collected at a single point in time, focusing on observations around the cutoff.
- Panel Data: Possible but less common; focuses on tracking units around the cutoff over time.
2SLS / IV
Instrumental Variables (IV)
Design and Implementation:
- Instrument: Uses an external variable (instrument) that affects the treatment but not the outcome directly.
- Two-Stage Process: First stage predicts treatment using the instrument; the second stage estimates the effect of the predicted treatment on the outcome.
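The two-stage process above can be sketched with a single instrument. This is only an illustration under assumed conditions: the simulated confounder, the instrument strength of 0.8, the true effect of 1.0, and the helper names are all hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def two_stage_least_squares(instrument, treatment, outcome):
    """Point estimate of the treatment effect using one instrument."""
    # Stage 1: predict the treatment from the instrument.
    intercept, slope = fit_line(instrument, treatment)
    predicted = [intercept + slope * z for z in instrument]
    # Stage 2: regress the outcome on the predicted treatment.
    _, effect = fit_line(predicted, outcome)
    return effect


rng = random.Random(2)
TRUE_EFFECT = 1.0  # assumed causal effect of treatment on outcome
z, d, y = [], [], []
for _ in range(20000):
    confounder = rng.gauss(0.0, 1.0)  # unobserved; drives both d and y
    zi = rng.gauss(0.0, 1.0)          # instrument: independent of the confounder
    di = 0.8 * zi + confounder + rng.gauss(0.0, 1.0)
    yi = TRUE_EFFECT * di + 2.0 * confounder + rng.gauss(0.0, 1.0)
    z.append(zi)
    d.append(di)
    y.append(yi)

_, naive_ols = fit_line(d, y)           # biased upward by the confounder
iv_hat = two_stage_least_squares(z, d, y)
print(f"naive OLS: {naive_ols:.2f}, 2SLS: {iv_hat:.2f}, assumed true effect: {TRUE_EFFECT}")
```

The sketch reports point estimates only. Standard errors taken from a hand-run second stage are not valid, so dedicated IV routines should be used for inference.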
Purpose:
- Causality: Addresses endogeneity by isolating exogenous variation in the treatment.
- Validity: High internal validity if the instrument is valid.
Necessary Assumptions:
- Relevance: The instrument must be correlated with the treatment (the instrument must have a strong first-stage relationship with the treatment).
- Exclusion Restriction: The instrument affects the outcome only through its effect on the treatment, not directly.
- Independence: The instrument is independent of the error term in the outcome equation (the instrument is not correlated with any unobserved confounders).
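Of these assumptions, relevance is the one that can be checked directly in the data. A minimal sketch, assuming a single instrument and simulated data (the function names and strengths are hypothetical), computes the first-stage F-statistic, which with one instrument equals the squared t-statistic on the instrument.

```python
import random
import statistics


def first_stage_f(instrument, treatment):
    """F-statistic for the instrument in a one-instrument first stage."""
    n = len(instrument)
    mz, md = statistics.fmean(instrument), statistics.fmean(treatment)
    szz = sum((z - mz) ** 2 for z in instrument)
    szd = sum((z - mz) * (d - md) for z, d in zip(instrument, treatment))
    slope = szd / szz
    intercept = md - slope * mz
    ssr = sum((d - intercept - slope * z) ** 2 for z, d in zip(instrument, treatment))
    slope_se = (ssr / (n - 2) / szz) ** 0.5
    return (slope / slope_se) ** 2  # with one instrument, F equals t squared


def simulate_first_stage(strength, n=2000, seed=3):
    """Hypothetical data where `strength` controls instrument relevance."""
    rng = random.Random(seed)
    instrument = [rng.gauss(0.0, 1.0) for _ in range(n)]
    treatment = [strength * z + rng.gauss(0.0, 1.5) for z in instrument]
    return instrument, treatment


f_strong = first_stage_f(*simulate_first_stage(strength=0.8))
f_weak = first_stage_f(*simulate_first_stage(strength=0.02))
print(f"first-stage F, strong instrument: {f_strong:.1f}")
print(f"first-stage F, weak instrument:   {f_weak:.1f}")
```

A commonly cited rule of thumb treats a first-stage F below about 10 as a sign of a weak instrument. The exclusion restriction and independence have no comparable direct check and must be argued from context.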
Strengths:
- Endogeneity Control: Controls for unobserved confounders correlated with the treatment.
- Natural Experiments: Useful when randomization is not possible.
Limitations:
- Instrument Validity: Requires a valid instrument that is correlated with the treatment but not with the error term in the outcome equation.
- Complexity: Interpretation can be complex, and finding a valid instrument is challenging.
- Relevance vs. Exogeneity: Relevance means the instrument is correlated with the regressor of interest; exogeneity means the instrument is uncorrelated with the error term. A weak or endogenous instrument biases the estimate.
Types of Data:
- Cross-Sectional Data: Collected at a single point in time from different participants.
- Panel Data: Can be used, tracking participants over time, which may help strengthen the validity of the instrument.
- Time Series Data: Used in some contexts, especially with repeated measures.
Two-Stage Least Squares (2SLS) is the standard estimator for IV; it works with a single instrument and also handles multiple instruments.
Diff in diff
Difference-in-Differences (DiD)
Design and Implementation:
- Comparative Method: Compares changes in outcomes over time between treatment and control groups.
- Pre- and Post-Intervention Data: Requires data from both before and after the intervention for both groups.
- Non-Randomized: Relies on observational data.
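The comparison of changes described above reduces, in the simplest two-group, two-period case, to a subtraction of four means. The sketch below assumes simulated data with a common trend of 1.0, a true effect of 3.0, and groups that start at different levels; all names and numbers are hypothetical.

```python
import random
import statistics


def did_estimate(treat_pre, treat_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    treated_change = statistics.fmean(treat_post) - statistics.fmean(treat_pre)
    control_change = statistics.fmean(control_post) - statistics.fmean(control_pre)
    return treated_change - control_change


rng = random.Random(4)
TRUE_EFFECT = 3.0   # assumed effect of the intervention
COMMON_TREND = 1.0  # change both groups would see without the intervention
n = 5000

# The groups start at different levels (5.0 vs 8.0) but share the same trend.
control_pre = [5.0 + rng.gauss(0.0, 1.0) for _ in range(n)]
control_post = [5.0 + COMMON_TREND + rng.gauss(0.0, 1.0) for _ in range(n)]
treat_pre = [8.0 + rng.gauss(0.0, 1.0) for _ in range(n)]
treat_post = [8.0 + COMMON_TREND + TRUE_EFFECT + rng.gauss(0.0, 1.0) for _ in range(n)]

did_hat = did_estimate(treat_pre, treat_post, control_pre, control_post)
naive_post_gap = statistics.fmean(treat_post) - statistics.fmean(control_post)
print(f"DiD estimate: {did_hat:.2f}; assumed true effect: {TRUE_EFFECT}")
print(f"naive post-period gap: {naive_post_gap:.2f} (mixes the effect with the level difference)")
```

Subtracting the control group's change removes the common trend, and differencing within each group removes the fixed level gap. Both steps are valid here only because the simulation builds in a shared trend.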
Purpose:
- Causality: Infers causal effects by examining differential impacts over time.
- External Validity: Higher external validity due to real-world data.
Necessary Assumptions:
- Parallel Trends Assumption: In the absence of the treatment, the treatment and control groups would have followed parallel trends over time.
- No Simultaneous Interventions: No other events differentially affecting the treatment and control groups occurred at the same time as the intervention.
- Consistency: The composition of the treatment and control groups remains consistent over time.
Strengths:
- Existing Data: Less costly and quicker, using existing data.
- Practicality: Useful when randomization is not feasible or ethical.
Limitations:
- Assumptions: Relies on the parallel trends assumption.
- Bias Potential: Susceptible to bias if the parallel trends assumption does not hold.
Types of Data:
- Panel Data: Collected over multiple time periods for the same participants, allowing for tracking changes over time.
- Repeated Cross-Sectional Data: Different samples of participants are collected at different time points.
Types of data
- Panel Data: Collected over multiple time periods for the same participants (longitudinal).
- Cross-Sectional Data: Collected at a single point in time from different participants.
- Time Series Data: Collected at multiple time points but typically involves a single unit (not common in RCTs).
validity
Validity:
- RCT: High internal validity due to randomization.
- DiD: Internal validity relies on the parallel trends assumption.
- RD: High internal validity near the cutoff point.
- IV/2SLS: High internal validity if instruments are valid.