Empirical Tools Flashcards
Correlation
when two variables move together
- useful for passive prediction
Causation
when one variable causes (or affects) another variable
Identification problem
the question of whether one variable causes another
Treatment variable
Di: the variable that generates the causal effect we’re interested in
Outcome Variable
Yi: the variable that is affected by the treatment variable
the concept of potential outcomes
The outcomes that a person would have under different values of the treatment
the individual treatment effect
The difference between i’s potential outcome when i receives the treatment and i’s potential outcome when i does not
the fundamental problem of causal inference
We can only ever observe one potential outcome per person: i’s potential outcome in the observed state
- It is equal to the person’s realized outcome, Yi
Average treatment effect
the average of the individual treatment effects in a population
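The definitions above can be written compactly. This is a sketch in one common notation: the symbols Y_i(1), Y_i(0), and \tau_i are assumptions introduced here (the cards only name Di and Yi), and the treatment is taken to be binary.

```latex
\begin{align*}
\text{Potential outcomes:} \quad & Y_i(1),\; Y_i(0) \\
\text{Individual treatment effect:} \quad & \tau_i = Y_i(1) - Y_i(0) \\
\text{Realized outcome:} \quad & Y_i = D_i\, Y_i(1) + (1 - D_i)\, Y_i(0) \\
\text{Average treatment effect:} \quad & \text{ATE} = E[\tau_i] = E[Y_i(1) - Y_i(0)]
\end{align*}
```

Only one of Y_i(1) and Y_i(0) is ever observed for a given person, so \tau_i itself cannot be computed from data.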
selection bias
when the selection of subjects into a study (or their likelihood of remaining in the study) leads to a result that is systematically different from that of the target population
- the difference between the treatment and control groups in what average outcomes would be absent the treatment
- prevents us from identifying the average treatment effect
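A short derivation shows where selection bias enters a simple comparison of group averages. The notation is an assumption, not from the cards: Y_i(1) and Y_i(0) denote i's potential outcomes with and without the treatment, and Di is binary.

```latex
\begin{align*}
E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]
  &= E[Y_i(1) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0] \\
  &= \underbrace{E[Y_i(1) - Y_i(0) \mid D_i = 1]}_{\text{average effect on the treated}}
   + \underbrace{E[Y_i(0) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0]}_{\text{selection bias}}
\end{align*}
```

The second step adds and subtracts E[Y_i(0) | D_i = 1]. If the treatment effect is the same for everyone, the first term equals the ATE, so the comparison recovers the ATE only when the selection bias term is zero.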
observed state
state we do observe
the counterfactual state
the state we do not observe
treatment group
those with Di = 1
control group
those with Di = 0
naive method
Compare outcomes for people who, in the real world, happened to get a treatment with outcomes for people who didn’t
- fails due to selection bias
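A minimal simulation sketch of why the naive method fails. Everything here is made up for illustration: the function name `naive_comparison`, the hypothetical unobserved `trait` that raises both the outcome and the chance of getting treated, and the true effect of 2.0.

```python
import random
from statistics import mean


def naive_comparison(n=50_000, true_effect=2.0, seed=0):
    """Difference in mean outcomes when people self-select into treatment."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        trait = rng.gauss(0, 1)        # hypothetical unobserved trait
        y0 = trait + rng.gauss(0, 1)   # potential outcome without treatment
        y1 = y0 + true_effect          # potential outcome with treatment
        # Self-selection: a higher trait makes treatment more likely.
        d = 1 if trait + rng.gauss(0, 1) > 0 else 0
        if d == 1:
            treated.append(y1)         # only y1 is observed for the treated
        else:
            untreated.append(y0)       # only y0 is observed for the untreated
    return mean(treated) - mean(untreated)


estimate = naive_comparison()
print(f"naive estimate: {estimate:.2f} (true effect: 2.00)")
```

Because people with a high trait are over-represented among the treated, the naive estimate overstates the true effect in this setup.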
randomized trial
the scientific term for an experiment
- In a randomized trial, the treatment, Di, is assigned via random chance
Independence
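when treatment is assigned independently of potential outcomes (as in a randomized trial), the difference in average outcomes between the treatment and control groups is the ATE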
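A minimal simulation sketch of this idea, assuming a coin-flip assignment and a constant true effect of 2.0. The function name `randomized_comparison` and the hypothetical `trait` variable are illustrative only.

```python
import random
from statistics import mean


def randomized_comparison(n=50_000, true_effect=2.0, seed=0):
    """Difference in mean outcomes when treatment is assigned by coin flip."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        trait = rng.gauss(0, 1)        # hypothetical unobserved trait
        y0 = trait + rng.gauss(0, 1)   # potential outcome without treatment
        y1 = y0 + true_effect          # potential outcome with treatment
        # Random assignment: Di does not depend on the trait or on y0, y1.
        d = 1 if rng.random() < 0.5 else 0
        if d == 1:
            treated.append(y1)
        else:
            control.append(y0)
    return mean(treated) - mean(control)


estimate = randomized_comparison()
print(f"randomized estimate: {estimate:.2f} (true effect: 2.00)")
```

Since assignment ignores the trait, both groups have the same average outcome absent treatment, and the difference in means lands close to the true effect.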
Indirect Random Assignment
Randomly assign a variable that is related to, but not exactly the same as our variable of interest
External Validity
the extent to which results generalize beyond the experiment; strictly, results are valid only for the experiment’s participants
– The same experiment in U.S. and Sweden may generate different results
– But this suggests running lots of experiments and comparing them
Attrition
When participants leave an experiment before it is complete
– If attrition is non-random, it can generate bias
-> For example, we may end up comparing all of the treatment group with only some of the control group
– Deal with attrition by finding a dataset with universal coverage
– For example, apply for access to a database instead of relying only on a survey
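A minimal simulation sketch of bias from non-random attrition in an otherwise randomized trial. The setup is hypothetical: control-group members with low outcomes drop out with probability `dropout`, while the whole treatment group stays. The function name and parameters are illustrative only.

```python
import random
from statistics import mean


def trial_with_attrition(n=50_000, true_effect=2.0, dropout=0.5, seed=0):
    """Return (estimate with full coverage, estimate after attrition)."""
    rng = random.Random(seed)
    treated, control_all, control_stayers = [], [], []
    for _ in range(n):
        y0 = rng.gauss(0, 1)           # potential outcome without treatment
        y1 = y0 + true_effect          # potential outcome with treatment
        if rng.random() < 0.5:         # random assignment to treatment
            treated.append(y1)
        else:
            control_all.append(y0)
            # Non-random attrition: low-outcome controls tend to leave.
            leaves = y0 < 0 and rng.random() < dropout
            if not leaves:
                control_stayers.append(y0)
    full = mean(treated) - mean(control_all)
    observed = mean(treated) - mean(control_stayers)
    return full, observed


full, observed = trial_with_attrition()
print(f"full coverage: {full:.2f}, after attrition: {observed:.2f}")
```

With full coverage of the control group the estimate sits near the true effect. After attrition the remaining controls look better than the full control group, so the estimated effect is too small in this setup.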
Observational Data
Data that does not come from a deliberately designed experiment
Time series
data on a particular variable over time
Cross-sectional data
data on many people at one point in time
Time-Series Analysis
the study of how series co-vary over time