Block 1: X-centered RDs: Causal strategies: X and Y (Lesson 3) Flashcards
What is the gold standard for X-centered research?
The experimental template
The outcome (Y) is less important; the focus is on the effect of the treatment (X) on the population
Any deviations from this template are viewed as sources of bias
Definition of ontology
The nature of being, existence or reality
Definition of epistemology
Theory of knowledge: its nature and limitations
Definition of ethics/aesthetics
What value does an observation have?
Different types of causal relationships
CEMS LICS PCPCP
Conjunctures
Equifinality
Monotonicity
Sequence
Linearity
Irreversibility
Constancy
Set-theoretic causes
Proximal
Causal chain
Path-dependency
Causal laws
Probabilistic causes
Conjunctures (causal relationship)
A combination of causes that produce an effect
Equifinality
Several causes act independently of each other but lead to the same effect/outcome
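The contrast between these two cards can be sketched with Boolean logic. This is only an illustration under the assumption of two binary causes; the function names and the causes A and B are hypothetical.

```python
def conjuncture(a, b):
    """Y occurs only when causes A and B are present together."""
    return a and b


def equifinality(a, b):
    """Y occurs when either cause is present on its own."""
    return a or b


# Print the full truth table: A, B, conjunctural Y, equifinal Y.
for a in (False, True):
    for b in (False, True):
        print(a, b, conjuncture(a, b), equifinality(a, b))
```

With only A present, the conjunctural outcome does not occur, while the equifinal outcome does.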
Monotonicity
Where an increase (decrease) in X always causes an increase (decrease) in Y
Linearity
Rise in X causes a predictable rise in Y, explainable by a linear relationship
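A rough sketch of the difference between monotonicity and linearity, assuming a handful of noise-free observations. The helper names and the toy data are hypothetical; real data would need a statistical test rather than an exact check.

```python
def is_monotonic(xs, ys):
    """True if Y rises at every step as X rises (pairs are sorted by X first)."""
    pairs = sorted(zip(xs, ys))
    return all(y2 > y1 for (_, y1), (_, y2) in zip(pairs, pairs[1:]))


def is_linear(xs, ys, tol=1e-9):
    """True if every step has the same slope (requires distinct X values)."""
    pairs = sorted(zip(xs, ys))
    slopes = [(y2 - y1) / (x2 - x1)
              for (x1, y1), (x2, y2) in zip(pairs, pairs[1:])]
    return all(abs(s - slopes[0]) <= tol for s in slopes)


xs = [1, 2, 3, 4]
quadratic = [1, 4, 9, 16]  # rises with X, but by a growing amount
linear = [3, 5, 7, 9]      # rises by the same amount at every step

print(is_monotonic(xs, quadratic), is_linear(xs, quadratic))
print(is_monotonic(xs, linear), is_linear(xs, linear))
```

Every linear relationship with a positive slope is monotonic, but a monotonic relationship need not be linear.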
Irreversibility
One way relationship
X affects Y as X increases but not as it decreases (or vice versa)
Constancy
A constant cause operates continually upon an outcome
Proximal
A proximal cause operates immediately
Sequence
The effect of X1, X2 and X3 on Y depends on the order in which they occur
Causal chain
Multiple factors (M) form a chain between X and Y
Path-dependency
A single causal intervention has enduring, and perhaps increasing, effects over time
Causal laws
Exception-less relationships between X and Y
Probabilistic causes
Relationship with errors, i.e., exceptions, which can be given a certain probability of occurring
Set-theoretic causes
Where X is necessary and/or sufficient for Y
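Necessity and sufficiency can be checked on binary data along the following lines. This is a minimal sketch; the function names and the four toy cases are hypothetical and only meant to make the definitions concrete.

```python
def is_necessary(cases):
    """X is necessary for Y if Y never occurs without X."""
    return all(x for x, y in cases if y)


def is_sufficient(cases):
    """X is sufficient for Y if X never occurs without Y."""
    return all(y for x, y in cases if x)


# Hypothetical toy data: (X present?, Y present?) for four cases.
cases = [(True, True), (True, False), (False, False), (True, True)]

print(is_necessary(cases))   # Y only occurs where X is present
print(is_sufficient(cases))  # X occurs once without Y
```

In this toy data X is necessary but not sufficient: every case with Y has X, yet one case with X lacks Y.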
Criteria for good causal analysis on the treatment variable (X)
PEEVS SUDS
Is X P = proximate to Y, E = exogenous to Y, E = evenly distributed, V = varying, S = simple, S = strong, U = uniform, D = discrete, S = scalable?
Criteria for good causal analysis on the outcome variable (Y)
Is Y free to vary?
Criteria for good causal analysis on the sample
Are the chosen observations (a) independent (of one another) and
(b) causally comparable?
Definition of independence (sample criteria)
Each observation is separate and gives new evidence of the causal linkage
Changes in Y need to be due to the treatment, not because the units themselves are influencing each other
Definition of causal comparability (sample criteria)
The average value of Y for a given value of X should remain the same across units and during the period of analysis
Incomparabilities which may influence causal comparability
Noise (B), which is random and only influences Y
Confounders (C), non-random and influence both X and Y
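The difference between noise and a confounder can be illustrated with a small simulation. This is a sketch under stated assumptions: the true effect of X on Y is set to 2, all variables are standard normal draws, and the coefficient 3 on C is an arbitrary choice. None of these numbers come from the lesson.

```python
import random


def slope(xs, ys):
    """Bivariate least-squares slope of Y on X."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 5000
TRUE_EFFECT = 2.0       # assumed effect of X on Y

# Noise only: B is random and enters Y alone.
x_noise = [rng.gauss(0, 1) for _ in range(n)]
y_noise = [TRUE_EFFECT * x + rng.gauss(0, 1) for x in x_noise]

# Confounder: C drives both X and Y.
c = [rng.gauss(0, 1) for _ in range(n)]
x_conf = [ci + rng.gauss(0, 1) for ci in c]
y_conf = [TRUE_EFFECT * x + 3.0 * ci + rng.gauss(0, 1)
          for x, ci in zip(x_conf, c)]

slope_noise = slope(x_noise, y_noise)
slope_confounded = slope(x_conf, y_conf)
print(round(slope_noise, 2), round(slope_confounded, 2))
```

Noise makes the estimate less precise but leaves it centered on the true effect. The confounder shifts the naive estimate away from 2 (toward 3.5 in expectation under these assumptions), because part of C's influence on Y is wrongly attributed to X.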
Strategies for causal inference
Randomized designs (experimental)
Non-randomized designs (non-experimental, observational)
Beyond X and Y
Randomized designs (strategies)
– Pre-test/post-test,
– Post-test only,
– Multiple post-tests,
– Roll-out,
– Crossover,
– Solomon four-group,
– Factorial
Non-randomized designs (strategies)
– Regression discontinuity,
– Panel,
– Cross-sectional,
– Longitudinal
Beyond X and Y (strategies)
– Conditioning confounders,
– Instrumental variables,
– Mechanisms,
– Alternate outcomes,
– Causal heterogeneity,
– Rival hypotheses,
– Robustness tests,
– Causal reasoning
Generally on randomized designs and their internal/external validity
Internal validity and the assignment problem are by definition solved, though post-treatment threats can be present
External validity can still be problematic
Give two examples of pre-treatment/assignment bias
1) Common-cause confounder: Confounder C affects both the treatment (X) and outcome (Y)
2) Self-selection bias: The subjects of the study choose their own treatment assignment
=> Solved through randomization!
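Random assignment can be sketched as a shuffle followed by a split. The function name `randomize`, the seed and the 100 hypothetical units are assumptions for illustration only.

```python
import random


def randomize(units, seed=None):
    """Randomly split units into equal-sized treatment and control groups."""
    rng = random.Random(seed)
    shuffled = list(units)   # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


units = list(range(100))     # hypothetical unit IDs
treated, control = randomize(units, seed=42)
print(len(treated), len(control))
```

Because a chance mechanism, not the subjects or a confounder, decides who is treated, neither self-selection nor a common cause can systematically drive the assignment.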
Post-Treatment Bias, examples
Attrition: The loss of subjects during the course of a study
Noncompliance: When subjects do not comply with instructions
Contamination: Where treatment and control groups are not isolated from one another
Reputation effects: Where the reputation of the treatment in the minds of subjects affects an outcome
Researcher (Hawthorne) effects: Where the condition of being studied affects an outcome
Testing effects: Where responses to a test are influenced by previous testing experiences, rather than the treatment itself
Examples of Pre/Post-Treatment Bias in Longitudinal Studies
– History: Where the treatment is correlated with some other factor that affects the outcome, which is to say, where the variation over time is driven by some factor other than the treatment.
– Regression to the mean: Where some change observed over time is a product of stochastic variation rather than the treatment of interest.
– Instrumentation effects: A change in the measurement of an outcome over the course of a study which alters the estimate of X’s effect on Y (learning effects).
=> Remain present even when randomization is used!
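Regression to the mean can be made visible with a simulation in which no treatment is applied at all. This is a sketch under assumptions: each test score is a stable ability plus independent random noise, and the top 10% on the first test are selected for follow-up. All names and parameters are hypothetical.

```python
import random

rng = random.Random(1)  # fixed seed so the run is reproducible
n = 10000

# Each score = stable ability + independent stochastic noise.
ability = [rng.gauss(0, 1) for _ in range(n)]
test1 = [a + rng.gauss(0, 1) for a in ability]
test2 = [a + rng.gauss(0, 1) for a in ability]

# Select the top 10% of scorers on the first test.
cutoff = sorted(test1)[int(0.9 * n)]
top = [i for i in range(n) if test1[i] >= cutoff]

mean_t1 = sum(test1[i] for i in top) / len(top)
mean_t2 = sum(test2[i] for i in top) / len(top)
print(round(mean_t1, 2), round(mean_t2, 2))
```

The selected group scores lower on the second test even though nothing happened to it in between. A study that treated only extreme scorers could mistake this drop for a treatment effect.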
Stochastic variable
Variable which is completely random with no systematic component (statistics)
Generally on Non-Randomized Designs
– When randomization isn’t possible, the assignment problem has to be dealt with
– Except in case of (perfect) natural experiments, non-randomization is the standard in studies using observational data
– The fundamental problem: We do not know the ‘data generating process’ (DGP), because we did not create the data ourselves
Cartwright’s socio-economic machine argument
We cannot compare countries because they have different socio-economic backgrounds and we know little about the data-generation process
Non-randomized designs:
Regression-Discontinuity Design (RDD)
Assignment principle:
(a) known;
(b) measurable (prior to treatment, for all units in the sample);
(c) has a cutoff point, which defines the assignment of subjects, producing a binary treatment variable;
(d) many units fall on either side of this cutoff;
(e) this principle is maintained throughout the period of study
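The assignment principle above can be sketched as follows. The cutoff of 50, the noise-free outcome model and the bandwidth of 5 are all hypothetical choices made for illustration; an applied RDD would normally fit regressions on each side of the cutoff rather than compare raw means.

```python
CUTOFF = 50  # hypothetical threshold on the assignment variable


def assign(score, cutoff=CUTOFF):
    """Binary treatment: units at or above the cutoff are treated."""
    return 1 if score >= cutoff else 0


# Assumed data: outcome rises slowly with the score and jumps by 5 at the cutoff.
scores = list(range(0, 101))
treated = [assign(s) for s in scores]
outcome = [0.1 * s + 5.0 * t for s, t in zip(scores, treated)]


def local_difference(scores, outcome, cutoff=CUTOFF, bandwidth=5):
    """Difference in mean outcome just above versus just below the cutoff."""
    above = [y for s, y in zip(scores, outcome)
             if cutoff <= s < cutoff + bandwidth]
    below = [y for s, y in zip(scores, outcome)
             if cutoff - bandwidth <= s < cutoff]
    return sum(above) / len(above) - sum(below) / len(below)


estimate = local_difference(scores, outcome)
print(round(estimate, 2))
```

The raw comparison slightly overstates the jump of 5, because units above the cutoff also have higher scores. This is why criterion (d) matters: with many units close to the cutoff, the bandwidth can be narrowed and this bias shrinks.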
Non-randomized designs:
Panel Design
Multiple observations are taken from each unit and there is variation in X across time/units.
Two types:
1) Difference-in-difference (DD)
2) Fixed effects
Difference-in-difference (DD) panel design formula
Y = B + T + X + T*X
B = covariates, T = time dummy, X = treatment; the coefficient on the interaction T*X is the DD estimate of the treatment effect
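With two groups and two periods, the DD estimate can be computed directly from four group means. The numbers below are invented for illustration, and the function name `diff_in_diff` is hypothetical.

```python
def diff_in_diff(means):
    """Change in the treated group minus change in the control group."""
    treated_change = means[("treated", "post")] - means[("treated", "pre")]
    control_change = means[("control", "post")] - means[("control", "pre")]
    return treated_change - control_change


# Hypothetical mean outcomes per group and period.
means = {
    ("control", "pre"): 10.0,
    ("control", "post"): 12.0,
    ("treated", "pre"): 11.0,
    ("treated", "post"): 16.0,
}

# Treated change (5) minus control change (2).
print(diff_in_diff(means))
```

The control group's change stands in for what would have happened to the treated group without treatment, so only the extra change in the treated group is attributed to X. In the regression form, this quantity corresponds to the coefficient on T*X.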
Types of autocorrelation
1) Spatial autocorrelation: Non-independence because observations are correlated across space (neighborhood effect)
2) Serial autocorrelation: Non-independence because observations are correlated across time (temporal stickiness)
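Serial autocorrelation can be gauged with a lag-1 autocorrelation coefficient. The sketch below uses one common estimator and two invented series; the function name is hypothetical.

```python
def lag1_autocorrelation(series):
    """Correlation between each observation and the one just before it."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in series)
    return num / den


sticky = [1, 2, 3, 4, 5, 6, 7, 8]          # each value resembles the previous one
alternating = [1, -1, 1, -1, 1, -1, 1, -1]  # each value reverses the previous one

print(lag1_autocorrelation(sticky))
print(lag1_autocorrelation(alternating))
```

A clearly positive value, as for the first series, signals temporal stickiness: consecutive observations are not independent pieces of evidence.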
Non-randomized designs:
Cross-Sectional Designs
Based on post-test observations, where X and Y vary spatially (but not over time)
Advantage: Creates fewer statistical (method) problems
Non-randomized designs:
Longitudinal Designs
Variation in variables is longitudinal (over time) but not cross-sectional, thus all units are treated
Treatment effect is found by comparing pre-treatment and post-treatment status
Sub-types of the longitudinal design
1) Interrupted time-series: A single intervention affects the units, which are observed over time, before and after the intervention
2) Repeated observations: Units receive the same treatment multiple times
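For an interrupted time-series, the simplest possible estimate compares mean outcomes before and after the intervention. The series and the intervention point below are invented, and the function name is hypothetical.

```python
def pre_post_difference(series, intervention_index):
    """Mean outcome after the intervention minus mean outcome before it."""
    pre = series[:intervention_index]
    post = series[intervention_index:]
    return sum(post) / len(post) - sum(pre) / len(pre)


# Hypothetical outcome series; the intervention takes place before index 5.
series = [10, 11, 10, 9, 10, 14, 15, 14, 16, 16]

print(pre_post_difference(series, intervention_index=5))
```

This comparison attributes the whole change to the treatment. Since all units are treated and there is no control group, anything else that changed at the same time is absorbed into the estimate.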