Endogeneity Flashcards
What is Endogeneity?
Endogeneity literally means “within the system” and refers to the situation where an explanatory variable is correlated with the error term. Common sources are measurement error, omitted variables, and simultaneity.
Omitted Variable Bias
A third variable influences both the predictor and the criterion variable, so one cannot be sure whether variation in the criterion variable is due to variation in the predictor variable or in the third variable (i.e., an alternative explanation).
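A minimal simulation sketch of this bias, assuming NumPy is available. The variable names, coefficients (0.8, 0.5, 1.0), and the helper `ols_slopes` are illustrative assumptions, not part of the original card.

```python
import numpy as np

def ols_slopes(y, *regressors):
    """Return OLS slope estimates (intercept dropped) via least squares."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:]

rng = np.random.default_rng(0)
n = 50_000
z = rng.normal(size=n)                       # third variable (confounder)
x = 0.8 * z + rng.normal(size=n)             # predictor is partly driven by z
y = 0.5 * x + 1.0 * z + rng.normal(size=n)   # assumed true effect of x is 0.5

b_omitted = ols_slopes(y, x)[0]        # z left out: slope absorbs part of z's effect
b_controlled = ols_slopes(y, x, z)[0]  # z included as a control

print(f"slope with z omitted:    {b_omitted:.3f}")
print(f"slope with z controlled: {b_controlled:.3f}")
```

Under these assumptions the slope with z omitted should come out well above 0.5, while the controlled slope should land close to it.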
How to address omitted variable bias
- Control for variables thought to be potentially confounding.
- Use instrumental variables when control variables are sparse or unavailable. Instruments are difficult to find because they must be related to x yet cannot be correlated with the error term. The aim is to isolate the portion of x that is correlated with the error term from the portion that is not, and to use only the uncorrelated portion in the regression.
- Complement the field study with a lab study to provide evidence of causality under a controlled experiment (helps rule out alternative explanations).
- Propensity score matching and use of control groups.
- Longitudinal hierarchical linear models with time-varying covariates.
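The instrumental-variable idea in the list above can be sketched as a hand-rolled two-stage least squares on simulated data. This is only an illustration under stated assumptions: the instrument `w`, the unobserved confounder `u`, the coefficients, and the helper `fit` are all hypothetical, and the second-stage standard errors from this manual approach would not be valid for inference.

```python
import numpy as np

def fit(y, x):
    """OLS of y on an intercept and one regressor; returns (intercept, slope)."""
    X = np.column_stack([np.ones(len(y)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(1)
n = 50_000
u = rng.normal(size=n)   # unobserved confounder; ends up in the error term
w = rng.normal(size=n)   # instrument: affects x, unrelated to u
x = 1.0 * w + 1.0 * u + rng.normal(size=n)
y = 0.5 * x + 1.0 * u + rng.normal(size=n)   # assumed true effect of x is 0.5

# Naive OLS: x is correlated with the error term (through u), so it is biased.
b_ols = fit(y, x)[1]

# Stage 1: keep only the part of x that is predicted by the instrument.
a0, a1 = fit(x, w)
x_hat = a0 + a1 * w

# Stage 2: regress y on that uncorrelated portion of x.
b_iv = fit(y, x_hat)[1]

print(f"naive OLS slope: {b_ols:.3f}")
print(f"2SLS slope:      {b_iv:.3f}")
```

With these assumed values the naive slope should overstate the effect, while the two-stage estimate should sit near 0.5.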
Error in measurement
Error in measurement is to be expected when constructs are measured via surveys. Latent constructs are typically measured with multiple items, and because reliabilities are always below 1.00, the estimated coefficient for the given variable is typically biased downward, i.e., attenuated (Campbell & Kenny, 1999). The bias is not limited to the coefficient: the measurement error also becomes part of the error term.
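A small simulation sketch of this attenuation, assuming classical measurement error (noise that is independent of the true score) and a single predictor. The reliability of 0.8, the true slope of 0.5, and the helper `slope` are assumptions chosen for illustration.

```python
import numpy as np

def slope(y, x):
    """Simple-regression slope of y on x: cov(x, y) / var(x)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

rng = np.random.default_rng(2)
n = 50_000
true_score = rng.normal(size=n)                       # variance 1
x_obs = true_score + rng.normal(scale=0.5, size=n)    # error variance 0.25
y = 0.5 * true_score + rng.normal(size=n)             # assumed true slope is 0.5

# Reliability = true variance / observed variance.
reliability = 1.0 / (1.0 + 0.25)

b_true = slope(y, true_score)  # slope if the construct were measured perfectly
b_obs = slope(y, x_obs)        # slope using the error-laden measure

print(f"reliability:              {reliability:.2f}")
print(f"slope on true score:      {b_true:.3f}")
print(f"slope on observed score:  {b_obs:.3f}")
print(f"true slope x reliability: {0.5 * reliability:.3f}")
```

Under these assumptions the observed slope should be close to the true slope multiplied by the reliability.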
How to address error in measurement
To address error in measurement, one can use SEM. Unlike HLM and OLS, SEM can include all measured items in the model simultaneously and uses them to determine how much variance is “true variance” versus “error variance”, thereby explicitly accounting for measurement error (Raykov & Marcoulides, 2000).
Simultaneity (reverse causality)
Does X cause Y or Y cause X?
How to address Simultaneity (reverse causality)
To address reverse causality, researchers can use techniques that are mindful of time and temporality (Mitchell & James, 2001). At the research design stage, collect predictor variables before criterion variables. At the data analysis stage, show that the dependent variable (DV) does not predict the independent variable (IV), e.g., with autoregressive models.
Autoregressive models
In autoregressive models, earlier values of the dependent variable in a time series, also known as lagged variables, are used as independent variables in the regression model. This technique helps to account for lagged effects, wherein what happened in the past influences the future.
By considering temporal dynamics, this technique allows for stronger evidence of causality through the separation of the dynamic portion of the model (i.e., the relationships between the time-lagged values of the variables) from the simultaneous portion (i.e., the contemporaneous values), which assists in inferences about the temporal order of the effects (Rosmalen, Wenting, Roest, de Jonge, and Bos, 2012; see Brandt & Williams, 2007, for an in-depth discussion).
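A minimal sketch of a lagged (cross-lagged) regression on simulated data, assuming one lag and a setup where x influences later y but y does not influence later x. The coefficients and the helper `lagged_fit` are hypothetical and only meant to show the mechanics; this is not the specification used in the cited papers.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 20_000
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + rng.normal()                    # x depends only on its own past
    y[t] = 0.3 * y[t - 1] + 0.4 * x[t - 1] + rng.normal()   # y depends on past y and past x

def lagged_fit(target, other):
    """Regress target[t] on target[t-1] and other[t-1].

    Returns (autoregressive coefficient, cross-lagged coefficient).
    """
    X = np.column_stack([np.ones(len(target) - 1), target[:-1], other[:-1]])
    coef, *_ = np.linalg.lstsq(X, target[1:], rcond=None)
    return coef[1], coef[2]

y_auto, x_to_y = lagged_fit(y, x)   # does past x predict y, beyond past y?
x_auto, y_to_x = lagged_fit(x, y)   # does past y predict x, beyond past x?

print(f"lagged x -> y: {x_to_y:.3f}")
print(f"lagged y -> x: {y_to_x:.3f}")
```

In this simulated setup the lagged x coefficient in the y equation should be clearly nonzero, while the lagged y coefficient in the x equation should be near zero, which is the pattern one would look for when arguing that the DV does not predict the IV.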