SPECIFICATION BIAS Flashcards
Parsimony
Occam’s razor suggests that a model be kept as simple as possible, because a model can never truly capture reality.
What are the attributes of a good model?
- Parsimony
- Identifiability
- Goodness of Fit
- Theoretical Consistency
- Predictive Power
Identifiability
Only one estimate per parameter for a given set of data.
Goodness of Fit
Explanatory variables explain as much of the variation in Y as possible: the sum of squared residuals is as small as possible, meaning as high an adjusted R-squared as possible.
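A minimal sketch of how adjusted R-squared is usually computed from the ordinary R-squared. The function name and the numbers are illustrative assumptions, not part of the flashcards; k counts every estimated parameter, including the intercept.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared for n observations and k estimated parameters
    (k includes the intercept)."""
    if n <= k:
        raise ValueError("need more observations than parameters")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)

# Hypothetical numbers: an extra regressor that barely raises R-squared
# lowers the adjusted value, which is how the measure rewards parsimony.
three_params = adjusted_r2(0.900, n=20, k=3)
four_params = adjusted_r2(0.901, n=20, k=4)
print(three_params, four_params)
```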
Predictive Power
The model whose theoretical predictions are borne out by experience.
Theoretical Consistency
When constructing a model, it should have some theoretical underpinning. The coefficients must then reflect theory and have the expected sign.
Specification Errors
- Omission of relevant variables
- Unnecessary variables
- Wrong Functional form
- Errors of measurement
Omission of Relevant Variables
- If X2 & X3 are correlated, a2 is biased: E(a2) = B2 + B3·b32, where b32 is the slope from regressing the omitted X3 on the included X2. Likewise, E(a1) = B1 + B3(Xbar3 - b32·Xbar2). Unless b32 is zero, the slope estimator is biased.
- a1 & a2 are also inconsistent: the bias does not disappear no matter how large the sample.
- If X2 & X3 are uncorrelated (b32 = 0), a2 is unbiased & consistent. (In general, an estimator can be consistent without being unbiased.)
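- The estimated variance of a2 will not match the variance of the true model's estimator b2. With X2 & X3 uncorrelated: E[var(a2)] = var(b2) + B3^2 · sum(x3^2) / [(n-2) · sum(x2^2)], where lowercase x denotes deviations from the sample mean.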
- Even when X2 & X3 are uncorrelated, the estimated variance is biased upward and overestimates the true variance. This means a wider confidence interval, so we are more likely to fail to reject the null even when it should be rejected.
- The intercept will be biased; whether it under- or overestimates the true intercept depends on the sign of B3(Xbar3 - b32·Xbar2). Also, the standard errors will be incorrect.
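The slope result in the list above can be checked numerically. This is a rough sketch under assumed values (B1 = 1, B2 = 2, B3 = 3, and X3 built to be correlated with X2); the helper names are hypothetical. It relies on the fact that, within any sample, the short-regression slope a2 equals b2 + b3·b32, where b32 is the slope from regressing X3 on X2.

```python
import random

def mean(v):
    return sum(v) / len(v)

def slope(x, y):
    """OLS slope from regressing y on x (intercept included)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def two_var_slopes(x2, x3, y):
    """OLS slopes (b2, b3) from regressing y on x2 and x3 (intercept included)."""
    m2, m3, my = mean(x2), mean(x3), mean(y)
    d2 = [a - m2 for a in x2]
    d3 = [a - m3 for a in x3]
    dy = [a - my for a in y]
    s22 = sum(a * a for a in d2)
    s33 = sum(a * a for a in d3)
    s23 = sum(a * b for a, b in zip(d2, d3))
    s2y = sum(a * b for a, b in zip(d2, dy))
    s3y = sum(a * b for a, b in zip(d3, dy))
    det = s22 * s33 - s23 ** 2
    return (s2y * s33 - s3y * s23) / det, (s3y * s22 - s2y * s23) / det

rng = random.Random(0)
n = 2000
B1, B2, B3 = 1.0, 2.0, 3.0                      # assumed true parameters
x2 = [rng.gauss(0, 1) for _ in range(n)]
x3 = [0.5 * a + rng.gauss(0, 1) for a in x2]    # X3 correlated with X2
y = [B1 + B2 * a + B3 * b + rng.gauss(0, 1) for a, b in zip(x2, x3)]

a2 = slope(x2, y)                   # misspecified model: X3 omitted
b2, b3 = two_var_slopes(x2, x3, y)  # correctly specified model
b32 = slope(x2, x3)                 # auxiliary regression of X3 on X2

print("a2 (X3 omitted):", a2)
print("b2 + b3*b32    :", b2 + b3 * b32)
print("b2 (full model):", b2)
```

With these assumed values a2 settles near B2 + B3·0.5 rather than near B2, which is the upward bias described in the list.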
Upward bias and Downward bias
Bias is the deviation of an estimator's expected value from the true parameter: a positive deviation is an upward bias, and a negative deviation is a downward bias.
Why is Omission of Relevant Variables bad?
It distorts the estimated impact of an included variable (overestimating it when the bias is upward) and prevents the model from determining the true impact of the explanatory variable.
Inclusion of Irrelevant variables
Overfitting: including unnecessary variables leaves the OLS estimators unbiased and consistent, but their variances, and in turn their SEs, will be larger. This widens the confidence intervals, so the estimates are less efficient and less precise, with the potential to fail to reject the null when it is false. *May also lead to multicollinearity, e.g. Xs that do the same thing.
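A small Monte Carlo sketch of this card, under assumed values: the true model contains only X2 (B1 = 1, B2 = 2), and an irrelevant X3 with correlation of about 0.8 with X2 is added to the overfitted model. Helper names are hypothetical.

```python
import random

def mean(v):
    return sum(v) / len(v)

def slope(x, y):
    """OLS slope from regressing y on x (intercept included)."""
    mx, my = mean(x), mean(y)
    return (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))

def slope_on_x2_given_x3(x2, x3, y):
    """Coefficient on x2 when y is regressed on x2 and x3 (intercept included)."""
    m2, m3, my = mean(x2), mean(x3), mean(y)
    d2 = [a - m2 for a in x2]
    d3 = [a - m3 for a in x3]
    dy = [a - my for a in y]
    s22 = sum(a * a for a in d2)
    s33 = sum(a * a for a in d3)
    s23 = sum(a * b for a, b in zip(d2, d3))
    s2y = sum(a * b for a, b in zip(d2, dy))
    s3y = sum(a * b for a, b in zip(d3, dy))
    return (s2y * s33 - s3y * s23) / (s22 * s33 - s23 ** 2)

def sample_variance(v):
    m = mean(v)
    return sum((a - m) ** 2 for a in v) / (len(v) - 1)

rng = random.Random(1)
B1, B2 = 1.0, 2.0          # assumed true parameters; X3 has no effect on Y
n, reps = 50, 500
correct, overfitted = [], []
for _ in range(reps):
    x2 = [rng.gauss(0, 1) for _ in range(n)]
    x3 = [0.8 * a + 0.6 * rng.gauss(0, 1) for a in x2]  # irrelevant but correlated
    y = [B1 + B2 * a + rng.gauss(0, 1) for a in x2]
    correct.append(slope(x2, y))
    overfitted.append(slope_on_x2_given_x3(x2, x3, y))

mean_correct, mean_over = mean(correct), mean(overfitted)
var_correct, var_over = sample_variance(correct), sample_variance(overfitted)
print("mean slope:", mean_correct, mean_over)    # both centred near B2
print("sampling variance:", var_correct, var_over)
```

Both sets of estimates centre on B2, but the overfitted slope varies more from sample to sample, which is the loss of efficiency the card describes.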
Incorrect Functional Form
Log-linear model: A2 measures the elasticity of Y with respect to X2, whereas in the regular linear model the coefficient gives the slope, or rate of change. *Follow-up: unclear which model it is suggesting we use.
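To see the difference between the two coefficients, here is a hedged sketch that assumes "log-linear" means the double-log form ln Y = A1 + A2 ln X2 + u. The data are simulated with an assumed elasticity of 0.7; the scale factor 3.0 and the helper name are also assumptions.

```python
import math
import random

def ols(x, y):
    """Return (intercept, slope) from regressing y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    b = (sum((a - mx) * (c - my) for a, c in zip(x, y))
         / sum((a - mx) ** 2 for a in x))
    return my - b * mx, b

rng = random.Random(2)
n = 1000
elasticity = 0.7                      # assumed true elasticity
x = [rng.uniform(1, 10) for _ in range(n)]
y = [3.0 * xi ** elasticity * math.exp(rng.gauss(0, 0.1)) for xi in x]

# Linear model: the coefficient is a slope (change in Y per unit change in X).
_, linear_slope = ols(x, y)

# Log-linear model: the coefficient is an elasticity (% change in Y per 1% change in X).
_, loglog_slope = ols([math.log(v) for v in x], [math.log(v) for v in y])

print("linear slope:", linear_slope)
print("log-log slope (elasticity):", loglog_slope)
```

The two numbers answer different questions, so choosing the wrong form changes what the coefficient means, not just how well the line fits.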
Errors of Measurement
Sources: nonresponse errors, computing errors, reporting errors.
Error in the dependent variable: the OLS estimators and their variances are still unbiased, but the variances are larger because u absorbs the measurement error in the dependent variable.
Error in an explanatory variable: the OLS estimators are biased and inconsistent.
Error in both: very serious.
Solutions: instrumental or proxy variables.
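The contrast between the two cases can be sketched with simulated data. This assumes classical measurement error (noise that is independent of the true values and of u), a true slope of 2, and unit variances throughout. Under those assumptions the slope estimated from a noisy X shrinks toward zero by the factor var(X) / (var(X) + var(noise)), here 0.5.

```python
import random

def slope(x, y):
    """OLS slope from regressing y on x (intercept included)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))

rng = random.Random(4)
n = 5000
B1, B2 = 1.0, 2.0                                   # assumed true parameters
x_true = [rng.gauss(0, 1) for _ in range(n)]
y_true = [B1 + B2 * a + rng.gauss(0, 1) for a in x_true]

# Case 1: measurement error in the dependent variable only.
y_noisy = [b + rng.gauss(0, 1) for b in y_true]
slope_noisy_y = slope(x_true, y_noisy)

# Case 2: measurement error in the explanatory variable only.
x_noisy = [a + rng.gauss(0, 1) for a in x_true]
slope_noisy_x = slope(x_noisy, y_true)

print("error in Y:", slope_noisy_y)   # stays close to B2
print("error in X:", slope_noisy_x)   # pulled toward B2 * 0.5
```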
Consistent
As the sample gets larger, the estimator converges (in probability) to the true parameter value.
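An estimator can be biased in small samples and still be consistent. A standard illustration, not taken from the flashcards, is the variance estimator that divides by n instead of n - 1: its expected value is (n - 1)/n times the true variance, so the bias fades as n grows. The sketch below assumes standard normal data (true variance 1), and the function names are hypothetical.

```python
import random

def variance_divide_by_n(sample):
    """Variance estimator that divides by n (biased downward in small samples)."""
    m = sum(sample) / len(sample)
    return sum((v - m) ** 2 for v in sample) / len(sample)

def average_estimate(n, reps, rng):
    """Average of the estimator over many simulated samples of size n."""
    total = 0.0
    for _ in range(reps):
        total += variance_divide_by_n([rng.gauss(0, 1) for _ in range(n)])
    return total / reps

rng = random.Random(3)
small_n = average_estimate(5, 20000, rng)    # expected value is 4/5 = 0.8
large_n = average_estimate(500, 400, rng)    # expected value is 499/500

print("n = 5  :", small_n)
print("n = 500:", large_n)
```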
Instrumental or Proxy Variables
Substitutes for a variable when it is difficult to collect accurate values. These variables are highly correlated with the X variable they stand in for, but uncorrelated with the measurement error & u.
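A minimal sketch of the instrumental-variable idea for a single regressor, using the simple IV ratio cov(Z, Y) / cov(Z, X). Everything here is an assumption made for illustration: the data-generating process, the true slope of 2, and the instrument Z, which is built to be correlated with the true X but independent of both the measurement error and u.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)

rng = random.Random(5)
n = 5000
B1, B2 = 1.0, 2.0                                    # assumed true parameters
z = [rng.gauss(0, 1) for _ in range(n)]              # hypothetical instrument
x_true = [zi + rng.gauss(0, 1) for zi in z]          # true X, correlated with Z
x_obs = [xi + rng.gauss(0, 1) for xi in x_true]      # X observed with error
y = [B1 + B2 * xi + rng.gauss(0, 1) for xi in x_true]

ols_slope = cov(x_obs, y) / cov(x_obs, x_obs)        # biased toward zero
iv_slope = cov(z, y) / cov(z, x_obs)                 # uses only variation driven by Z

print("OLS on mismeasured X:", ols_slope)
print("IV with Z           :", iv_slope)
```

In this setup OLS on the mismeasured X lands well below 2, while the IV ratio recovers a value near 2, at the cost of a larger sampling variance.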