Bias-Variance Tradeoff & Linear Models Flashcards
What is bias within the context of modeling?
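Bias is the systematic error that comes from a model being too inflexible: its simplifying assumptions, such as omitting relevant variables, keep it from capturing the true relationship, so its predictions are off on average no matter how much data it sees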
What does low bias lead to?
Low bias typically goes with a flexible model that is prone to overfitting: training error is low, but variance is high, so out-of-sample performance suffers
What does high bias lead to?
It leads to underfitting: variance is low, but error is high both in and out of sample
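A minimal simulation sketch of the two cards above. Everything here is an illustrative assumption (the sine-shaped truth, the noise level, the polynomial degrees 1 and 9, the evaluation point `x0`); it only estimates squared bias and variance of the prediction at one point by refitting on many noisy samples.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)   # fixed design, reused in every replication
x0 = 0.25                       # point where predictions are compared


def true_f(t):
    """Assumed true relationship (hypothetical, for illustration only)."""
    return np.sin(2 * np.pi * t)


def predictions_at_x0(degree, n_reps=500, noise_sd=0.3):
    """Refit a polynomial on fresh noisy samples and collect its prediction at x0."""
    preds = np.empty(n_reps)
    for r in range(n_reps):
        y = true_f(x) + rng.normal(0.0, noise_sd, size=x.size)
        preds[r] = Polynomial.fit(x, y, degree)(x0)
    return preds


results = {}
for degree in (1, 9):  # inflexible model vs flexible model
    preds = predictions_at_x0(degree)
    results[degree] = {
        "bias_sq": (preds.mean() - true_f(x0)) ** 2,  # squared systematic error
        "variance": preds.var(),                      # spread across refits
    }
    print(degree, results[degree])
```

The straight line should show the larger squared bias (underfitting) and the degree-9 polynomial the larger variance (overfitting).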
Why is the violation of the strict exogeneity assumption problematic?
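Strict exogeneity says E[error | X] = 0. A violation means there is endogeneity: the error term is correlated with the regressors. A common cause is omitted variable bias (model is mis-specified)
This means that the error term is capturing the effects of omitted variables, and because those are correlated with the included independent variables, the OLS coefficient estimates are biased and inconsistent, so inference based on them is unreliable. This is a separate problem from multicollinearity (correlation among the included regressors), which leaves OLS unbiased but causes large sampling variability and inflated standard errors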
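A small simulation sketch of omitted variable bias. The data-generating process (coefficients 2 and 3, the 0.8 link between `x1` and `x2`) and the helper `ols` are hypothetical choices made for illustration. Dropping `x2` pushes its effect into the error term, and the slope on `x1` drifts toward 2 + 3 × 0.8 = 4.4.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Assumed data-generating process: x2 is correlated with x1 and affects y.
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)


def ols(columns, y):
    """Least-squares coefficients with an intercept as the first entry."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


full = ols([x1, x2], y)   # correctly specified model
short = ols([x1], y)      # x2 omitted, so its effect lands in the error term

print("slope on x1, full model: ", full[1])
print("slope on x1, x2 omitted:", short[1])
```

More data does not fix the short model: the estimate converges, but to the wrong value.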
How can the impact of multicollinearity be measured?
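Use the Variance Inflation Factor (VIF). Mnemonic: multicollinearity Very Intensely Fucks standard errors. For regressor j, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing regressor j on all the other regressors. A common rule of thumb treats values above about 10 (some use 5) as a warning sign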
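A minimal sketch of computing VIF by hand with NumPy, using the definition VIF_j = 1 / (1 - R_j^2). The function name `vif` and the simulated columns are assumptions for illustration, not a library API.

```python
import numpy as np


def vif(X):
    """Return the VIF of each column of X (shape: n_samples x n_features)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        target = X[:, j]
        # Regress column j on an intercept plus all the other columns.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - resid.var() / target.var()
        out[j] = 1.0 / (1.0 - r2)
    return out


rng = np.random.default_rng(0)
n = 1000
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)   # nearly a copy of x1
x3 = rng.normal(size=n)              # unrelated to the others

vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs)
```

The two near-duplicate columns should get very large VIFs, while the independent column stays close to 1.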
Why do we sometimes assume the error term is normally distributed?
It is a known property from probability theory that a linear transformation of a normal random variable is also normally distributed. Conditional on X, the OLS estimator is a linear function of the errors, so if the errors are normally distributed, the Betas are exactly normally distributed too, even in small samples. That lets us conduct the usual inference (t-tests, F-tests, confidence intervals) assuming normality.
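The argument can be written out in matrix form. This is a sketch under assumptions not spelled out in the card: the standard linear model notation, a full-rank X, and homoscedastic normal errors with variance sigma^2.

```latex
\begin{aligned}
y &= X\beta + \varepsilon, \qquad \varepsilon \mid X \sim \mathcal{N}(0, \sigma^2 I) \\
\hat{\beta} &= (X^\top X)^{-1} X^\top y = \beta + (X^\top X)^{-1} X^\top \varepsilon \\
\hat{\beta} \mid X &\sim \mathcal{N}\!\left(\beta,\; \sigma^2 (X^\top X)^{-1}\right)
\end{aligned}
```

The second line shows the estimator is the true Beta plus a linear transformation of the errors, which is why normal errors give normal Betas.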