5. Feature Selection and Shrinkage Methods for Regression Flashcards
What are some downsides of MLE?
MLE estimates can be far from the true value when they are fitted on little data, because the estimates vary strongly from one small sample to the next.
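A minimal sketch of this effect, assuming a fair coin (true_p = 0.5 is a made-up value for illustration). The MLE of the heads probability is the observed fraction of heads; with only three flips it often reports 0 or 1. The function names are hypothetical.

```python
import random

def bernoulli_mle(flips):
    """MLE of the heads probability: the fraction of heads observed."""
    return sum(flips) / len(flips)

random.seed(0)
true_p = 0.5  # assumed true heads probability for this illustration

def simulate_estimates(n_flips, n_repeats=2000):
    """Draw n_repeats datasets of n_flips coin flips and return the MLE of each."""
    estimates = []
    for _ in range(n_repeats):
        flips = [1 if random.random() < true_p else 0 for _ in range(n_flips)]
        estimates.append(bernoulli_mle(flips))
    return estimates

small = simulate_estimates(n_flips=3)
large = simulate_estimates(n_flips=300)

# Share of datasets where the MLE claims the coin always lands on the same side.
extreme_small = sum(e in (0.0, 1.0) for e in small) / len(small)
extreme_large = sum(e in (0.0, 1.0) for e in large) / len(large)

# Largest distance between an estimate and the true value.
worst_small = max(abs(e - true_p) for e in small)
worst_large = max(abs(e - true_p) for e in large)

print(extreme_small, extreme_large, worst_small, worst_large)
```

With three flips, roughly a quarter of the datasets should give an estimate of exactly 0 or 1; with 300 flips this practically never happens.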
Write the formulas for bias and variance and explain what they mean.
Bias:
Does the expectation of the estimator match the population statistic?
B(θ̂) = E(θ̂) − θ
Variance:
What is the spread of estimates from different samples?
Var(θ̂) = E((θ̂ − E(θ̂))^2)
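Both quantities can be approximated by simulation: draw many samples, compute the estimator on each, then compare the average estimate with the true value (bias) and measure the spread of the estimates (variance). The sketch below does this for the MLE of a Gaussian variance, which divides by n. The population variance, sample size and number of repetitions are assumptions chosen for illustration.

```python
import random
import statistics

random.seed(1)

def mle_variance(sample):
    """MLE of the variance of a Gaussian: divides by n, not n - 1."""
    mean = sum(sample) / len(sample)
    return sum((x - mean) ** 2 for x in sample) / len(sample)

true_var = 4.0      # assumed population variance (standard deviation 2)
n = 5               # deliberately small sample size
n_repeats = 20000   # number of simulated datasets

estimates = []
for _ in range(n_repeats):
    sample = [random.gauss(0.0, true_var ** 0.5) for _ in range(n)]
    estimates.append(mle_variance(sample))

# Bias: average estimate minus the true value.
bias = statistics.fmean(estimates) - true_var
# Variance: spread of the estimates around their own average.
variance = statistics.pvariance(estimates)

# Theory for this estimator: E(estimate) = (n - 1) / n * true_var,
# so the bias is -true_var / n.
theoretical_bias = -true_var / n

print(bias, theoretical_bias, variance)
```

The simulated bias should land close to the theoretical value of -0.8: on average this estimator underestimates the variance.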
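What is the expected estimation error?
The expected squared estimation error decomposes into a squared bias term and a variance term:
E((θ̂ − θ)^2) = B(θ̂)^2 + Var(θ̂)
When the estimate is used to predict a new noisy observation y = θ + ε, a process-inherent irreducible error σ^2 = Var(ε) (aleatoric uncertainty) is added:
E((y − θ̂)^2) = B(θ̂)^2 + Var(θ̂) + σ^2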
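A worked example of the decomposition, assuming the estimator is the sample mean of n i.i.d. observations multiplied by a constant c (c = 1 is the plain sample mean, c < 1 shrinks toward zero). For this estimator the bias and variance have closed forms, so no simulation is needed. The values of theta, sigma2, n and c are hypothetical. The irreducible term sigma2 enters only when the estimate is scored against a new noisy observation rather than against theta itself.

```python
def shrunk_mean_error(c, theta, sigma2, n):
    """Bias, variance and expected squared errors of the estimator c * sample_mean.

    Assumes n i.i.d. observations with mean theta and variance sigma2.
    """
    bias = (c - 1.0) * theta                # E(c * mean) - theta
    variance = c ** 2 * sigma2 / n          # Var(c * mean)
    estimation_error = bias ** 2 + variance         # error against theta
    prediction_error = estimation_error + sigma2    # error against a new noisy observation
    return bias, variance, estimation_error, prediction_error

theta, sigma2, n = 1.0, 4.0, 4   # hypothetical values for illustration

unshrunk = shrunk_mean_error(1.0, theta, sigma2, n)   # plain sample mean
shrunk = shrunk_mean_error(0.5, theta, sigma2, n)     # mean shrunk halfway to zero

print(unshrunk)  # (0.0, 1.0, 1.0, 5.0)
print(shrunk)    # (-0.5, 0.25, 0.5, 4.5)
```

In this setting the shrunk estimator is biased but has a smaller total error, because the drop in variance outweighs the squared bias. That trade is the motivation for shrinkage methods.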
Which performs better, MLE or James-Stein?
For four or more dimensions, James-Stein performs better.
This holds only for the squared error summed over the whole estimate vector; an individual component can still be estimated worse than with MLE.
Explain the formula for the James-Stein estimator.
x̄ (x with a bar) is the mean taken over the whole observation vector x.
We take each data point and subtract the mean, then interpolate between the data point and the mean: the estimates are shrunk toward the mean.
The amount of shrinkage is N − 3 (N is the number of dimensions) divided by the squared distance between the observation vector and the mean vector, ||x − x̄||^2. The more dimensions, the more we shrink; the farther the observations lie from the mean, the less we shrink.
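The description above can be sketched in code. This is a minimal version under stated assumptions, not necessarily the exact form used in the course: one observation per dimension, a known noise variance sigma2 (set to 1 here) that scales the shrinkage term, and estimates of the form x̄ + (1 − (N − 3) σ^2 / ||x − x̄||^2) (x − x̄). The function names and the true means in theta are hypothetical.

```python
import random

def james_stein(x, sigma2=1.0):
    """Shrink each component of x toward the grand mean of x.

    Assumes one observation per dimension with known noise variance sigma2
    and at least four dimensions.
    """
    n = len(x)
    if n < 4:
        raise ValueError("shrinking toward the mean needs at least 4 dimensions")
    mean = sum(x) / n
    squared_dist = sum((xi - mean) ** 2 for xi in x)
    if squared_dist == 0.0:
        return [mean] * n  # all observations are equal, nothing to shrink
    # Shrinkage term: (N - 3) * sigma2 divided by the squared distance to the mean.
    keep = 1.0 - (n - 3) * sigma2 / squared_dist
    return [mean + keep * (xi - mean) for xi in x]

def total_squared_error(estimate, truth):
    """Squared error summed over the whole vector."""
    return sum((e - t) ** 2 for e, t in zip(estimate, truth))

random.seed(2)
theta = [0.5 * i for i in range(10)]  # hypothetical true means, 10 dimensions
n_repeats = 5000

mle_error = 0.0
js_error = 0.0
for _ in range(n_repeats):
    x = [random.gauss(t, 1.0) for t in theta]  # one noisy observation per dimension
    mle_error += total_squared_error(x, theta)               # the MLE is x itself
    js_error += total_squared_error(james_stein(x), theta)

mle_error /= n_repeats
js_error /= n_repeats

print(mle_error, js_error)
```

Averaged over many repetitions, the total squared error of the James-Stein estimate should come out below that of the MLE. As a hand check, james_stein([0, 0, 0, 0, 4]) has mean 0.8 and squared distance 12.8, so the shrinkage term is 2 / 12.8 and the last component moves from 4 to 3.5.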