Chapter 5 - Resampling methods Flashcards
What is the probability that the first bootstrap observation is not the jth observation from the original sample? Justify your answer.
1-1/n. Each of the n observations is equally likely to be drawn, so the probability of selecting the jth observation on any single draw is 1/n, and the probability of not selecting it is 1-1/n.
Argue that the probability that the jth observation is not in the bootstrap sample is (1-1/n)^n.
We pick with replacement, so each draw selects the jth observation with probability 1/n and misses it with probability 1-1/n, every time. The draws are independent and are made from the same set each time, so the n miss probabilities multiply, giving (1-1/n)^n.
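The multiplication argument can be checked empirically. A minimal Monte Carlo sketch (assumed values n = 10 and j = 0, chosen only for illustration): draw bootstrap samples of size n with replacement and count how often index 0 is absent, then compare with (1-1/n)^n.

```python
import random

n = 10
trials = 100_000
random.seed(1)

# Count bootstrap samples (size n, drawn with replacement) that miss index 0.
misses = sum(
    0 not in [random.randrange(n) for _ in range(n)]
    for _ in range(trials)
)
empirical = misses / trials
theoretical = (1 - 1 / n) ** n  # 0.9**10 ≈ 0.3487
```

With 100,000 trials the empirical frequency lands within a fraction of a percent of the theoretical value.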
What happens with the probability of the jth observation being in the bootstrap sample for n from 1 to 100,000?
The probability that the jth observation is in the bootstrap sample, 1-(1-1/n)^n, decreases rather quickly as n grows and approaches an asymptote of 1-1/e, around 63.2%.
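The convergence is easy to tabulate directly; a short sketch over a few values of n up to 100,000:

```python
import math

# Probability that the jth observation IS in the bootstrap sample,
# 1 - (1 - 1/n)^n, for increasing n; it approaches 1 - 1/e ≈ 0.632.
for n in (1, 10, 100, 10_000, 100_000):
    p_in = 1 - (1 - 1 / n) ** n
    print(f"n = {n:>7}: P(j in sample) = {p_in:.5f}")

limit = 1 - 1 / math.e  # ≈ 0.63212
```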
k-fold cross validation relative to the validation set approach
Advantages of using k-fold cross validation relative to the validation set approach:
* The split itself matters less for the k-fold approach, since we average over several splits rather than relying on a single one.
* The validation set approach risks higher variance: the result depends on the particular split and is sensitive to which observations end up in each set.
* k-fold gives a more reliable estimate of the test error.
* k-fold uses the data more efficiently, since less is set aside for validation; the larger training sets give a less biased estimate of the test error.
* The validation set approach tends to overestimate the test error, since the model is trained on only part of the data.
The disadvantage of the k-fold approach is that we must fit several models iteratively, which can be computationally expensive for large or complex models. The validation set approach is easier to implement and interpret.
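The efficiency point can be made concrete with a minimal index-splitting sketch (not any particular library's API, just an illustration): in k-fold, every observation lands in exactly one validation fold and in k-1 training sets, so far less data is "lost" than in a single validation-set split.

```python
import numpy as np

def kfold_indices(n, k, seed=0):
    """Split indices 0..n-1 into k (train, validation) pairs."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)          # shuffle once
    folds = np.array_split(perm, k)    # k roughly equal folds
    splits = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        splits.append((train, val))
    return splits

# Toy example: 20 observations, 5 folds -> each model trains on 16 points,
# versus a 50/50 validation split that would train on only 10.
splits = kfold_indices(n=20, k=5)
```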
k-fold cross validation relative to the LOOCV approach
LOOCV is a special case of k-fold cross validation where k=n.
- LOOCV has a large training set (n-1 observations) for each fit, which makes the n fitted models very similar to one another.
- There is a risk of high variance in the error estimate when using LOOCV: each fold's error comes from a single held-out observation, which might be an outlier, and the nearly identical models produce highly correlated errors.
- k-fold therefore has the advantage of lower variance in the test estimate, since its models are trained on less overlapping data and are more different from one another.
- k-fold is computationally cheaper, fitting k models instead of n.
- LOOCV uses the data and observations most efficiently, which matters most for small data sets, where LOOCV gives an almost unbiased estimate of the test error.
- A smaller k in k-fold gives a more biased estimate, because each model is trained on fewer observations.
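A minimal LOOCV sketch on an assumed toy data set (simple least-squares line, synthetic x and y chosen only for illustration) shows the k = n structure: one model fit per held-out observation, and the error estimate is the average of n single-point squared errors.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 30)
y = 2 * x + rng.normal(0, 0.1, 30)   # toy linear data with small noise

def loocv_mse(x, y):
    """LOOCV test-error estimate for a 1-D least-squares line."""
    n = len(x)
    errors = []
    for i in range(n):                # one fit per held-out point: n fits
        mask = np.arange(n) != i
        slope, intercept = np.polyfit(x[mask], y[mask], 1)
        pred = slope * x[i] + intercept
        errors.append((y[i] - pred) ** 2)
    return np.mean(errors)            # average of n single-point errors

mse = loocv_mse(x, y)
```

Since every model here is trained on 29 of the 30 points, the fits are nearly identical, which is exactly why the flashcard notes that k-fold with a smaller k trades a little extra bias for cheaper computation and less correlated fits.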