Jackknife and bootstrap Flashcards
Reason for jackknife and bootstrap
“A central element of frequentist inference is the standard error. Direct standard error formulas like (1.2) exist for various forms of averaging, such as linear regression (7.34), and for hardly anything else.
This chapter focuses on standard errors, with more adventurous bootstrap ideas deferred to Chapter 11.”
Jackknife
“Involves removing one observation at a time from the dataset, computing the statistic of interest for each reduced dataset, and combining these results.
Suitable for a wide range of statistics, including means, medians, regression coefficients, etc.
Advantages:
Simple to implement and computationally efficient for small datasets.
Does not require strong distributional assumptions: it is nonparametric, so no special form of the underlying distribution F need be assumed.
It is completely automatic: a single master algorithm can be written that inputs the data set x and the function s(x), and outputs se_jack.
Limitations:
Computationally expensive for large datasets.
May not perform well for highly biased or nonsmooth statistics.
Sensitive to influential observations.
Its principal weakness is its dependence on local derivatives: unsmooth statistics s(x), such as the kidney data lowess curve in Figure 1.2, can result in erratic behavior. Things go awry at age 25, for example, where the local derivatives greatly overstate the sensitivity of the lowess curve to global changes in the sample x.
“
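A minimal sketch of the jackknife standard error, assuming the usual leave-one-out formula se_jack = sqrt(((n-1)/n) * sum_i (theta_(i) - theta_(.))^2); the function name and example data are illustrative, not from the text.

```python
# Hedged sketch of the jackknife standard error (leave-one-out formula).
import numpy as np

def jackknife_se(x, stat):
    """Jackknife standard error of stat(x) for a one-dimensional sample x."""
    x = np.asarray(x)
    n = len(x)
    # Leave-one-out replications theta_(i): recompute stat with x_i deleted.
    theta_i = np.array([stat(np.delete(x, i)) for i in range(n)])
    theta_dot = theta_i.mean()
    return np.sqrt((n - 1) / n * np.sum((theta_i - theta_dot) ** 2))

# Sanity check: for the sample mean, the jackknife SE equals s / sqrt(n).
rng = np.random.default_rng(0)
x = rng.normal(size=50)
print(jackknife_se(x, np.mean), x.std(ddof=1) / np.sqrt(len(x)))
```

For the sample mean the jackknife reproduces the classical standard error exactly, which is a convenient check on the implementation.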
The nonparametric bootstrap
“The frequentist standard error of an estimate is, ideally, the standard deviation we would observe by repeatedly sampling new versions of x from F. This is impossible since F is unknown. The bootstrap (“ingenious device” number 4 in Section 2.1) substitutes an estimate F-hat for F and then estimates the frequentist standard error by direct simulation, a feasible tactic only since the advent of electronic computation.
Each x_i is drawn randomly with equal probability and with replacement.
Each bootstrap sample provides a bootstrap replication of the statistic of interest.
Some large number B of bootstrap samples are independently drawn (B = 500 in Figure 10.1), and the corresponding bootstrap replications are calculated.
The resulting bootstrap estimate of standard error for theta-hat is the empirical standard deviation of the bootstrap replications.
”
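A minimal sketch of the procedure just described, assuming a one-dimensional sample; B = 500 echoes Figure 10.1, while the statistic and data below are illustrative placeholders.

```python
# Hedged sketch of the nonparametric bootstrap standard error.
import numpy as np

def bootstrap_se(x, stat, B=500, seed=0):
    """Bootstrap SE: resample x with replacement B times, recompute stat."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    n = len(x)
    # Each bootstrap sample draws n observations with equal probability,
    # with replacement; each yields one bootstrap replication of stat.
    reps = np.array([stat(rng.choice(x, size=n, replace=True)) for _ in range(B)])
    # The bootstrap SE estimate is the empirical SD of the replications.
    return reps.std(ddof=1)

x = np.random.default_rng(1).exponential(size=100)   # illustrative data
print(bootstrap_se(x, np.median))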
Resampling plans
“There is a second way to think about the jackknife and the bootstrap:
as algorithms that reweight, or resample, the original data vector x.
Many other resampling schemes have been proposed (the Bayesian bootstrap is sketched after this list), e.g.,
- The Infinitesimal Jackknife
- Multisample Bootstrap
- Moving Blocks Bootstrap
- The Bayesian Bootstrap”
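As one example of a plan that reweights rather than resamples, here is a hedged sketch of the Bayesian bootstrap in its usual Rubin-style form (not necessarily the book's exact presentation): each replication draws Dirichlet(1, ..., 1) weights over the n observations and recomputes a weighted statistic.

```python
# Hedged sketch of the Bayesian bootstrap: reweight rather than resample.
import numpy as np

def bayesian_bootstrap_mean_se(x, B=1000, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Dirichlet(1, ..., 1) weights; each row of w sums to one.
    w = rng.dirichlet(np.ones(n), size=B)
    reps = w @ x                     # weighted means, one per replication
    return reps.std(ddof=1)          # spread of the replications

x = np.random.default_rng(2).normal(size=60)   # illustrative data
print(bayesian_bootstrap_mean_se(x))
```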
Parametric bootstrap
“A resampling method used to estimate the sampling distribution of a statistic by assuming a specific parametric model for the data and generating new samples based on this model.
Procedure: estimate the model parameters from the data, generate new samples from the fitted model, recompute the statistic on each, and take the empirical standard deviation of the replications as the standard error.
Parametric families act as regularizers, smoothing out the raw data and de-emphasizing outliers. In fact the student score data is not a good candidate for normal modeling, having at least one notable outlier, casting doubt on the smaller estimate of standard error”
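A generic sketch of the parametric bootstrap, assuming a bivariate normal model for two score columns in the spirit of the student score example; the model choice, function name, and data passed in are illustrative assumptions.

```python
# Hedged sketch of a parametric bootstrap SE for a correlation coefficient,
# assuming a bivariate normal model fitted to the observed pairs.
import numpy as np

def parametric_bootstrap_se_corr(x, y, B=1000, seed=0):
    rng = np.random.default_rng(seed)
    data = np.column_stack([x, y])
    n = len(data)
    # Step 1: estimate the model parameters (mean vector, covariance matrix).
    mu, cov = data.mean(axis=0), np.cov(data, rowvar=False)
    reps = np.empty(B)
    for b in range(B):
        # Step 2: generate a new sample from the fitted parametric model.
        sim = rng.multivariate_normal(mu, cov, size=n)
        # Step 3: recompute the statistic on the simulated sample.
        reps[b] = np.corrcoef(sim[:, 0], sim[:, 1])[0, 1]
    # Step 4: the SE estimate is the SD of the replications.
    return reps.std(ddof=1)
```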
Influence function
“The Influence Function measures the sensitivity of an estimator to small changes in the data.
It assesses how much a single observation affects the estimator.
Useful in robust statistics to understand how estimators respond to outliers or extreme values.
Large Influence: High sensitivity to observations at x (non-robust).
Small/Bounded Influence: Less sensitivity to extreme values, meaning the estimator is more robust.
The influence function of the (traditional sample) mean is linear, hence non-robust (outliers have a large effect).
The median has a bounded influence function, making it robust to outliers.
Robust estimators (Huber’s M-estimator, trimmed mean) are designed with bounded influence functions.
Used in Outlier Detection to assess the impact of individual points.
Helps in evaluating the robustness of models.
Applied in regression models to assess the impact of influential data points (e.g., leverage in linear regression).”
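For reference, the standard definition and the two textbook cases mentioned above, written out (standard facts rather than quotes from the text; mu denotes the population mean, m the population median, and f the density at the median):

```latex
% Standard influence-function facts: general definition, mean (unbounded,
% linear) and median (bounded).
\[
\mathrm{IF}(x;\,T,F)
  = \lim_{\varepsilon \to 0}
    \frac{T\bigl((1-\varepsilon)F + \varepsilon\,\delta_x\bigr) - T(F)}{\varepsilon},
\qquad
\mathrm{IF}_{\mathrm{mean}}(x) = x - \mu,
\qquad
\mathrm{IF}_{\mathrm{median}}(x) = \frac{\mathrm{sign}(x - m)}{2\,f(m)}.
\]
```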
Robust estimation
“Robust estimation theory seeks estimators of bounded influence that do well against heavy-tailed densities without giving up too much efficiency against light-tailed densities such as the normal. Of particular interest have been the trimmed mean and its close cousin the winsorized mean.
Trimmed mean
Definition: A measure of central tendency calculated by removing a fixed percentage of the smallest and largest values from a dataset before computing the mean. It is a robust alternative to the arithmetic mean, reducing the influence of outliers.
Winsorized mean
A measure of central tendency calculated by replacing a fixed percentage of the smallest and largest values in a dataset with the closest remaining values, then computing the mean. It reduces the influence of outliers while preserving the dataset’s size.
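A minimal numpy sketch of both estimators, with an arbitrary 10% cut fraction; the function names and example data are illustrative.

```python
# Hedged sketch: trimmed mean (drop extremes) vs. winsorized mean (clamp them).
import numpy as np

def trimmed_mean(x, prop=0.10):
    """Drop the lowest and highest prop fraction, then average the rest."""
    x = np.sort(np.asarray(x, dtype=float))
    k = int(np.floor(prop * len(x)))
    return x[k:len(x) - k].mean()

def winsorized_mean(x, prop=0.10):
    """Replace the lowest/highest prop fraction by the nearest kept values."""
    x = np.sort(np.asarray(x, dtype=float))     # np.sort returns a copy
    k = int(np.floor(prop * len(x)))
    x[:k] = x[k]                                # pull extreme low values up
    x[len(x) - k:] = x[len(x) - k - 1]          # pull extreme high values down
    return x.mean()

x = np.array([1.2, 1.5, 1.7, 2.0, 2.1, 2.3, 2.4, 2.6, 2.8, 40.0])  # one outlier
print(np.mean(x), trimmed_mean(x), winsorized_mean(x))
```

The plain mean is dragged upward by the outlier, while both robust versions stay near the bulk of the data.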
Bootstrap confidence intervals
“Chapter 10 focused on standard errors. Here we take up a more ambitious inferential goal: the bootstrap automation of confidence intervals.
The familiar standard intervals for approximate 95% coverage are immensely useful in practice but often not very accurate. For instance, under a Poisson model the standard intervals (11.1) are symmetric around theta-hat, this being their main weakness, since Poisson distributions grow more variable as theta increases.
The goal: accurate, automatic construction of asymmetric confidence intervals.”
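For concreteness, the standard interval referred to as (11.1), written in the usual textbook form for approximate 95% coverage:

```latex
% The standard interval; by construction it is symmetric about theta-hat.
\[
\hat{\theta} \pm 1.96\,\widehat{\mathrm{se}}
\]
```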
Neyman’s construction for one-parameter problems
Remember the earlier example with the student score data, where the correlation coefficient between the “mechanics” and “vectors” scores was estimated. How do we evaluate confidence intervals for that estimate?
Transformation invariance
“Confidence intervals enjoy the important and useful property of transformation invariance. Suppose interest shifts from theta to omega = log(theta); then the endpoints of a confidence interval for theta map directly to endpoints of an interval for omega.
More generally, transformation invariance is the property of a statistical procedure (e.g., estimator, test, or confidence interval) whereby its result remains consistent under transformations of the data, provided the transformation is applied appropriately to all related elements. It ensures that the procedure’s interpretation is not affected by changes in the scale, location, or other transformations of the data.”
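In symbols (a standard statement of the property, assuming a monotone increasing map m with omega = m(theta); not quoted from the text):

```latex
\[
\Pr\{\theta_{\mathrm{lo}} \le \theta \le \theta_{\mathrm{hi}}\} = 0.95
\quad\Longrightarrow\quad
\Pr\{m(\theta_{\mathrm{lo}}) \le \omega \le m(\theta_{\mathrm{hi}})\} = 0.95 .
\]
```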
The percentile method
“Our goal is to automate the calculation of confidence intervals: given the
bootstrap distribution of a statistical estimator theta-hat, we want to automatically
produce an appropriate confidence interval for the unseen parameter theta.
The percentile method uses the shape of the bootstrap distribution to improve upon the standard intervals (11.1). Having generated B bootstrap replications, we use the obvious percentiles of their distribution to define the percentile confidence limits: for 95% coverage, the 2.5th and 97.5th percentiles of the replications serve as the interval endpoints.
The histogram in Figure 11.3 has its 0.025 and 0.975 percentiles equal to 0.118 and 0.758
The percentile intervals are transformation invariant.
Some comments concerning the percentile method are pertinent.
The percentile method can be thought of as a transformation-invariant
version of the standard intervals, an “automatic Fisher” that substitutes
massive computations for mathematical ingenuity
The method requires bootstrap sample sizes on the order of B = 2000.
There are even better methods than the percentile method: BC and BCa (see below).”
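A minimal sketch of the percentile interval as described above; B = 2000 matches the rule of thumb just quoted, while the data and statistic are illustrative.

```python
# Hedged sketch of the bootstrap percentile confidence interval.
import numpy as np

def percentile_interval(x, stat, B=2000, alpha=0.05, seed=0):
    """Return the (alpha/2, 1 - alpha/2) percentiles of the bootstrap replications."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    n = len(x)
    reps = np.array([stat(rng.choice(x, size=n, replace=True)) for _ in range(B)])
    return np.quantile(reps, [alpha / 2, 1 - alpha / 2])

x = np.random.default_rng(3).exponential(size=80)   # illustrative data
lo, hi = percentile_interval(x, np.median)
print(lo, hi)
```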
The bias-corrected percentile method (BC)
“A refined bootstrap method for constructing confidence intervals that accounts for bias and skewness in the bootstrap distribution of an estimator. It adjusts the percentile-based intervals to better reflect the true variability and asymmetry of the statistic.
Particularly useful when the bootstrap distribution is not symmetric around the parameter estimate.
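In symbols, the usual BC construction (the standard formula, stated here for reference rather than quoted from the text): Ĝ is the bootstrap CDF of the replications, Φ the standard normal CDF, and z_0 the median-bias correction.

```latex
% Bias-corrected percentile limit at level alpha.
\[
\hat{\theta}_{\mathrm{BC}}[\alpha]
  = \hat{G}^{-1}\!\Bigl(\Phi\bigl(2 z_0 + z^{(\alpha)}\bigr)\Bigr),
\qquad
z_0 = \Phi^{-1}\!\bigl(\hat{G}(\hat{\theta})\bigr).
\]
```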
Bias-corrected and accelerated (BCa) confidence intervals
“A bootstrap method for constructing confidence intervals that improves on the bias-corrected (BC) method by also adjusting for the variability (acceleration) of the estimator. BCa accounts for both bias and skewness in the bootstrap distribution, making it suitable for non-symmetric or biased estimates.
Adjusts confidence interval bounds for bias (as in BC) and acceleration, which measures how the standard error of the estimate changes across the parameter space.”
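One way to obtain BCa intervals in practice is SciPy's bootstrap routine, which implements the BCa method; this is a hedged sketch assuming a reasonably recent SciPy (roughly 1.7 or later), with illustrative data and statistic.

```python
# Hedged sketch: BCa confidence interval via scipy.stats.bootstrap.
import numpy as np
from scipy.stats import bootstrap

rng = np.random.default_rng(4)
sample = rng.exponential(size=100)          # illustrative data

# data is passed as a sequence of samples; np.mean is treated as vectorized.
res = bootstrap((sample,), np.mean, confidence_level=0.95,
                n_resamples=2000, method='BCa')
print(res.confidence_interval.low, res.confidence_interval.high)
```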
Objective Bayes intervals (credible intervals)
“Objective Bayes intervals are credible intervals derived using non-informative priors in Bayesian analysis. They aim to provide a probabilistic interpretation of uncertainty that is minimally influenced by prior beliefs.
Credible Intervals: Represent the range within which the parameter of interest lies with a specified probability (e.g., 95% credible interval).
Objective Priors: Use priors like the Jeffreys prior or reference priors, which are chosen to be minimally informative and invariant under reparameterization.
Often interpreted similarly to frequentist confidence intervals but have a Bayesian foundation.
Advantages:
Avoid strong subjective assumptions by using non-informative priors.
Provide probabilistic interpretation (e.g., “The parameter lies within the interval with 95% probability”).
Limitations:
Results can still depend on the choice of the “objective” prior.
May differ from frequentist intervals, particularly in small samples.
Bayesian data analysis has the attractive property that, after examining the data, we can express our remaining uncertainty in the language of probability.”
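A hypothetical illustration, not the book's example: an objective Bayes 95% credible interval for a binomial proportion using the Jeffreys prior Beta(1/2, 1/2), whose posterior after x successes in n trials is Beta(x + 1/2, n - x + 1/2).

```python
# Hedged sketch: Jeffreys-prior credible interval for a binomial proportion.
from scipy.stats import beta

x, n = 7, 20                                   # illustrative counts
posterior = beta(x + 0.5, n - x + 0.5)         # Beta posterior under Jeffreys prior
lo, hi = posterior.ppf([0.025, 0.975])         # equal-tailed 95% credible interval
print(f"95% Jeffreys credible interval for p: ({lo:.3f}, {hi:.3f})")
```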