Topic 4: Non-parametric methods and bootstrap Flashcards
Describe the jackknife principle
We can use the jackknife to assess the uncertainty of a real-valued statistic, such as the mean.
We define the real-valued statistic as $\hat\theta = s(\mathbf{x})$, i.e., just a function of the sample $\mathbf{x}$.
The core idea is to consider how much a single element in the sample affects the estimate $\hat\theta$ (analogous to leave-one-out cross-validation).
We can write the jackknife estimate of the standard error as follows:
$$\widehat{\mathrm{se}}_{\mathrm{jack}} = \left[\frac{n-1}{n}\sum_{i=1}^{n}\left(\hat\theta_{(i)} - \hat\theta_{(\cdot)}\right)^{2}\right]^{1/2}, \qquad \hat\theta_{(\cdot)} = \frac{1}{n}\sum_{i=1}^{n}\hat\theta_{(i)},$$
where $\hat\theta_{(i)} = s(\mathbf{x}_{(i)})$ is the estimate recomputed on the sample with the $i$-th observation left out.
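A minimal sketch of this recipe in Python (assuming NumPy; the normal toy data and the choice of the mean as the statistic are illustrative, not from the notes):

```python
import numpy as np

def jackknife_se(x, s=np.mean):
    """Jackknife estimate of the standard error of the statistic s."""
    x = np.asarray(x)
    n = len(x)
    # theta_(i): the statistic recomputed with the i-th observation left out
    theta_loo = np.array([s(np.delete(x, i)) for i in range(n)])
    theta_dot = theta_loo.mean()  # average of the leave-one-out estimates
    return np.sqrt((n - 1) / n * np.sum((theta_loo - theta_dot) ** 2))

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=50)
print(jackknife_se(x))                  # jackknife SE of the mean
print(x.std(ddof=1) / np.sqrt(len(x)))  # classical SE, for comparison
```

For the mean, the jackknife standard error coincides with the classical $s/\sqrt{n}$ formula, which makes a handy sanity check.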
Features of the jackknife:
- It is nonparametric: it doesn't assume your data follow any specific distribution (normal, Poisson, etc.)
- You don't need to make choices about parameters or settings
- Hidden assumption of smooth behaviour across sample sizes: it assumes that small changes in your data lead to small changes in your results
- Upwardly biased estimate of the standard error: it tends to overestimate the uncertainty in your calculations
- Closely connected to the Taylor series method, with the difference that the directional derivatives are computed numerically: instead of analytical derivatives, it uses the data itself to estimate how the statistic changes
Describe the bootstrap principle
The bootstrap principle is a statistical method for estimating the distribution of a sample statistic by resampling with replacement from the original dataset. It is particularly useful when making inferences about population parameters or estimating the variability of an estimator when the theoretical distribution is unknown or hard to derive.
$$\widehat{\mathrm{se}}_{\mathrm{boot}} = \left[\frac{1}{B-1}\sum_{b=1}^{B}\left(\hat\theta^{*b} - \hat\theta^{*(\cdot)}\right)^{2}\right]^{1/2}, \qquad \hat\theta^{*(\cdot)} = \frac{1}{B}\sum_{b=1}^{B}\hat\theta^{*b},$$
where $\hat\theta^{*b} = s(\mathbf{x}^{*b})$ is the statistic evaluated on the $b$-th of $B$ bootstrap samples.
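A minimal sketch of this recipe (the number of replications $B$, the median as the statistic, and the exponential toy data are all assumptions for illustration):

```python
import numpy as np

def bootstrap_se(x, s=np.median, B=2000, seed=0):
    """Bootstrap estimate of the standard error of the statistic s."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    # theta*_b: the statistic on a sample drawn with replacement from x
    theta_star = np.array([s(rng.choice(x, size=len(x), replace=True))
                           for _ in range(B)])
    return theta_star.std(ddof=1)  # sd of the bootstrap replications

rng = np.random.default_rng(1)
x = rng.exponential(scale=3.0, size=100)
print(bootstrap_se(x))  # SE of the median, hard to derive analytically
```

The median is a natural example here: its standard error is awkward to derive analytically, which is exactly the situation where the bootstrap shines.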
Describe empirical bootstrapping
We have no knowledge of the distribution $F$ from which the random sample is drawn.
We know that the Empirical Distribution Function (EDF) is a good approximation of the true distribution function.
We use the EDF to define $\hat F$. In practice this means that we sample uniformly at random, WITH REPLACEMENT, from the dataset.
The original estimate is obtained in two steps:
$$F \xrightarrow{\text{iid}} \mathbf{x} = (x_1,\dots,x_n) \xrightarrow{\ s\ } \hat\theta,$$
and the bootstrap repeats the same two steps with $\hat F$ in place of the unknown $F$:
$$\hat F \xrightarrow{\text{iid}} \mathbf{x}^{*} \xrightarrow{\ s\ } \hat\theta^{*}.$$
Application of the empirical bootstrap: probabilities involving $\hat\theta$ that are hard to derive analytically can be approximated by creating the bootstrapped samples and evaluating the statistic on each of them, as in the sketch below.
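A small sketch of estimating such a probability (and a percentile interval) with the empirical bootstrap; the toy data, the threshold 1.0, and $B$ are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(loc=10.0, scale=4.0, size=80)  # observed sample; F is unknown
theta_hat = x.mean()                          # the original estimate

B = 5000
# sample uniformly at random, WITH replacement, from the dataset (the EDF)
theta_star = np.array([rng.choice(x, size=len(x), replace=True).mean()
                       for _ in range(B)])

# bootstrap approximation of P(|theta_hat - theta| > 1)
print(np.mean(np.abs(theta_star - theta_hat) > 1.0))
# percentile interval from the same replications
print(np.percentile(theta_star, [2.5, 97.5]))
```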
Describe parametric bootstrapping
Suppose we are willing to assume that the observed data vector $\mathbf{x}$ comes from a parametric family $\mathcal{F}$. The parametric bootstrap is a way to estimate uncertainty by simulating datasets from the model fitted to the original data's estimated parameters.
$$F_{\hat\mu} \xrightarrow{\text{iid}} \mathbf{x}^{*} \xrightarrow{\ s\ } \hat\theta^{*},$$
where $\hat\mu$ is the parameter estimate (e.g., the MLE) fitted to the observed data and $F_{\hat\mu}$ is the corresponding member of the parametric family.
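A minimal sketch, assuming a normal family purely for illustration (the toy data and $B$ are assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(loc=2.0, scale=1.5, size=60)  # observed data, assumed normal

# step 1: fit the parametric family to the data (MLE of a normal)
mu_hat, sigma_hat = x.mean(), x.std()

# step 2: simulate replicate datasets from the FITTED model, not from x
B = 2000
theta_star = np.array([rng.normal(loc=mu_hat, scale=sigma_hat, size=len(x)).mean()
                       for _ in range(B)])
print(theta_star.std(ddof=1))  # parametric-bootstrap SE of the mean
```

The only difference from the empirical bootstrap is step 2: the replicate datasets are drawn from the fitted model $F_{\hat\mu}$ rather than resampled with replacement from the data.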
Describe the influence function and its relation to robust estimation
The influence function measures the sensitivity of an estimator to small changes in the data. It assesses how much a single observation affects the estimator.
$$\mathrm{IF}(x;\,T,F) = \lim_{\epsilon \to 0} \frac{T\big((1-\epsilon)F + \epsilon\,\delta_x\big) - T(F)}{\epsilon},$$
where $\delta_x$ denotes a point mass at $x$.
The influence function for the mean shows that the influence of a single point $x$ on the mean is simply the deviation of $x$ from the mean $(x-\theta)$.
If $x$ is very far from $\theta$ (e.g., an outlier), the influence is large, indicating that the mean is sensitive to outliers (this is because $T(F) = \theta$)
$$F_{\epsilon} = (1-\epsilon)F + \epsilon\,\delta_x$$
So, when a small contamination is introduced to the data, we add weight $\epsilon$ at a point $x$. The influence function tells us how much the estimator $T$ changes when this small contamination is introduced.
- Large influence: This means it’s highly sensitive to observations at $x$, and it’s therefore non-robust (unbounded influence)
- Small influence/Bounded influence: This means it’s less sensitive to extreme values, meaning that the estimator is more robust.
The sample mean suffers from an unbounded influence function, which grows as $x$ moves farther away from $\theta$ (the sample mean).
The influence function helps in designing estimators that are less sensitive to outliers. Robust estimation theory seeks estimators $\hat\theta$ of bounded influence (that can deal with heavy-tailed densities).
The influence function is used in outlier detection to assess the impact of individual points, $x$.
It helps in evaluating the robustness of the models.
It can be applied in regression models, to assess the impact of influential data points (e.g. leverage in linear regression)
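A small numerical illustration of these two regimes, using the finite-sample sensitivity curve $(n+1)\big(s(x_1,\dots,x_n,x) - s(x_1,\dots,x_n)\big)$ as a stand-in for the influence function (the data and probe points are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=99)

def sensitivity(stat, x, point):
    """Finite-sample analogue of the influence function: how much the
    statistic moves when one extra observation is placed at `point`."""
    n = len(x)
    return (n + 1) * (stat(np.append(x, point)) - stat(x))

for point in [0.0, 5.0, 50.0]:
    print(point,
          sensitivity(np.mean, x, point),    # grows linearly: unbounded
          sensitivity(np.median, x, point))  # flattens out: bounded
```

The mean's sensitivity grows linearly with the probe point, while the median's flattens out: unbounded versus bounded influence.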
Describe robust estimation in regression
Regression is very sensitive to outliers. Using robust estimation, which produces parameter estimates less influenced by outliers, can therefore be beneficial for regression models.
The idea of robust regression is to weight the observations differently based on how well-behaved they are, as in the sketch below.
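A minimal sketch of this reweighting idea, implemented as iteratively reweighted least squares with Huber weights (the tuning constant $k = 1.345$, the MAD-based scale, and the toy data are standard choices assumed here, not taken from the notes):

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(0, 10, size=100)
y = 1.0 + 2.0 * x + rng.normal(scale=1.0, size=100)
y[:5] += 30.0  # a few gross outliers

X = np.column_stack([np.ones_like(x), x])  # design matrix with intercept

def huber_irls(X, y, k=1.345, n_iter=20):
    """Iteratively reweighted least squares with Huber weights:
    well-behaved residuals keep weight 1, outliers get down-weighted."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from plain OLS
    for _ in range(n_iter):
        r = y - X @ beta
        scale = np.median(np.abs(r)) / 0.6745 + 1e-12  # robust scale (MAD)
        u = np.abs(r) / scale
        w = np.where(u <= k, 1.0, k / u)  # Huber weight function
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta

print(np.linalg.lstsq(X, y, rcond=None)[0])  # OLS: dragged by the outliers
print(huber_irls(X, y))                      # robust fit: close to (1, 2)
```

Observations with small residuals keep full weight, while outlying observations are progressively down-weighted, which is exactly the bounded-influence behaviour the influence function motivates.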