Bootstrapping Flashcards

1
Q

what is bootstrapping?

A

any test or metric that relies on random sampling with replacement, typically to approximate the sampling distribution of a statistic

2
Q

What is the empirical distribution function?

A

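the distribution function associated with the empirical measure of a sample: a step function that rises by 1/n at each of the n data points, so its value at x is the fraction of observations less than or equal to x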
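As a minimal sketch of that definition, the empirical distribution function can be built from a sorted copy of the sample. The helper name `ecdf` and the example values are illustrative assumptions, not part of the card.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return F_hat, where F_hat(x) is the fraction of sample values <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must not be empty")

    def f_hat(x):
        # bisect_right gives the number of sorted values that are <= x
        return bisect_right(ordered, x) / n

    return f_hat


F = ecdf([3.0, 1.0, 2.0, 2.0])  # hypothetical sample
print(F(0.5), F(2.0), F(3.0))   # 0.0 0.75 1.0
```

Each observation contributes a jump of 1/n, so the repeated value 2.0 produces a jump of 2/4 at x = 2.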

3
Q

What is resampling?

A

any method for:

  • estimating the precision of sample statistics (medians, variances, percentiles) by using subsets of the data (jackknifing) or by drawing randomly with replacement (bootstrapping)
  • validating models using random subsets (bootstrapping, cross-validation)
  • exchanging labels on data points to perform significance tests (permutation tests)
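Two of the methods listed above can be sketched with the standard library alone. This is an illustrative sketch under simple assumptions (a leave-one-out jackknife and a two-sided test on the difference in means); the function names `jackknife_se` and `permutation_p_value` are hypothetical.

```python
import random
import statistics


def jackknife_se(data, stat):
    """Jackknife standard error: recompute stat with each point left out once."""
    n = len(data)
    leave_one_out = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    mean_loo = sum(leave_one_out) / n
    return ((n - 1) / n * sum((v - mean_loo) ** 2 for v in leave_one_out)) ** 0.5


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # exchanging labels = shuffling the pooled values
        diff = abs(statistics.mean(pooled[:len(a)]) - statistics.mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # add-one correction keeps the estimated p-value strictly above zero
    return (hits + 1) / (n_perm + 1)


sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3]  # hypothetical data (must be a list)
print(jackknife_se(sample, statistics.mean))
print(permutation_p_value([1, 2, 3, 4, 5], [11, 12, 13, 14, 15]))
```

For the sample mean, the jackknife standard error coincides with the familiar s / sqrt(n), which makes it a convenient sanity check.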
4
Q

intuition for bootstrap

A
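  • inference about a population from sample data (sample → population) is modelled by resampling the sample data and performing inference on it (resample → sample)
  • the sample plays the role of the ‘population’; because it is fully known, the quality of inference from the resampled data can be measured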
5
Q

what is variability?

A

aka dispersion, scatter, spread

  • the extent to which a distribution is stretched or squeezed
  • measures: variance, standard deviation, interquartile range (IQR = Q3 [75%] - Q1 [25%]), median absolute deviation
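The measures listed above can be computed with Python's `statistics` module. The following is a small sketch; the function name `dispersion_summary` is hypothetical, and the quartiles use the module's default ('exclusive') convention, so other tools may report slightly different IQR values.

```python
import statistics


def dispersion_summary(data):
    """Return common dispersion measures for a sample with at least two values."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # the three quartile cut points
    med = statistics.median(data)
    return {
        "variance": statistics.variance(data),  # sample variance (n - 1 denominator)
        "std_dev": statistics.stdev(data),
        "iqr": q3 - q1,
        # median absolute deviation: median distance from the median
        "mad": statistics.median([abs(x - med) for x in data]),
    }


summary = dispersion_summary([1, 2, 3, 4, 100])  # hypothetical data with one outlier
print(summary)
```

With the outlier 100 in the sample, the variance and standard deviation become large while the median absolute deviation stays at 1, which illustrates why the latter is considered a robust measure.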
6
Q

consistent? consistency?

A
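  • terms restricted to cases where essentially the same procedure can be applied to any number of data items
  • a consistent estimator converges (in probability) to the true parameter value as the number of data items grows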
7
Q

statistic/sample statistic

A
  • single measure of some attribute of a sample
  • calculated by applying a function (statistical algorithm) to the set of data, i.e. the values of the items in the sample

8
Q

what is point estimation?

A
  • use of sample data to calculate a single value (a ‘statistic’) which is to serve as the best guess/best estimate of an unknown population parameter
9
Q

recommendations for bootstrap

A
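  • when the theoretical distribution of the statistic of interest is unknown or complicated
  • when the sample size is insufficient for straightforward statistical inference
  • when power calculations have to be performed and only a small pilot sample is available
  • caution: first check that the underlying distribution is NOT heavy-tailed (e.g. a power law without finite variance), since the naive bootstrap can then fail to converge, for instance for the sample mean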
10
Q

how to do bootstrap (simple case)?

A
  • using a Monte Carlo algorithm: resample the data with replacement, keeping each resample the same size as the original data set; calculate the statistic of interest on the resample; repeat many times to increase the precision of the estimate
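The steps above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not a reference implementation: the names `bootstrap_replicates` and `percentile_interval`, the default of 2000 resamples, and the simple percentile interval are all assumptions made for the example.

```python
import random
import statistics


def bootstrap_replicates(data, stat, n_resamples=2000, seed=0):
    """Resample with replacement (same size as data) and recompute stat each time."""
    rng = random.Random(seed)
    n = len(data)
    return [stat(rng.choices(data, k=n)) for _ in range(n_resamples)]


def percentile_interval(replicates, level=0.95):
    """Simple percentile interval read off the sorted bootstrap replicates."""
    ordered = sorted(replicates)
    alpha = (1 - level) / 2
    lo = ordered[int(alpha * len(ordered))]
    hi = ordered[int((1 - alpha) * len(ordered)) - 1]
    return lo, hi


sample = [4.1, 5.3, 2.8, 6.0, 5.5, 3.9, 4.7, 5.1, 4.4, 6.2]  # hypothetical data
reps = bootstrap_replicates(sample, statistics.median)
print("bootstrap standard error:", statistics.stdev(reps))
print("95% percentile interval:", percentile_interval(reps))
```

The spread of the replicates estimates the standard error of the statistic; more resamples reduce the Monte Carlo noise in that estimate but cannot add information beyond what the original sample contains.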
11
Q

other bootstrap types?

A
  • Bayesian bootstrap
  • parametric bootstrap
  • wild bootstrap
  • Gaussian process regression bootstrap
  • smooth bootstrap
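To make one of these variants concrete, here is a rough sketch of a parametric bootstrap. It assumes, purely for illustration, that the data follow a normal distribution; the function name `parametric_bootstrap_normal` is hypothetical. Instead of resampling the observed values, it fits the model and draws new samples from the fitted distribution.

```python
import random
import statistics


def parametric_bootstrap_normal(data, stat, n_resamples=2000, seed=0):
    """Parametric bootstrap under an ASSUMED normal model for the data."""
    rng = random.Random(seed)
    mu = statistics.mean(data)      # fitted location
    sigma = statistics.stdev(data)  # fitted scale
    n = len(data)
    # draw synthetic samples of the original size from the fitted model
    return [
        stat([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(n_resamples)
    ]


sample = [4.1, 5.3, 2.8, 6.0, 5.5, 3.9, 4.7, 5.1, 4.4, 6.2]  # hypothetical data
reps = parametric_bootstrap_normal(sample, statistics.mean)
print("parametric bootstrap standard error:", statistics.stdev(reps))
```

If the assumed model is a poor fit, the resulting estimates inherit that error, which is the main trade-off against the non-parametric (case resampling) bootstrap.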