Week 4 - Bootstrap Flashcards

1
Q

General format for CIs

A

estimate ± quantile × se(estimate)

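If we know the quantiles of the estimator's sampling distribution, we can calculate a CI
The quantiles come from the CLT (normal approximation) or from a known sampling distribution

However, the bootstrap is an alternative method when we can’t rely on the CLT and the sampling distribution is unknown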
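
The formula above can be sketched for the simplest case, a CI for a mean using the CLT. This is a minimal illustration, not a course-provided implementation: the function name `normal_ci` and the sample values are made up for the example, and the normal quantile is an assumption that only holds when the CLT applies.

```python
import math
import statistics

def normal_ci(data, level=0.95):
    """CLT-based CI for the mean: estimate ± quantile × se(estimate)."""
    n = len(data)
    estimate = statistics.mean(data)
    # Standard error of the sample mean: s / sqrt(n)
    se = statistics.stdev(data) / math.sqrt(n)
    # Upper quantile of the standard normal (about 1.96 for a 95% CI)
    quantile = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - quantile * se, estimate + quantile * se

# Hypothetical data, used only to exercise the function
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4]
low, high = normal_ci(sample)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```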

2
Q

Bootstrap

A
  • Create new datasets (bootstrap samples) by randomly picking data points from the original dataset, allowing the same point to be picked more than once (with replacement)
  • Each new dataset has the same size as the original one
  • Compute the estimate on each bootstrap sample and repeat this process many times to get an approximate sampling distribution
  • We can use bootstrapping to understand the uncertainty of an estimate and calculate a CI
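
The steps above can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the helper names `bootstrap_estimates` and `percentile_ci` are hypothetical, the data are made up, and the percentile method shown is one common way to turn bootstrap estimates into a CI, not necessarily the one used in the course.

```python
import random
import statistics

def bootstrap_estimates(data, estimator, n_boot=2000, seed=0):
    """Recompute the estimator on n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(data)  # each bootstrap sample has the same size as the original
    return [estimator(rng.choices(data, k=n)) for _ in range(n_boot)]

def percentile_ci(estimates, level=0.95):
    """Percentile CI: take the middle `level` share of the bootstrap estimates."""
    ordered = sorted(estimates)
    m = len(ordered)
    lo_idx = int(m * (1 - level) / 2)
    hi_idx = min(m - 1, int(m * (1 + level) / 2))
    return ordered[lo_idx], ordered[hi_idx]

# Hypothetical data, used only to exercise the functions
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4]
estimates = bootstrap_estimates(sample, statistics.mean)
boot_se = statistics.stdev(estimates)  # bootstrap standard error of the mean
low, high = percentile_ci(estimates)
print(f"bootstrap SE: {boot_se:.3f}, 95% CI: ({low:.2f}, {high:.2f})")
```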
3
Q

If the bootstrap sample size is not the same as the original sample size

A

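Leads to unreliable estimates of uncertainty, because an estimator's variability depends on the sample size

  • smaller bootstrap samples -> larger variability than the original sample supports (CI too wide)
  • larger bootstrap samples -> smaller variability than the original sample supports (CI too tight, so it overstates precision)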
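
A small experiment makes the effect visible. This is a sketch under assumptions: the function `bootstrap_sd`, the generated data, and the resample sizes (one quarter, equal to, and four times the original size) are all chosen for illustration only.

```python
import random
import statistics

def bootstrap_sd(data, resample_size, n_boot=2000, seed=0):
    """Spread of bootstrap means when each resample has `resample_size` points."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.choices(data, k=resample_size))
        for _ in range(n_boot)
    ]
    return statistics.stdev(means)

# Hypothetical original sample of size 40
data_rng = random.Random(1)
data = [data_rng.gauss(10, 2) for _ in range(40)]

sd_small = bootstrap_sd(data, 10)    # resamples smaller than the original
sd_same = bootstrap_sd(data, 40)     # resamples the same size as the original
sd_large = bootstrap_sd(data, 160)   # resamples larger than the original

print(f"size 10: {sd_small:.3f}, size 40: {sd_same:.3f}, size 160: {sd_large:.3f}")
```

Only the same-size resamples mimic the variability of an estimate computed from 40 observations; the other two settings describe samples of a size we never actually collected.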
4
Q

Why use bootstrap?

A

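The bootstrap uses all of the data and does not assume normality, so it is more versatile when we have non-normal data

  • Useful when we have a smaller dataset and the CLT is doubtful
  • Works with non-normal distributions
  • Can be applied to nearly any point estimator, not just the mean
  • Can incorporate skewness (the CI does not have to be symmetric around the estimate)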
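
As an illustration of the "any point estimator" point, the same resampling routine can be reused with different estimators on skewed data. This is a sketch, not a definitive implementation: `bootstrap_ci` is a hypothetical helper using the percentile method, and the exponential data are generated only for the example.

```python
import random
import statistics

def bootstrap_ci(data, estimator, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap CI for an arbitrary point estimator."""
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(
        estimator(rng.choices(data, k=n)) for _ in range(n_boot)
    )
    lo_idx = int(n_boot * (1 - level) / 2)
    hi_idx = min(n_boot - 1, int(n_boot * (1 + level) / 2))
    return estimates[lo_idx], estimates[hi_idx]

# Hypothetical right-skewed sample (exponential), small enough to doubt the CLT
data_rng = random.Random(42)
skewed = [data_rng.expovariate(1.0) for _ in range(50)]

# The estimator is just a function argument, so swapping it needs no new theory
median_ci = bootstrap_ci(skewed, statistics.median)
sd_ci = bootstrap_ci(skewed, statistics.stdev)

print("median CI:", median_ci)
print("std dev CI:", sd_ci)
```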
5
Q

Difference between simulation and bootstrap

A

Simulation starts with an assumed or known model (e.g., normal distribution, Poisson process) and generates new data from it

The bootstrap instead resamples the original dataset, treating it as the only “population” available
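
The contrast can be sketched side by side. The observed values, the function names, and the choice of a normal model for the simulation are assumptions made for illustration; the only point is where the data in each replicate come from.

```python
import random
import statistics

# Hypothetical observed dataset
observed = [2.3, 3.1, 2.8, 4.0, 3.6, 2.9, 3.3, 5.2, 2.7, 3.0]

def simulate_means(mu, sigma, n, reps, rng):
    """Simulation: every replicate is drawn from an assumed model, here Normal(mu, sigma)."""
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(reps)
    ]

def bootstrap_means(data, reps, rng):
    """Bootstrap: every replicate is drawn with replacement from the observed data."""
    return [
        statistics.fmean(rng.choices(data, k=len(data)))
        for _ in range(reps)
    ]

rng = random.Random(0)
# The simulation needs model parameters; here they are plugged in from the sample
sim = simulate_means(statistics.mean(observed), statistics.stdev(observed),
                     len(observed), 1000, rng)
boot = bootstrap_means(observed, 1000, rng)

print(f"simulated SE: {statistics.stdev(sim):.3f}")
print(f"bootstrap SE: {statistics.stdev(boot):.3f}")
```

Simulated replicates can contain values never seen in the data, because they come from the model. Bootstrap replicates contain only observed values, so they carry whatever shape the sample has without a distributional assumption.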
