5.2: Confidence Intervals, Resampling, and Sampling Biases Flashcards
Definition of Confidence Interval
A range for which one can assert with a given probability 1 − α, called the degree of confidence, that it will contain the parameter it is intended to estimate.
How is a standard normal random variable denoted?
Z
What does Zα denote?
The point of the standard normal distribution such that α of the probability remains in the right tail.
What are the considerations when constructing confidence intervals for the population mean?
1) Whether the population is normally distributed,
2) The sample size,
3) Whether the population variance is known.
How do you calculate a confidence interval for the population mean when the population is normally distributed with known variance?
x̄ ± z_{α/2} · σ / √n
σ = population standard deviation, n = sample size
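As a rough illustration of the formula on this card, the sketch below computes a two-sided interval using only the Python standard library. The function name `z_confidence_interval` and the input values (sample mean 25, σ = 20, n = 100) are hypothetical and chosen only for the example.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Two-sided interval for the mean of a normal population with known sigma."""
    alpha = 1 - confidence
    # Reliability factor z_{alpha/2}: the point leaving alpha/2 in the right tail.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


# Hypothetical inputs, for illustration only.
lower, upper = z_confidence_interval(25.0, 20.0, 100)
print(round(lower, 2), round(upper, 2))  # prints 21.08 28.92
```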
Reliability Factors for Confidence Intervals Based on the Standard Normal Distribution:
90% confidence intervals: Use z0.05 = ?
95% confidence intervals: Use z0.025 = ?
99% confidence intervals: Use z0.005 = ?
z0.05 = 1.65
z0.025 = 1.96
z0.005 = 2.58
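The reliability factors can be checked from the inverse CDF of the standard normal distribution. The sketch below uses `statistics.NormalDist` from the standard library; the dictionary name `factors` is just an illustrative choice. To three decimals the values come out as 1.645, 1.960, and 2.576, which the card lists to two decimals.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

factors = {}
for confidence in (0.90, 0.95, 0.99):
    alpha = 1 - confidence
    # z_{alpha/2} leaves alpha/2 of the probability in the right tail.
    factors[confidence] = std_normal.inv_cdf(1 - alpha / 2)
    print(f"{confidence:.0%}: z = {factors[confidence]:.3f}")
# prints:
# 90%: z = 1.645
# 95%: z = 1.960
# 99%: z = 2.576
```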
Formula for Confidence Intervals for the Population Mean—The z-Alternative (Large Sample, Population Variance Unknown):
x̄ ± z_{α/2} · s / √n
s = sample standard deviation
In which cases is the t-distribution appropriate for calculating confidence intervals for the population mean?
1) When the population variance is unknown,
2) Even when the sample is large and the central limit theorem would justify a z reliability factor, as long as the variance is unknown; the t-based interval is then the more conservative choice.
In which cases can the t-distribution be used for calculating confidence intervals for the population mean?
The sample is large,
or the sample is small, but the population is normally or approximately normally distributed.
Confidence Intervals for the Population Mean (Population Variance Unknown) - t-Distribution formula?
x̄ ± t_{α/2} · s / √n, where t_{α/2} has n − 1 degrees of freedom and s is the sample standard deviation
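A minimal sketch of the t-based interval follows. The Python standard library has no t-distribution quantile function, so the critical value is passed in as a parameter; the value 2.262 is the tabulated two-sided 95% factor for 9 degrees of freedom. The data in `returns` and the names `t_confidence_interval` and `T_CRIT_95_DF9` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def t_confidence_interval(data, t_crit):
    """Two-sided interval for the mean when the population variance is unknown."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (n - 1 in the denominator)
    half_width = t_crit * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Hypothetical sample of 10 observations, so df = n - 1 = 9.
returns = [1.2, -0.4, 0.8, 2.1, -1.0, 0.5, 1.7, 0.3, -0.2, 0.9]
T_CRIT_95_DF9 = 2.262  # assumed input: t_{0.025} with 9 df, taken from a t-table

lower, upper = t_confidence_interval(returns, T_CRIT_95_DF9)
print(round(lower, 3), round(upper, 3))
```

Because the t critical value exceeds 1.96, this interval is wider than the z-based one built from the same sample.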
What is the bootstrap resampling method often referred to as?
Model-free or non-parametric resampling.
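How does the bootstrap resampling method work?
It mimics the process of random sampling from the population by treating the randomly drawn sample as if it were the population and repeatedly drawing resamples from it with replacement.
It is often used to find the standard error or to construct confidence intervals of population parameters.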
What is the bootstrap formula for estimating the standard error of the sample mean (approximating the true sampling distribution)?
s_x̄ = √[ (1 / (B − 1)) · Σ (θ̂_b − θ̄)² ], with the sum taken over b = 1, …, B
B denotes the number of resamples drawn from the original sample,
θ̂_b denotes the mean of resample b,
θ̄ denotes the mean across all the resample means.
What is the Jackknife resampling method?
A resampling method that repeatedly draws samples by taking the original observed data sample and leaving out one observation at a time (without replacement) from the set.
What is data snooping?
Data snooping is the overuse of the same or related data, typically by searching a dataset extensively until statistically significant patterns appear.
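The danger can be illustrated with a simulation: test many series that have no true effect and count how many look significant by chance. Everything in this sketch is hypothetical, including the function name `count_false_discoveries`, the 200 "strategies", the 60 observations each, and the 1.96 cutoff.

```python
import random
from math import sqrt
from statistics import mean, stdev


def count_false_discoveries(num_strategies=200, num_obs=60,
                            threshold=1.96, seed=1):
    """Count zero-mean random series whose t-statistic exceeds the threshold."""
    rng = random.Random(seed)
    significant = 0
    for _ in range(num_strategies):
        # Pure noise: the true mean is zero, so any "pattern" is spurious.
        series = [rng.gauss(0.0, 1.0) for _ in range(num_obs)]
        t_stat = mean(series) / (stdev(series) / sqrt(num_obs))
        if abs(t_stat) > threshold:
            significant += 1
    return significant


false_hits = count_false_discoveries()
print(f"{false_hits} of 200 noise series look significant")
```

With a cutoff near the 5% level, roughly one series in twenty is expected to pass by chance alone, so reporting only the "winners" from a broad search overstates the evidence.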