Confidence and hypothesis testing Flashcards
What is a confidence interval?
A plausible range of values for a population parameter.
What is a confidence level?
Every confidence interval comes with a confidence level. It is the proportion of intervals, built by the same procedure over repeated samples, that would contain the true population parameter (e.g. 95%).
Give the 95% confidence interval for a sample mean.
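In repeated sampling, 95% of intervals constructed this way will contain the true population mean:
sample mean ± z_(0.05/2) × SE, where z_(0.05/2) ≈ 1.96 and SE is the standard error of the mean.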
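As a rough illustration of the formula above, the sketch below computes a 95% interval with Python's standard library. It assumes the population standard deviation `sigma` is known, so that SE = sigma / sqrt(N); the function name and the sample values are made up for the example.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_confidence_interval(sample, sigma, confidence=0.95):
    """Return (low, high) for the population mean, assuming a known population sd."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_(alpha/2), about 1.96 for 95%
    se = sigma / sqrt(len(sample))           # standard error of the mean
    centre = mean(sample)
    return centre - z * se, centre + z * se

# Hypothetical measurements, for illustration only
sample = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7, 5.3, 5.1, 4.9]
low, high = z_confidence_interval(sample, sigma=0.2)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```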
What are accuracy and precision?
Accuracy: does the confidence interval contain the population value?
Precision: how narrow is the confidence interval?
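One way to see both ideas at once is a small simulation: draw many samples from a population whose mean we know, build an interval from each, and record how often the interval contains the true mean (accuracy) and how wide it is (precision). This is only a sketch; the population parameters, sample sizes and function names are assumptions chosen for the example, and the population standard deviation is treated as known.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def interval(sample, sigma, confidence=0.95):
    """z-based interval for the mean, assuming a known population sd."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / sqrt(len(sample))
    centre = fmean(sample)
    return centre - half_width, centre + half_width

def coverage_and_width(n, trials=2000, mu=10.0, sigma=2.0, seed=0):
    """Fraction of intervals containing mu, and the (constant) interval width."""
    rng = random.Random(seed)
    hits = 0
    width = 0.0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = interval(sample, sigma)
        hits += low <= mu <= high
        width = high - low  # depends only on sigma and n, so it is the same each trial
    return hits / trials, width

cov_small, width_small = coverage_and_width(n=10)
cov_large, width_large = coverage_and_width(n=100)
print(f"N=10:  coverage {cov_small:.3f}, width {width_small:.3f}")
print(f"N=100: coverage {cov_large:.3f}, width {width_large:.3f}")
```

Under these assumptions the coverage should stay near 95% for both sample sizes, while the larger sample gives the narrower (more precise) interval.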
Define a hypothesis test.
A procedure for deciding whether measured data provide enough evidence to support a hypothesized result.
The null hypothesis (H_0) assumes the result is false, i.e. there is nothing going on.
The alternative hypothesis (H_a) assumes the result is correct. We only accept H_a if we find enough evidence to reject the null hypothesis.
What is a p-value?
The probability that an outcome is at least as extreme as the observed outcome assuming the null hypothesis is true. Small p is more evidence against H_0.
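To make the definition concrete, here is a minimal sketch of an exact p-value for a coin-flip experiment, where H_0 is that the coin is fair. It takes "at least as extreme" to mean "no more probable under H_0 than the observed count", which is one common convention rather than the only one. The function name and the numbers are hypothetical.

```python
from math import comb

def binomial_two_sided_p(heads, flips, p_heads=0.5):
    """Sum the H_0 probabilities of every outcome no more likely than the observed one."""
    def pmf(k):
        return comb(flips, k) * p_heads**k * (1 - p_heads) ** (flips - k)

    observed = pmf(heads)
    # Small relative tolerance so that exact ties are counted as "as extreme"
    return sum(pmf(k) for k in range(flips + 1) if pmf(k) <= observed * (1 + 1e-9))

# Hypothetical experiment: 16 heads in 20 flips of a supposedly fair coin
p_value = binomial_two_sided_p(heads=16, flips=20)
print(f"p-value: {p_value:.4f}")
```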
What is a statistically significant result?
One that is unlikely to have occurred given the null hypothesis.
When would we reject the null hypothesis?
When the p-value is less than a pre-specified significance level, usually 0.1, 0.05 or 0.01, or much smaller for particle physics experiments.
What is the z-score?
The number of standard errors a statistic is away from our hypothesized population mean.
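The z-score and the rejection rule can be combined in a few lines. The sketch below assumes a known population standard deviation and a two-sided test; `z_test` and the input values are illustrative names and numbers, not taken from any particular dataset.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n):
    """Return (z, two-sided p-value) for H_0: population mean == mu0."""
    se = sigma / sqrt(n)              # standard error of the mean
    z = (sample_mean - mu0) / se      # number of standard errors from mu0
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical numbers: 36 observations with mean 5.2, tested against mu0 = 5.0
z, p = z_test(sample_mean=5.2, mu0=5.0, sigma=0.5, n=36)
print(f"z = {z:.2f}, p = {p:.4f}")
print("reject at 0.05:", p < 0.05)
print("reject at 0.01:", p < 0.01)
```

With these inputs the same data lead to rejection at the 0.05 level but not at 0.01, which is why the significance level has to be fixed before looking at the result.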
What is Student's t-distribution?
If we don't know the standard deviation of the population, we need to use an estimate instead: approximate the standard error using the sample standard deviation. The t-score is the number of approximated standard errors the sample mean is away from the hypothesized mean. Assuming the null hypothesis is true, the t-score should be t-distributed with (N-1) degrees of freedom (N being the sample size).
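A minimal sketch of the t-score calculation using only the standard library follows. Because the standard library has no t-distribution, the example compares the score with a two-sided 5% critical value for 9 degrees of freedom (2.262) copied from a standard t-table; that constant, the function name and the data are assumptions made for the illustration.

```python
from math import sqrt
from statistics import mean, stdev

def t_score(sample, mu0):
    """Return (t, degrees of freedom) for H_0: population mean == mu0."""
    n = len(sample)
    se_hat = stdev(sample) / sqrt(n)  # stdev uses the N-1 (sample) formula
    return (mean(sample) - mu0) / se_hat, n - 1

# Hypothetical measurements, for illustration only
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.5, 5.3, 5.7, 5.4, 5.5]
t, dof = t_score(sample, mu0=5.0)

T_CRIT_9DOF_TWO_SIDED_5PCT = 2.262  # from a standard t-table, valid only for 9 dof
reject = abs(t) > T_CRIT_9DOF_TWO_SIDED_5PCT
print(f"t = {t:.3f} with {dof} degrees of freedom, reject H_0: {reject}")
```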
Describe how you would perform a hypothesis test on the comparison of two means.
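Since the difference of two independent normally distributed random variables is itself normally distributed, we can perform a hypothesis test on the difference between the two sample means. The null hypothesis is that the population means are equal, so the observed difference is compared with zero using the standard error of the difference.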
What is an important assumption we make when we find the SE for the difference of two means?
The populations (and the samples drawn from them) are independent, so the variances of the two sample means add.
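The two cards above can be sketched together: under independence the squared standard errors add, and the difference of the sample means is compared with zero. The example works from summary statistics and uses a normal approximation, which is an assumption that is only reasonable for fairly large samples; the function name and numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (z, two-sided p-value) for H_0: the two population means are equal."""
    # Independence lets the variances of the two sample means add
    se_diff = sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    z = (mean_a - mean_b) / se_diff
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical summary statistics for two independent groups
z_diff, p_diff = two_sample_z(mean_a=52, sd_a=10, n_a=100,
                              mean_b=49, sd_b=12, n_b=144)
print(f"z = {z_diff:.3f}, p = {p_diff:.4f}")
```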
Describe how we would perform a hypothesis test on paired means: i.e. control vs treatment.
The difference between control and treatment for each pair can be treated as a single random variable, summarized by its mean and standard deviation. The null hypothesis should be that there is no difference, and it should be rejected if the mean difference is significantly far from zero.
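As a sketch of the paired approach, the code below reduces each control/treatment pair to one difference and computes a t-score for the mean difference against zero. The data are invented, and the critical value 2.365 (two-sided 5%, 7 degrees of freedom) is copied from a standard t-table rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical paired measurements: the same 8 subjects under each condition
control   = [72, 68, 75, 80, 66, 71, 77, 69]
treatment = [70, 65, 74, 76, 64, 70, 73, 68]

# One difference per pair becomes the single random variable under test
diffs = [c - t for c, t in zip(control, treatment)]

n_pairs = len(diffs)
se_hat = stdev(diffs) / sqrt(n_pairs)   # estimated standard error of the mean difference
t_paired = (mean(diffs) - 0) / se_hat   # H_0: mean difference is zero
dof_paired = n_pairs - 1

T_CRIT_7DOF_TWO_SIDED_5PCT = 2.365      # from a standard t-table, valid only for 7 dof
reject_paired = abs(t_paired) > T_CRIT_7DOF_TWO_SIDED_5PCT
print(f"mean difference = {mean(diffs):.2f}, t = {t_paired:.3f}, reject H_0: {reject_paired}")
```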
What are type-1 and type-2 errors?
Type 1: reject the null hypothesis despite it being true.
Type 2: retain the null hypothesis despite it being false.
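Both error rates can be estimated by simulation, since in a simulation we know whether the null hypothesis is actually true. This sketch uses a two-sided z-test with known standard deviation; the true means, sample size, seed and function names are all assumptions chosen for the illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def rejects_null(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test: True when H_0 (population mean == mu0) is rejected."""
    z = (fmean(sample) - mu0) / (sigma / sqrt(len(sample)))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return p < alpha

def rejection_rate(true_mu, trials=4000, n=25, mu0=0.0, sigma=1.0, seed=1):
    """Fraction of simulated experiments in which H_0 is rejected."""
    rng = random.Random(seed)
    rejections = sum(
        rejects_null([rng.gauss(true_mu, sigma) for _ in range(n)], mu0, sigma)
        for _ in range(trials)
    )
    return rejections / trials

type_1_rate = rejection_rate(true_mu=0.0)      # H_0 is true, so every rejection is a type 1 error
type_2_rate = 1 - rejection_rate(true_mu=0.5)  # H_0 is false, so every non-rejection is a type 2 error
print(f"type 1 rate: {type_1_rate:.3f}, type 2 rate: {type_2_rate:.3f}")
```

In this setup the type 1 rate should sit close to the chosen significance level, while the type 2 rate depends on how far the true mean is from the hypothesized one and on the sample size.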
What is the limitation of hypothesis testing and how can it be misused?
Sometimes things happen just by coincidence, so type 1 and type 2 errors can occur by chance. This can be misused by repeating the same experiment many times, changing one aspect each time, until a significant result eventually appears by chance. Reporting only that result goes against the principles of the scientific method.
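The size of this effect is easy to estimate. If k independent tests are run at significance level alpha and every null hypothesis is true, the chance of at least one false positive is 1 - (1 - alpha)^k. The sketch below computes that and checks it by simulation, assuming that p-values are uniformly distributed on [0, 1] under H_0 (true for continuous test statistics); the function names are hypothetical.

```python
import random

def family_wise_error(k, alpha=0.05):
    """Probability of at least one false positive in k independent tests under H_0."""
    return 1 - (1 - alpha) ** k

def simulated_rate(k, alpha=0.05, trials=5000, seed=2):
    """Simulate k tests per trial, drawing each p-value uniformly as under a true H_0."""
    rng = random.Random(seed)
    hits = sum(
        any(rng.random() < alpha for _ in range(k))
        for _ in range(trials)
    )
    return hits / trials

for k in (1, 5, 20):
    print(f"k={k:2d}: analytic {family_wise_error(k):.3f}, simulated {simulated_rate(k):.3f}")
```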