Hypothesis Testing Flashcards
What are the types of hypotheses?
Research hypothesis (the question being investigated)
Null hypothesis (H0): The hypothesis that is tested
Alternative hypothesis (H1 or HA): The opposite of H0
What is a confidence interval?
An interval (lower and upper limit) that is expected to contain the true value of a population parameter with a specified confidence level (e.g., 95%).
How do hypothesis tests differ from confidence intervals?
Hypothesis tests assess whether a single hypothesised value is plausible for the true parameter.
Confidence intervals estimate a range where the true parameter likely falls.
What are the steps in a hypothesis test?
- Specify the hypothesis
- Obtain a test statistic from the data
- Compare the test statistic to a reference distribution
What is the null hypothesis (H0)?
The hypothesis being tested. The test assesses how much evidence the data provide for or against this hypothesis.
Example:
H0: μ = 3 (Mean density is 3 birds/km²)
What is the alternative hypothesis (H1 or HA)?
The hypothesis that contradicts H0, suggesting a difference or effect.
Example:
H1: μ ≠ 3 (Mean density is not 3 birds/km²)
What are one-tailed and two-tailed tests?
One-tailed: Tests for a directional effect (e.g., H1: μ < 3 or H1: μ > 3)
Used when:
- an effect can only occur in one direction, or
- an effect can occur in both directions but only one direction is of interest.
What is a two-tailed test?
Two-tailed: Tests for any difference (e.g., H1: μ ≠ 3)
What do we compare the test statistic to?
A reference distribution (e.g., t-distribution) to determine if the observed difference is significant.
How does variability affect hypothesis testing?
Low variability → Easier to detect a true difference
High variability → Harder to conclude significant differences
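To make the effect of variability concrete, here is a minimal sketch that holds the observed difference and sample size fixed and changes only the spread. It assumes the standard error of a mean is SD / sqrt(n); the function name and all numbers are made up for illustration.

```python
import math

def t_statistic(estimate, hypothesised, sd, n):
    """Return (estimate - hypothesised) / SE, assuming SE = sd / sqrt(n)."""
    standard_error = sd / math.sqrt(n)
    return (estimate - hypothesised) / standard_error

# Same difference (3.5 vs 3.0) and same n; only the standard deviation changes.
t_low_var = t_statistic(estimate=3.5, hypothesised=3.0, sd=0.5, n=16)
t_high_var = t_statistic(estimate=3.5, hypothesised=3.0, sd=2.0, n=16)

print(t_low_var)   # 4.0
print(t_high_var)  # 1.0
```

With low variability the same difference produces a test statistic four times larger, which is why it is easier to detect.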
What types of statistical tests are covered?
One-sample & two-sample t-tests
ANOVA (more than two groups)
z-tests (proportions)
Chi-square tests (categorical data)
Linear regression (t and F tests)
How is a t-statistic calculated in a one-sample t-test?
tstat = (data estimate - hypothesised value) / SE(data estimate)
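The formula above can be sketched in a few lines of Python. This assumes the data estimate is the sample mean and that its standard error is the sample standard deviation divided by sqrt(n); the function name and the data values are hypothetical.

```python
import math
import statistics

def one_sample_t(sample, hypothesised_mean):
    """tstat = (sample mean - hypothesised value) / SE(sample mean)."""
    n = len(sample)
    estimate = statistics.mean(sample)
    # Assumed SE of the mean: sample standard deviation / sqrt(n).
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (estimate - hypothesised_mean) / standard_error

densities = [2.0, 3.0, 4.0, 5.0]  # made-up observations
t_stat = one_sample_t(densities, hypothesised_mean=3.0)
print(round(t_stat, 4))  # 0.7746
```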
What will tstat be if H0 is true?
The test statistic (tstat) will be small (how small depends on sampling variability) because the difference between the data estimate (sample mean) and the hypothesised value is small.
What will tstat be if H0 is false?
The test statistic will be large (how large depends on sampling variability) because the difference between the data estimate and the hypothesised value is large.
What distribution is typically used as the reference distribution in these examples?
The t-distribution is used as the reference distribution.
What do the degrees of freedom (df) for the t-distribution depend on?
The degrees of freedom depend on what is being tested, for example df = n − 1 for a one-sample t-test with n observations.
What are the two ways the reference distribution helps determine the strength of evidence for the null hypothesis?
- By obtaining an exact probability for the test statistic.
- By comparing the test statistic to a critical value based on a predetermined significance level.
In a one-sample two-tailed test, what is the null hypothesis (H0) and alternative hypothesis (H1)?
H0: μ = 3.6
H1: μ ≠ 3.6
What is the reference distribution for the test with n = 16?
The t-distribution with df = n − 1 = 15, written t15.
In a two-tailed test, how is the area in the two tails interpreted?
The area in the tails represents the probability of obtaining a test statistic as extreme or more extreme than the observed value.
How do you calculate the area in the two tails for the test statistic −0.753?
Add the area in the left tail (< −0.753) and the right tail (> 0.753): 0.226 + 0.226 = 0.452
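The two-tail calculation can also be done numerically. The sketch below integrates the t-distribution density with Simpson's rule, using only the standard library; the function names and the `steps` parameter are illustrative choices, not part of the original material. Because the tail area depends on the degrees of freedom, the value it gives for a particular df need not match the rounded tail areas quoted on the card.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_area(t_stat, df, steps=2000):
    """Area below -|t_stat| plus area above +|t_stat| (steps must be even)."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    # Simpson's rule on [0, |t_stat|]: weights 1, 4, 2, 4, ..., 4, 1.
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3   # area between 0 and |t_stat|
    one_tail = 0.5 - central_half  # area beyond |t_stat| on one side
    return 2 * one_tail            # the distribution is symmetric

area = two_tailed_area(-0.753, df=15)
print(area)
```

The sign of the test statistic does not matter here, since both tails are added together.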
What is a p-value in hypothesis testing?
The p-value is the probability of observing a test statistic as extreme, or more extreme, than the one observed, assuming the null hypothesis is true.
What does the p-value quantify in hypothesis testing?
The p-value quantifies the chance of observing the data (or something more extreme) if the null hypothesis (H0) is true.
When is the null hypothesis typically rejected based on the p-value?
The null hypothesis (H0) is usually rejected when the p-value is very small.
What are some common threshold values (significance levels) for p-values?
0.10 → No evidence against H0
0.05 → Weak evidence against H0
0.01 → Some evidence against H0
0.001 → Strong evidence against H0
What does a large p-value indicate?
The test statistic is likely under H0
We fail to reject H0
What does a small p-value indicate?
The test statistic is unlikely under H0
We reject H0 in favor of H1
What does it mean to "fail to reject the null hypothesis"?
It means that the p-value is large, indicating that the test statistic is likely to occur if the null hypothesis is true. Therefore, there is no strong evidence against H0.
What does it mean to "reject the null hypothesis"?
It means the p-value is small, indicating that the test statistic is very unlikely to occur if the null hypothesis is true. This provides evidence in favor of the alternative hypothesis (H1).
What is the relationship between a large p-value and the test statistic?
A large p-value suggests that the test statistic is likely to occur under the null hypothesis, providing no strong evidence to reject H0.
What is Fixed Level Significance Testing in hypothesis testing?
Fixed Level Significance Testing involves comparing the test statistic to a critical value based on a fixed significance level (e.g., 0.05, 0.1) to decide whether to reject the null hypothesis.
What are statistical tables used for in significance testing?
Statistical tables provide critical values (quantiles) for different reference distributions (like the t-distribution) at various significance levels.
What does the critical value represent in significance testing?
The critical value is the threshold beyond which the test statistic would be considered extreme enough to reject the null hypothesis.
In a two-tailed test with a 5% significance level, how is the significance level distributed in the tails?
2.5% in the lower tail
2.5% in the upper tail
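As a rough illustration of splitting the significance level between the tails, the sketch below looks up the two critical values. Note the swap: it assumes a standard normal reference distribution (as in a z-test) instead of the t-distribution, because Python's standard library provides normal quantiles directly. The function name is hypothetical.

```python
from statistics import NormalDist

def two_tailed_critical_values(alpha):
    """Lower and upper critical values of a standard normal for level alpha."""
    z = NormalDist()                  # standard normal: mean 0, sd 1
    lower = z.inv_cdf(alpha / 2)      # alpha/2 of the area lies below this
    upper = z.inv_cdf(1 - alpha / 2)  # alpha/2 of the area lies above this
    return lower, upper

lower, upper = two_tailed_critical_values(0.05)
print(round(lower, 2), round(upper, 2))  # -1.96 1.96
```

A test statistic below the lower value or above the upper value would be considered extreme enough to reject H0 at the 5% level under this reference distribution.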
What is a two-sample t-test used for?
A two-sample t-test is used to compare the means of two groups to test if there is a statistically significant difference between them.
What are the null and alternative hypotheses in a two-sample t-test?
H0: μ_A − μ_B = 0 (no difference between means)
H1: μ_A − μ_B ≠ 0 (a difference exists between means)
What formula is used to calculate the test statistic in a two-sample t-test?
tstat = ((μ̂_A − μ̂_B) − 0) / SE(μ̂_A − μ̂_B)
What happens if the test statistic is less extreme than the critical value in a fixed-level test?
The null hypothesis is not rejected, indicating that the observed data does not provide strong evidence against the null hypothesis.
Why do we divide the significance level equally between two tails in a two-tailed test?
Because we are testing for a difference in either direction (greater or smaller), so the probability of an extreme result must be shared between both tails.
What is the conservative assumption made about variance in the two-sample t-test?
It is assumed that the variances are unequal, making the test more conservative.
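A minimal sketch of the two-sample statistic under the unequal-variance assumption follows. The cards do not give the standard error formula, so the Welch form SE = sqrt(s_A²/n_A + s_B²/n_B) used here is an assumption, as are the function name and the data.

```python
import math
import statistics

def two_sample_t(group_a, group_b):
    """tstat = ((mean_A - mean_B) - 0) / SE(mean_A - mean_B)."""
    difference = statistics.mean(group_a) - statistics.mean(group_b)
    # Assumed unequal-variance (Welch) standard error: each group
    # contributes its own sample variance divided by its own size.
    standard_error = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    return (difference - 0) / standard_error

group_a = [1.0, 2.0, 3.0]  # made-up observations
group_b = [2.0, 4.0, 6.0]
t_stat = two_sample_t(group_a, group_b)
print(round(t_stat, 4))  # -1.5492
```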
What is a paired t-test used for?
A paired t-test is used to compare two dependent groups where observations in one sample are paired with observations in the other sample (e.g., before and after treatment).
What is the formula for the paired t-test statistic?
tstat = (μ̂_d − 0) / SE(μ̂_d)
How is the standard error for the paired t-test calculated?
SE(μ̂_d) = SD_d / sqrt(n)
μ̂_d = Mean of the differences
SD_d = Standard deviation of the differences
n = Sample size (number of pairs)
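The paired calculation reduces to a one-sample test on the differences. Here is a small sketch under that reading; the function name, the direction of the difference (after minus before), and the data are illustrative assumptions.

```python
import math
import statistics

def paired_t(before, after):
    """tstat = (mean of differences - 0) / (SD of differences / sqrt(n))."""
    differences = [a - b for a, b in zip(after, before)]
    n = len(differences)                    # number of pairs
    mean_d = statistics.mean(differences)
    sd_d = statistics.stdev(differences)    # sample SD of the differences
    standard_error = sd_d / math.sqrt(n)
    return (mean_d - 0) / standard_error

before = [10.0, 12.0, 9.0, 11.0]  # made-up paired measurements
after = [12.0, 13.0, 12.0, 13.0]
t_stat = paired_t(before, after)
print(round(t_stat, 3))  # 4.899
```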
What are the key assumptions for t-tests?
Independence of data within and between groups
Normal distribution of data (assess via histograms or Shapiro-Wilk test)
t-tests are robust to non-normal data if sample sizes are similar.
What should you do if normality is unreasonable in a dataset?
Use non-parametric tests, such as the Mann-Whitney-Wilcoxon test.
What does the Mann-Whitney-Wilcoxon test compare?
It compares the ranks of two groups to test if their distributions are the same or different.
How is the Mann-Whitney U test statistic calculated?
U = W − n(n + 1)/2
W = Sum of the ranks for one group
n = Sample size of that group
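The U statistic can be sketched as below. This assumes W is the sum of one group's ranks in the pooled ranking of both groups, n is that group's size, and tied values share their average rank; the tie handling and the function names are assumptions not stated on the card.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """U = W - n(n+1)/2, with W the rank sum of group_a in the pooled data."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n = len(group_a)
    w = sum(ranks[:n])  # the first n pooled ranks belong to group_a
    return w - n * (n + 1) / 2

u_separated = mann_whitney_u([1, 2, 3], [4, 5, 6])
u_mixed = mann_whitney_u([1, 4, 5], [2, 3, 6])
print(u_separated, u_mixed)  # 0.0 4.0
```

When every value in the first group is below every value in the second, U is 0; more overlap between the groups moves U away from that extreme.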
What does a large p-value in the Mann-Whitney test indicate?
A large p-value suggests no evidence of a difference between the two groups.
How can you assess the practical significance of a result?
Present the effect size and confidence interval to provide context beyond just statistical significance.
What are the key steps in hypothesis testing?
Specify null and alternative hypotheses
Calculate a test statistic
Compare to a reference distribution
Use data to determine the strength of evidence for the null hypothesis