Khan Academy: Significance Testing (Hypothesis Testing) Flashcards
What are type I and type II errors?
Type I: Rejecting Ho while Ho is actually True
Type II: Failing to reject Ho while it’s actually False
What’s the probability of type I error?
It’s the significance level (alpha) we choose. A Type I error means rejecting Ho when Ho is actually true, and when Ho is true the probability of rejecting it equals alpha. Alpha is the cutoff we compare the P-value against; it is not the P-value itself.
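A quick simulation can make this concrete. The sketch below is only an illustration under assumed settings (a normal population with known SD, a two-sided z test, and hypothetical values for n, the number of trials, and the seed): it draws many samples from a population where Ho is true and counts how often the test rejects anyway.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=30, trials=20000, seed=0):
    """Estimate how often a two-sided z test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    mu0, sigma = 0.0, 1.0                          # assumed population; Ho: mean = mu0 is true
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        z = (x_bar - mu0) / (sigma / n ** 0.5)
        if abs(z) > z_crit:                        # a rejection here is a Type I error
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"estimated P(Type I error) = {rate:.3f}")
```

The estimated rate should land close to the chosen alpha, up to simulation noise.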
What is statistical power, and what is the relationship between power and type II error?
Power = P(rejecting Ho | Ho is false)
Which is Power = 1 - P(failing to reject Ho | Ho is false) = 1 - P(Type II error)
Which is P(not making a Type II error)
What are the ways of increasing statistical power? Which is most attainable?
- Increasing the significance level (but it also increases P(Type I error))
- Increasing the sample size, which decreases the SD of the sampling distribution (SD/√n). There is then less overlap between the sampling distribution of the mean for Ho True and the one for Ho False, so more of the area under the Ho False curve falls in the rejection region. Note that power = P(rejecting Ho | Ho False)
- Less variability in the underlying dataset
- The true parameter being far from the value stated in the null hypothesis
Sample size is the one we can control best
Basically, what he does in the video is this. Assume Ho is true: the sampling distribution of the mean is centered at the mean stated in Ho, with SD = sample SD/√n. The significance level marks off a rejection region in the tail of this curve, and the sample mean falls inside that region. Now assume Ho is false: the sampling distribution of the mean is centered at the sample mean (standing in for the true mean), with the same SD = sample SD/√n, so the mean of Ho now sits out in the tail of this second curve. To find the power, P(rejecting Ho | Ho false), we need the area under the Ho False curve that lies inside the rejection region of the Ho True curve, which is the area drawn in the video.
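The area described above can be computed directly. This is a minimal sketch, assuming an upper-tailed z test with the SD treated as known; the function name and the numbers (mu0 = 100, true mean = 103, SD = 10) are hypothetical and not taken from the video.

```python
from statistics import NormalDist

def power_upper_tail(mu0, mu_true, sd, n, alpha=0.05):
    """Power of an upper-tailed z test: P(reject Ho | true mean is mu_true)."""
    se = sd / n ** 0.5                                # SD of the sampling distribution
    # Boundary of the rejection region, found on the Ho True curve.
    x_crit = NormalDist(mu0, se).inv_cdf(1 - alpha)
    # Area of the Ho False curve that lies beyond that boundary.
    return 1 - NormalDist(mu_true, se).cdf(x_crit)

# Hypothetical numbers: a larger n shrinks the SE, so power goes up.
for n in (10, 30, 100):
    print(n, round(power_upper_tail(mu0=100, mu_true=103, sd=10, n=n), 3))
```

When the true mean equals the Ho mean, the same function returns alpha, which matches the idea that alpha is the rejection probability when Ho is true.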
Why does changing the significance level (alpha) impact the probability of type I and type II errors?
P(Type I error) equals alpha, so alpha ↑ → P(Type I error) ↑
Increasing alpha widens the rejection region, which increases the statistical power, P(NOT making a Type II error), so alpha ↑ → P(Type II error) ↓
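The trade-off can be tabulated with a small sketch. It assumes an upper-tailed z test and a hypothetical true effect of 2 standard errors above the Ho value; both the function name and that effect size are illustrative choices.

```python
from statistics import NormalDist

def type_ii_error(alpha, effect_in_se=2.0):
    """P(Type II error) for an upper-tailed z test.

    effect_in_se is a hypothetical true effect, measured in standard errors.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)      # rejection boundary under Ho
    # Probability the statistic falls short of the boundary when Ho is false.
    return NormalDist(effect_in_se, 1).cdf(z_crit)

for alpha in (0.01, 0.05, 0.10):
    beta = type_ii_error(alpha)
    print(f"alpha={alpha:.2f}  P(Type II)={beta:.3f}  power={1 - beta:.3f}")
```

Reading down the printed rows, P(Type I error) rises with alpha while P(Type II error) falls.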
Is the significance level (alpha) always 0.05?
No. Depending on the goal, sometimes one type of error is more costly than the other, so we may need to change the significance level
What are the conditions for a sample to be valid for z test about proportion? (Condition for Inference)
For us to be able to use a sample’s statistics to estimate the population parameters, and therefore the sampling distribution of the sample proportion (or mean), the sample must meet some conditions. The Normal condition is different for means vs proportions, but the Random and Independence conditions are checked the same way.
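- Random: the data come from a random sample or a randomized experiment
- Normal: we expect at least 10 successes and 10 failures, using the P of Ho (n*P >= 10 and n*(1-P) >= 10)
- Independent: observations are independent, or n is at most 10% of the population when sampling without replacement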
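As a rough sketch, the conditions for a z test about a proportion can be checked in code, assuming the usual rules of thumb (at least 10 expected successes and failures under Ho, and a sample no larger than 10% of the population). The function name and the population size below are hypothetical; the Random condition cannot be computed and has to be judged from how the data were collected.

```python
def proportion_test_conditions(n, p0, population_size, random_sample):
    """Return each condition's status for a one-sample z test about a proportion."""
    return {
        "random": random_sample,                         # judged from the study design
        "normal": n * p0 >= 10 and n * (1 - p0) >= 10,   # expected successes and failures
        "independent": n <= 0.10 * population_size,      # 10% rule
    }

# Hypothetical population size; n and p0 are example values.
checks = proportion_test_conditions(n=150, p0=0.3, population_size=1_000_000,
                                    random_sample=True)
print(checks)
```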
How can we calculate a z statistic in a [hypothesis] test about a proportion?
Note that for sample proportion problems, when assuming Ho is true, we can calculate the SD of the sampling distribution directly from the P of Ho, as √(P(1-P)/n). For the sampling distribution of the mean, we instead use the sample SD as an estimate of the population SD when assuming Ho is true.
This is because, in the sampling distribution of the sample proportion, both the mean and the SD are calculated from P, and assuming Ho is true gives us the P of Ho to plug in.
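The contrast between the two cases can be sketched as two small functions. The names and the example numbers are hypothetical; the point is only where the standard error comes from in each case.

```python
from math import sqrt

def z_for_proportion(p_hat, p0, n):
    """z statistic for a proportion: the standard error comes from the Ho value p0."""
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

def statistic_for_mean(x_bar, mu0, s, n):
    """Test statistic for a mean: the sample SD s stands in for the population SD."""
    return (x_bar - mu0) / (s / sqrt(n))

# Hypothetical inputs for illustration.
print(round(z_for_proportion(p_hat=0.56, p0=0.50, n=100), 2))
print(round(statistic_for_mean(x_bar=1.05, mu0=1.02, s=0.1, n=36), 2))
```

Note that `z_for_proportion` never needs a sample SD, while `statistic_for_mean` cannot be computed without one.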
When do we use Z or T statistics in significance/hypothesis testing?
Note that for the sampling distribution of a sample proportion, we always use Z-statistics.
For means, the video says to use T-statistics whenever you don’t have the population SD, whether the sample size is over 30 or under; if the sample size is > 30, it’s OK to use Z-statistics as an approximation.
See also: More significance testing videos / Z-statistics vs. T-statistics
If Ho: Mean = 1.02
and Ha: Mean != 1.02
do we report a one-tailed or a two-tailed p-value?
Two-tailed, because Ha is two-sided: sample means on either side of 1.02 count as evidence against Ho
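In code, the two-tailed p-value is just the one-tail area doubled. A minimal sketch, assuming a z statistic (the function name and the value z = 1.8 are hypothetical; the one-tailed branch assumes the statistic lies in the direction of Ha):

```python
from statistics import NormalDist

def p_value(z, tails=2):
    """p-value for a z statistic; tails=2 matches Ha: Mean != value."""
    one_tail = 1 - NormalDist().cdf(abs(z))   # area beyond |z| in one tail
    return tails * one_tail

z_stat = 1.8   # hypothetical test statistic
print(round(p_value(z_stat, tails=1), 4), round(p_value(z_stat, tails=2), 4))
```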
We want to test the hypothesis that more than 30% of U.S. households have internet access with a significance level of 5%. We collect a sample of 150 households, and find that 57 have access. What are Ho and Ha in this problem? What value do we choose for the mean of the sampling distribution of the sample proportion?
Ho: P <= .3
Ha: P > .3
We use .3 as the mean of the sampling distribution because it is the largest value Ho allows, and so the one closest to Ha. If the sample proportion (57/150 = .38) is far enough above .3 to reject Ho, it is even further above any smaller value, so those would be rejected too.
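The numbers in this problem can be run through the test as a sketch, assuming a one-sample z test for a proportion with an upper-tailed p-value (the helper name is hypothetical). The loop at the end illustrates why the boundary value .3 is the one to use: smaller values allowed by Ho produce even larger z statistics.

```python
from math import sqrt
from statistics import NormalDist

n, successes, alpha = 150, 57, 0.05
p_hat = successes / n                          # sample proportion, 57/150

def upper_tail_test(p0):
    """z statistic and one-tailed p-value for Ha: P > p0."""
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return z, 1 - NormalDist().cdf(z)

z, p = upper_tail_test(0.30)                   # boundary value of Ho
print(f"z = {z:.2f}, p-value = {p:.4f}, reject Ho: {p < alpha}")

# Smaller values permitted by Ho (P <= .3) give larger z, so they are rejected too.
for p0 in (0.30, 0.25, 0.20):
    print(p0, round(upper_tail_test(p0)[0], 2))
```

With these inputs the p-value falls below the 5% significance level, so Ho is rejected.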
How do we know when to use intervals vs hypothesis tests?
When we want to estimate something, we use intervals
When we want to test something we use hypothesis tests