Hypothesis Testing 2 Flashcards
What test checks for an association between two categorical variables?
Chi-square
What is the test for comparing the means for two groups?
T-test for independent samples
What are the marginals?
Row and column totals
Describe the chi-squared test
-Calculate the expected frequencies that would occur if the variables were independent
What is the chi-squared test?
The χ² value is calculated by comparing the actual
frequencies to the expected frequencies.
• The larger the discrepancy between these two, the less
probable it is that observations like this would occur were
the null hypothesis true.
• More precisely, if the null hypothesis were true, then the χ²
value would vary according to the χ² distribution.
• If the χ² value is significantly large then we reject the null
hypothesis.
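A minimal sketch of the calculation described in this card, assuming a small made-up 2 × 2 table of counts. The function names and the data are illustrative assumptions, not part of the original flashcards.

```python
def expected_frequencies(observed):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_squared_statistic(observed):
    """Sum of (observed - expected)^2 / expected over every cell of the table."""
    expected = expected_frequencies(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical 2 x 2 contingency table (illustrative counts only).
observed = [[30, 10],
            [20, 40]]
expected = expected_frequencies(observed)    # [[20.0, 20.0], [30.0, 30.0]]
statistic = chi_squared_statistic(observed)  # about 16.67
```

For a 2 × 2 table there is 1 degree of freedom, and the 5% critical value of the χ² distribution is 3.841, so a statistic of about 16.67 would lead to rejecting the null hypothesis of independence for this made-up table.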
How do you work out the degrees of freedom?
An r × c contingency table has
(r − 1) × (c − 1) degrees of freedom.
For example, a 3 × 4 table has 2 × 3 = 6 degrees of freedom.
What happens when there are low frequencies?
The statistics underlying the χ² test become inaccurate
when expected frequencies are small.
• Reasons include: inevitable differences up to 0.5 as
observed values can only be whole numbers; and that χ²
is only an approximation to the exact (but
computationally more expensive) distribution.
• The test is usually considered unreliable for a 2 × 2 table
if any cell has expected value below 5; or for a larger
table, if more than 20% of cells have expected value
below 5.
• For these cases there are more refined methods,
such as Fisher’s Exact Test.
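The rule of thumb above can be sketched as a small check on a table of expected frequencies. The function name `chi_squared_is_reliable` is a hypothetical helper introduced here for illustration; the thresholds (5 and 20%) are the ones stated in this card.

```python
def chi_squared_is_reliable(expected):
    """Apply the rule of thumb to a table (list of rows) of expected frequencies."""
    cells = [e for row in expected for e in row]
    n_small = sum(1 for e in cells if e < 5)
    is_two_by_two = len(expected) == 2 and all(len(row) == 2 for row in expected)
    if is_two_by_two:
        # 2 x 2 table: no cell may have an expected value below 5.
        return n_small == 0
    # Larger table: at most 20% of cells may have an expected value below 5.
    return n_small / len(cells) <= 0.20


# Illustrative expected-frequency tables (made-up numbers).
ok_2x2 = chi_squared_is_reliable([[20.0, 20.0], [30.0, 30.0]])
bad_2x2 = chi_squared_is_reliable([[4.0, 6.0], [6.0, 9.0]])
```

When the check fails, a method such as Fisher's Exact Test is the kind of alternative the card points to.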
What is the one-sample t-test?
Purpose: compare the mean of a sample to a
population with a known mean
-Calculate the t-statistic
-Consult the table of upper critical values for the
t-distribution to see if we can reject the null
hypothesis at the significance level of choice.
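A minimal sketch of the one-sample calculation, assuming the standard statistic t = (sample mean − population mean) / (s / √n) with n − 1 degrees of freedom. The function name and the sample values are illustrative assumptions.

```python
import math
import statistics


def one_sample_t(sample, population_mean):
    """Return the t-statistic and degrees of freedom for a one-sample t-test."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    sample_sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    standard_error = sample_sd / math.sqrt(n)
    t = (sample_mean - population_mean) / standard_error
    return t, n - 1


# Hypothetical sample and known population mean (made-up numbers).
sample = [12, 14, 15, 16, 18]
t_stat, df = one_sample_t(sample, 12)  # t is about 3.0 with 4 degrees of freedom
```

With 4 degrees of freedom the two-sided 5% critical value is 2.776, so for this made-up sample the null hypothesis would be rejected at that level.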
What are the assumptions in the one-sample t-test?
• Normality: the population distribution is
normal
• Independence: the observations in our sample
are generated independently of one another
What is the independent samples t-test?
Main idea: compare the means of two samples that
were drawn independently, in order to determine
whether the means of the corresponding
populations are the same
After calculating the t-statistic we consult the
table of critical values for the t-distribution
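A minimal sketch of the two-sample calculation, assuming the pooled-variance (Student's) form of the statistic, which relies on both populations having the same variance. The function name and the sample values are illustrative assumptions.

```python
import math
import statistics


def independent_samples_t(sample_a, sample_b):
    """Return the pooled-variance t-statistic and its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variances (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / standard_error
    return t, df


# Hypothetical samples (made-up numbers).
group_a = [12, 14, 15, 16, 18]
group_b = [9, 11, 12, 13, 15]
t_stat, df = independent_samples_t(group_a, group_b)  # t is about 2.12 with 8 degrees of freedom
```

With 8 degrees of freedom the two-sided 5% critical value is 2.306, so for these made-up samples the null hypothesis of equal population means would not be rejected at that level.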
What are the assumptions of the independent samples t-test?
Normality: the population distribution is normal
Independence: the observations in our sample are
generated independently of one another, both
within and across samples
Homogeneity of variance: the population standard
deviation is the same in both groups
What are the stages of the chi-squared test?
-State H0 and H1
-Create the contingency table
-Calculate the expected frequencies
-Compute the χ² statistic and consult the table of critical values