Categorical analysis 2 Flashcards
What should you use if you are comparing two nominal variables?
Chi-square test of association or test of independence
Tests whether two nominal-scale variables are related to each other
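A minimal Python sketch of how the test statistic for the chi-square test of association is built from a contingency table. The counts are hypothetical, the function name `chi_square_association` is made up for illustration, and no continuity correction is applied, so statistical software may report a slightly different value for 2x2 tables.

```python
def chi_square_association(table):
    """Return (statistic, degrees_of_freedom, expected) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count if the two variables were unrelated:
    # row total * column total / grand total
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # Sum of (observed - expected)^2 / expected over every cell
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Hypothetical counts: rows = group, columns = choice
observed = [[30, 10], [20, 40]]
stat, df, expected = chi_square_association(observed)
print(round(stat, 2), df)
```

The further the observed counts are from the expected counts, the larger the statistic and the stronger the evidence that the variables are related.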
What is ‘effect size’?
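A measure of how big an effect is, separate from whether it is statistically significant
It is needed because the outcome of a hypothesis test depends on the sample size (larger samples reach significance more easily)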
Why is it recommended that an independent measure of effect size be used when reporting a significant statistical effect?
Even a small treatment effect can be statistically significant if the sample is large enough
What does an effect size metric provide information about?
The size of an effect that is not influenced by factors such as sample size
Measures how ‘big’ the difference between the data and the null hypothesis predictions actually was
What does ‘Cramer’s V’ do?
Measures effect size in categorical analysis (chi-square)
What is the R command for ‘Cramer’s V’?
associationTest() prints it automatically, but you can also call cramersV() directly (both are in the lsr package)
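A Python sketch of the standard formula, V = sqrt(chi-square / (N * (k - 1))), where k is the smaller of the number of rows and columns. This is not the R cramersV() function; the helper names and counts are hypothetical, and no continuity correction is applied. It also shows why an effect size is reported: multiplying every count by 10 inflates the chi-square statistic but leaves V unchanged.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i in range(len(row_totals))
        for j in range(len(col_totals))
    )


def cramers_v(table):
    """Cramer's V: chi-square rescaled by sample size and table dimensions."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi_square_statistic(table) / (n * (k - 1)))


# Hypothetical data: same proportions, ten times the sample size
small = [[30, 10], [20, 40]]
large = [[count * 10 for count in row] for row in small]

print(chi_square_statistic(small), chi_square_statistic(large))
print(cramers_v(small), cramers_v(large))
```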
How should you roughly interpret Cramer’s V?
It ranges from 0 (no association at all) to 1 (perfect association)
Why are assumptions necessary in a test?
Necessary to allow inference
If assumptions are wrong though, you can make mistakes
Is the sampling distribution of the test statistic exactly ‘chi-square’ in chi-square tests?
No, only approximately
What assumptions do both chi-square tests (‘goodness of fit’ and ‘association’) make?
‘Large’ expected frequencies
Independence of the data
Why are ‘large’ expected frequencies an assumption of chi-square tests?
The test statistic is only approximately chi-square distributed if there are enough observations for the underlying binomial distributions to be approximately ‘normal’
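A Python sketch for checking the ‘large expected frequencies’ assumption before running a chi-square test. The cutoff of 5 is a commonly quoted rule of thumb and is an assumption here, not something stated on these cards; the function names and counts are hypothetical.

```python
def expected_counts(table):
    """Expected cell counts under the null hypothesis of no association."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def small_expected_cells(table, threshold=5.0):
    """List (row, column, expected) for every cell whose expected count is below threshold."""
    return [
        (i, j, exp)
        for i, row in enumerate(expected_counts(table))
        for j, exp in enumerate(row)
        if exp < threshold
    ]


# Hypothetical small-sample table
observed = [[3, 1], [2, 14]]
flagged = small_expected_cells(observed)
print(flagged)
```

If any cells are flagged, the chi-square approximation may be poor and an exact test is the safer choice.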
What test should you use for comparing nominal variables if frequencies are too small?
Fisher Exact Test
What is the Fisher Exact Test?
An analogue of the chi-square test of association
However, it doesn’t require large expected frequencies (works best for small frequencies)
What assumptions does the Fisher Exact Test make that the chi-square test of association doesn’t?
It assumes that row and column totals are fixed
(treated as known in advance rather than as outcomes that could have turned out differently)
How does the Fisher Exact Test work?
By calculating the exact probability of obtaining a particular contingency table (i.e. cross-tabulation)
- The p-value is calculated by summing the probabilities of all contingency tables (with the same row and column totals) that are at least as extreme as the observed one.
- The definition of “more extreme” is tricky, but basically means “more uneven”.
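A Python sketch of the procedure for a 2x2 table. It assumes one common definition of “at least as extreme”: any table with the same margins whose probability is no larger than that of the observed table. The function names and counts are hypothetical.

```python
from math import comb


def table_probability(a, b, c, d):
    """Exact (hypergeometric) probability of the table [[a, b], [c, d]] given fixed margins."""
    return comb(a + b, a) * comb(c + d, c) / comb(a + b + c + d, a + c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided p-value: sum over tables no more probable than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    p_observed = table_probability(a, b, c, d)
    p_value = 0.0
    # With the margins fixed, the top-left cell x determines the whole table.
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        p = table_probability(x, row1 - x, col1 - x, row2 - col1 + x)
        # Small tolerance so ties with the observed table are counted.
        if p <= p_observed * (1 + 1e-7):
            p_value += p
    return p_value


# Hypothetical counts with row and column totals all equal to 4
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```

Because the probabilities are computed exactly rather than approximated by a chi-square distribution, the result stays valid even when the cell counts are very small.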