L10 Categorical Data Analysis Flashcards
State the purpose of hypothesis testing with the chi-square test or Fisher’s exact test.
To test H0 that the population proportions corresponding to the random samples are equal.
OR
To test H0 that there is no association between the ‘exposure’ and ‘outcome’.
If H0 is true:
There is no difference between the proportion of “yes” outcome among those exposed and the proportion of “yes” outcome among those who were not exposed.
OR
There is no association between exposure and outcome.
When analysing nominal data of independent samples, how are the observed counts often arranged?
Construct an R x C contingency table to present the observed counts.
R = no. of rows; typically exposure
C = no. of columns; typically outcomes
State the assumptions when using Chi-square test or Fisher’s exact test.
1) The samples are random samples of their populations.
2) All the observations are independent (i.e. each subject contributes data to only one cell; one observation per subject).
Use Fisher’s exact test if the following are NOT met:
3) For a 2x2 contingency table, the expected count of each cell MUST be at least 5.
4) For larger contingency tables, the expected count of each cell MUST be at least 1 AND no more than 20% of cells may have an expected count of less than 5.
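The expected-count rules in 3) and 4) can be sketched as a small check. This is a minimal illustration, not a definitive procedure: `suggested_test` is a hypothetical helper name, the counts are made up, and expected counts are taken as (row total x column total) / grand total.

```python
def expected_counts(observed):
    """Expected counts under H0: (row total x column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def suggested_test(observed):
    """Apply the expected-count rules of thumb (hypothetical helper)."""
    expected = [e for row in expected_counts(observed) for e in row]
    is_2x2 = len(observed) == 2 and len(observed[0]) == 2
    if is_2x2:
        # 2x2 table: every expected count must be at least 5.
        rules_met = all(e >= 5 for e in expected)
        fallback = "Fisher's exact test"
    else:
        # Larger table: every expected count at least 1, and
        # no more than 20% of cells with an expected count below 5.
        share_below_5 = sum(e < 5 for e in expected) / len(expected)
        rules_met = min(expected) >= 1 and share_below_5 <= 0.20
        fallback = "Fisher-Freeman-Halton test"
    return "chi-square test" if rules_met else fallback


# Hypothetical counts, made up purely for illustration.
print(suggested_test([[10, 20], [30, 40]]))
print(suggested_test([[1, 2], [3, 4]]))
print(suggested_test([[2, 3, 1], [40, 50, 60]]))
```

The check only covers the expected-count rules; assumptions 1) and 2) depend on the study design and cannot be verified from the table itself.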
The Fisher-Freeman-Halton test is an _____ of Fisher’s exact test.
extension
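For the 2x2 case, the idea behind Fisher’s exact test can be sketched as follows. This is a minimal illustration with made-up counts: `fisher_exact_2x2` is a hypothetical helper, and the two-sided p-value uses one common convention (summing the probabilities of all tables, with the same margins, that are no more likely than the observed table), which the cards do not specify.

```python
from math import comb


def fisher_exact_2x2(table):
    """Two-sided Fisher's exact p-value for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    # With all margins fixed, the top-left count x determines the whole table.
    x_min, x_max = max(0, col1 - row2), min(row1, col1)
    # Unnormalised hypergeometric weight of each possible table (exact integers).
    weights = {x: comb(row1, x) * comb(row2, col1 - x)
               for x in range(x_min, x_max + 1)}
    total = sum(weights.values())  # equals comb(row1 + row2, col1)
    # Add up the tables that are no more likely than the observed one.
    extreme = sum(w for w in weights.values() if w <= weights[a])
    return extreme / total


# Hypothetical counts, made up purely for illustration.
p_value = fisher_exact_2x2([[3, 1], [1, 3]])
print(round(p_value, 4))
```

Working with integer weights avoids floating-point ties when comparing table probabilities. Statistical packages may use a different two-sided convention, so their p-values can differ slightly.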
State how the expected count of each cell of the contingency table is calculated. Explain its significance.
Expected counts are the values that would be expected if H0 is true.
Expected count for a given cell
= (row total x column total) / grand total
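The formula above can be applied cell by cell. A minimal sketch, assuming the table is stored as a list of rows; the counts are made up for illustration and `expected_counts` is a hypothetical helper name.

```python
def expected_counts(observed):
    """Return the expected count of every cell of an R x C table under H0."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected count = (row total x column total) / grand total
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical 2x2 table: rows = exposure (yes/no), columns = outcome (yes/no).
observed = [[10, 20],
            [30, 40]]
expected = expected_counts(observed)
print(expected)  # [[12.0, 18.0], [28.0, 42.0]]
```

Here the row totals are 30 and 70, the column totals are 40 and 60, and the grand total is 100, so the top-left expected count is 30 x 40 / 100 = 12.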
The value of the chi-square test statistic changes if the order of the rows or columns of an R x C contingency table is changed. True or false?
False
The value of the chi-square test statistic does not change if the rows and columns of an R x C contingency table are interchanged (i.e. the table is transposed). True or false?
True
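Both answers can be checked numerically. The sketch below assumes the usual Pearson form of the statistic, the sum over all cells of (observed - expected)^2 / expected, which the cards do not spell out; the counts are made up for illustration.

```python
import math


def chi_square_statistic(observed):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all cells."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e
    return statistic


# Hypothetical 2x3 table, made up purely for illustration.
table = [[10, 20, 30],
         [25, 15, 5]]
rows_swapped = table[::-1]
cols_reordered = [[row[2], row[0], row[1]] for row in table]
transposed = [list(col) for col in zip(*table)]

original = chi_square_statistic(table)
for variant in (rows_swapped, cols_reordered, transposed):
    # Each cell keeps its own row total and column total, so its
    # contribution to the sum is unchanged.
    print(math.isclose(chi_square_statistic(variant), original))
```

`math.isclose` is used instead of `==` because the cells are summed in a different order, which can change the last few floating-point digits.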
Give an example of how to write the conclusion of the chi-square test, Fisher’s exact test or Fisher-Freeman-Halton test.
For two independent groups:
At a significance level of 0.05, there is NO statistically significant difference between the proportion of smokers quitting smoking among those who participated in the program (13.2%) and that among those who did not participate in the program (7.3%) (p = 0.056).
OR
At a significance level of 0.05, there is NO statistically significant association between participation in the program and quitting smoking (p = 0.056).
For more than two independent groups:
At a significance level of 0.05, there is an association between the amount of MSG in a meal and the occurrence of headache (p < 0.0005). In addition, the proportion of individuals having headaches appears to increase as the amount of MSG in a meal increases.
OR
At a significance level of 0.05, not all the proportions of individuals having headache among those who took a meal with high, medium or low amounts of MSG are the same (p < 0.0005). In addition, the proportion of individuals having headaches appears to increase as the amount of MSG in a meal increases.
State the assumptions when using McNemar’s test.
1) The samples are random samples of their populations.
2) Each observation in the first sample has a corresponding observation in the second sample (i.e. paired samples).
When analysing nominal data of paired samples, how are the observed counts often arranged?
Construct a 2 x 2 contingency table (2 rows x 2 columns) to present the observed counts in a way that takes into account the paired nature of the data.
R = no. of rows; typically exposure
C = no. of columns; typically outcomes
Unit of analysis is matched pairs, rather than individual subjects!
Define ‘concordant pairs’ and ‘discordant pairs’.
Concordant pairs are where the outcome is the same for each member of the pair.
- Ignored & NOT used in McNemar’s test since concordant pairs do NOT provide information about differences in outcomes resulting from exposure.
Discordant pairs are where outcomes differ for the members of the pair.
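The two kinds of pairs can be counted directly from matched data. A minimal sketch with made-up pairs; the labels "yes"/"no" and the names `n1` and `n2` for the two kinds of discordant pairs are chosen here for illustration.

```python
from collections import Counter

# Hypothetical matched pairs, made up purely for illustration:
# (outcome for first member, outcome for second member).
pairs = [("yes", "yes"), ("yes", "no"), ("no", "yes"),
         ("no", "no"), ("yes", "no"), ("yes", "yes")]

cells = Counter(pairs)  # the four cells of the paired 2x2 table

# Concordant pairs: both members have the same outcome.
concordant = cells[("yes", "yes")] + cells[("no", "no")]

# Discordant pairs: the members have different outcomes.
n1 = cells[("yes", "no")]
n2 = cells[("no", "yes")]
discordant = n1 + n2

print(concordant, n1, n2, discordant)  # 3 2 1 3
```

Only `n1` and `n2` feed into McNemar’s test; the concordant count is reported in the table but does not affect the result.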
State the purpose of hypothesis testing with McNemar’s test.
To test H0 that the population proportions of the ‘outcome’ corresponding to the paired random samples are equal.
OR
To test H0 that there is no association between the ‘exposure’ and ‘outcome’.
Note that there are three ways to formulate H0 & H1:
1) Based on association:
H0: There is no association between exposure and outcome.
H1: There is an association between exposure and outcome.
2) Based on proportions:
H0: There is no difference between the proportion of “yes” outcome among those exposed to A and the proportion of “yes” outcome among those who were exposed to B.
H1: There is a difference between the proportion of “yes” outcome among those exposed to A and the proportion of “yes” outcome among those who were exposed to B.
3) Based on no. of discordant pairs:
H0: There is no difference between the number of pairs in which reaction to Test A is positive and the matched reaction to Test B is negative (n1), and the number of pairs in which reaction to Test A is negative and the matched reaction to Test B is positive (n2).
H1: The number of pairs in which reaction to Test A is positive and the matched reaction to Test B is negative (n1) is different from the number of pairs in which reaction to Test A is negative and the matched reaction to Test B is positive (n2).
- For McNemar’s test to be used, n1 + n2 should be at least 20; otherwise, use an exact test based on the binomial distribution, which is suitable for smaller sample sizes.
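The discordant-pair formulation can be sketched in code. This is a minimal illustration under stated assumptions, not a definitive implementation: `mcnemar_p_value` is a hypothetical helper, the statistic is the usual (n1 - n2)^2 / (n1 + n2) compared with a chi-square distribution with 1 degree of freedom (no continuity correction), and for n1 + n2 below 20 it falls back to an exact two-sided binomial test, which is one common choice for small samples that the cards do not name.

```python
from math import comb, erfc, sqrt


def mcnemar_p_value(n1, n2, min_discordant=20):
    """Two-sided p-value for H0: the two kinds of discordant pairs are equally likely."""
    n = n1 + n2
    if n == 0:
        raise ValueError("no discordant pairs: the test cannot be carried out")
    if n >= min_discordant:
        # Large-sample version: chi-square statistic with 1 degree of freedom.
        statistic = (n1 - n2) ** 2 / n
        # Upper tail of chi-square(1): P(Z^2 > x) = erfc(sqrt(x / 2)).
        return erfc(sqrt(statistic / 2))
    # Small-sample version: under H0, n1 follows Binomial(n, 0.5).
    tail = sum(comb(n, k) for k in range(min(n1, n2) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant-pair counts, made up purely for illustration.
print(mcnemar_p_value(15, 9))  # n1 + n2 = 24, chi-square approximation
print(mcnemar_p_value(3, 1))   # n1 + n2 = 4, exact binomial test
```

Some software applies a continuity correction to the chi-square version, so p-values reported elsewhere may be somewhat larger than the uncorrected ones computed here.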
Give an example of how to write the conclusion of McNemar’s test.
1) Based on association:
At a significance level of 0.05, there is no statistically significant association between the reaction observed and the test used (p = 0.362).
2) Based on proportions:
At a significance level of 0.05, there is no significant difference between the proportion of persons with positive reaction to Test A (64.0%) and that for Test B (58.0%) (p = 0.362).
3) Based on no. of discordant pairs:
At a significance level of 0.05, there is no significant difference between the number of pairs in which reaction to Test A is positive and the matched reaction to Test B is negative (n1), and the number of pairs in which reaction to Test A is negative and the matched reaction to Test B is positive (n2) (p = 0.362).