Statistics: Analyzing the association between categorical variables Flashcards
What name is given to the percentages in a particular row?
The conditional percentages
What guidelines should be followed when constructing tables with conditional distributions?
- Make the response variable the column variable
- Compute conditional proportions for the response variable within each row by dividing each cell frequency by the row total.
- Include the total sample sizes on which the percentages are based.
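The row-wise calculation described above can be sketched in a few lines of Python. The table, its group labels, and the function name `conditional_proportions` are made up for illustration; only the arithmetic (cell count divided by row total, reported with the row's sample size) comes from the guidelines.

```python
# Hypothetical counts: each row is a group of the explanatory variable,
# each column a category of the response variable.
table = {
    "group A": [30, 50, 20],
    "group B": [10, 40, 50],
}

def conditional_proportions(counts):
    """Return the row's conditional proportions and the row total."""
    total = sum(counts)
    return [count / total for count in counts], total

for label, counts in table.items():
    proportions, n = conditional_proportions(counts)
    percentages = ", ".join(f"{100 * p:.1f}%" for p in proportions)
    # Report the sample size alongside the percentages, as the guidelines ask.
    print(f"{label}: {percentages} (n = {n})")
```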
What is meant if two categorical variables are said to be independent or dependent of each other?
Two variables are independent if the population conditional distributions for one of them are identical at each category of the other. They are dependent if the conditional distributions are not identical.
What are the main properties of the chi-squared distribution?
- Takes only nonnegative values
- Its shape depends on the degrees of freedom; for a table with r rows and c columns, df = (r-1) × (c-1)
- Mean = df
- As df increases, the distribution becomes more bell shaped
- A large chi-squared statistic is evidence against independence
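As a rough sketch of how the statistic and its degrees of freedom are obtained, the code below sums (observed - expected)² / expected over all cells, taking each expected count under independence as row total × column total / n. The table and the function name `chi_squared` are hypothetical, and no P-value is computed because that would need a chi-squared distribution function from outside the standard library.

```python
def chi_squared(table):
    """Return the chi-squared statistic and df for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical 2 x 3 table of counts.
table = [[30, 50, 20], [10, 40, 50]]
statistic, df = chi_squared(table)
print(f"chi-squared = {statistic:.2f}, df = {df}")
```

Because the mean of the distribution equals df, a statistic far above df is the kind of large value that counts as evidence against independence.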
What are the limitations of a chi squared test?
The test statistic and P-value tell us nothing about the nature or strength of the association. A small P-value means strong evidence that an association exists, not that the association is strong.
Name some common misuses of the chi-squared test.
- When some of the expected frequencies are too small
- When separate rows or columns are dependent samples
When does the df for a chi-squared test change?
When a hypothesis predicts a population proportion value for each category of a single variable that has c categories (a goodness-of-fit test), the chi-squared statistic has df = c - 1.
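A minimal sketch of this one-variable case follows, assuming made-up counts and equal hypothesized proportions; the function name `goodness_of_fit` is illustrative. Expected counts are n times each hypothesized proportion, and df is the number of categories minus one.

```python
def goodness_of_fit(observed, hypothesized_proportions):
    """Return the chi-squared statistic and df = c - 1 for one variable."""
    n = sum(observed)
    statistic = 0.0
    for count, proportion in zip(observed, hypothesized_proportions):
        expected = n * proportion
        statistic += (count - expected) ** 2 / expected
    df = len(observed) - 1
    return statistic, df

# Hypothetical data: 60 observations over c = 3 categories,
# with the null hypothesis that each category has proportion 1/3.
statistic, df = goodness_of_fit([18, 22, 20], [1 / 3, 1 / 3, 1 / 3])
print(f"chi-squared = {statistic:.2f}, df = {df}")
```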
What is meant by the measure of association?
A statistic or parameter that summarizes the strength of the dependence between two variables
What is meant by a residual?
The difference between the observed count and the expected count in a particular cell.
How do you get a standardized residual?
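(observed count - expected count) / se, where se is the standard error of that difference when the null hypothesis of independence is true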
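The card does not say how se is computed. One common textbook choice, assumed here, is se = sqrt(expected × (1 - row proportion) × (1 - column proportion)). The sketch below applies that formula to a hypothetical table; the function name `standardized_residuals` is illustrative.

```python
from math import sqrt

def standardized_residuals(table):
    """Return (observed - expected) / se for every cell of the table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    residuals = []
    for i, row in enumerate(table):
        residual_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            # Assumed se formula: not given in the flashcards themselves.
            se = sqrt(expected
                      * (1 - row_totals[i] / n)
                      * (1 - col_totals[j] / n))
            residual_row.append((observed - expected) / se)
        residuals.append(residual_row)
    return residuals

# Hypothetical 2 x 3 table of counts.
for row in standardized_residuals([[30, 50, 20], [10, 40, 50]]):
    print([round(value, 2) for value in row])
```

Under independence a standardized residual behaves roughly like a standard normal score, so values beyond about 3 in absolute size point to cells that depart noticeably from what independence predicts.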
When would a Fisher's exact test be appropriate?
When any of the expected frequencies is less than about 5, so that the large-sample chi-squared approximation is unreliable
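For a 2 × 2 table, the exact P-value can be computed directly from the hypergeometric distribution. The sketch below gives only the one-sided version (the probability of a top-left count at least as large as the one observed, with all margins fixed). The function name `fisher_exact_greater` and the example counts are illustrative assumptions.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided exact P-value for the 2 x 2 table [[a, b], [c, d]].

    Sums hypergeometric probabilities over all tables with the same
    margins whose top-left count is at least a.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    total_tables = comb(n, col1)
    largest_a = min(row1, col1)
    favourable = sum(
        comb(row1, k) * comb(row2, col1 - k)
        for k in range(a, largest_a + 1)
    )
    return favourable / total_tables

# Hypothetical small-sample table where every expected count is 2.
p_value = fisher_exact_greater(3, 1, 1, 3)
print(f"one-sided P-value = {p_value:.3f}")
```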