Chi-square Flashcards
What is chi-square?
A test of association between variables, used when the variables of interest are categorical
What are contingency tables?
Shows how data are distributed across variables
- Observed frequencies are the numbers in the cells: the number of people who fall into each combination of levels of the two variables.
Row totals and column totals should each add up to N
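As an illustration, here is a minimal sketch of building a 2x2 contingency table from raw data in plain Python. The data, the variable names (`observations`, `row_levels`, `col_levels`) and the group labels are all hypothetical.

```python
from collections import Counter

# Hypothetical raw data: one (group, outcome) pair per participant.
observations = (
    [("treatment", "improved")] * 30
    + [("treatment", "not improved")] * 10
    + [("control", "improved")] * 20
    + [("control", "not improved")] * 40
)

counts = Counter(observations)
row_levels = ["treatment", "control"]
col_levels = ["improved", "not improved"]

# Observed frequencies: one cell per combination of levels.
observed = [[counts[(r, c)] for c in col_levels] for r in row_levels]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = len(observations)

# Row totals and column totals should each add up to N.
assert sum(row_totals) == n and sum(col_totals) == n
print(observed, row_totals, col_totals, n)
```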
What kind of contingency table has 2 variables and 2 levels?
2x2 design
How can the association between variables be eyeballed?
Descriptive statistics
- Do we observe roughly the frequencies we would expect if there were no association?
Are our observed frequencies different from expected frequencies?
Compare percentages rather than the raw number of participants in each cell (what percentage of each group would we expect to fall into each category if there were no association between the variables?)
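A rough sketch of this eyeball check, using a hypothetical 2x2 table: if there were no association, the percentages within each row should look similar to the overall percentages.

```python
# Hypothetical observed frequencies (rows = groups, columns = outcomes).
observed = [[30, 10], [20, 40]]

# Percentage of each row (group) that falls into each column (outcome).
row_pcts = [[100 * cell / sum(row) for cell in row] for row in observed]

# Overall percentage in each column, ignoring group membership.
col_totals = [sum(col) for col in zip(*observed)]
n = sum(col_totals)
overall_pcts = [100 * total / n for total in col_totals]

for row in row_pcts:
    print([round(p, 1) for p in row])
print("overall:", overall_pcts)
```

In this made-up table the two rows split 75/25 and roughly 33/67 against an overall 50/50 split, which hints at an association worth testing.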
What are expected values?
Values you expect to see in each cell if no association exists between the 2 variables (null hypothesis = true)
Want to know these for each individual cell
What is the equation for expected values?
(row total x column total) / grand total
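The formula can be applied to every cell at once. This is a minimal sketch; the function name `expected_frequencies` and the example table are hypothetical.

```python
def expected_frequencies(observed):
    """Expected count for each cell: (row total x column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Hypothetical 2x2 table of observed frequencies.
observed = [[30, 10], [20, 40]]
expected = expected_frequencies(observed)
print(expected)
```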
What is the equation for chi-square?
χ² = Σ (O − E)² / E, where O = observed frequency and E = expected frequency
Calculate O − E for each cell in the contingency table and square it
Divide each result by that cell's expected value
Add up all results
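The steps can be sketched in a few lines of plain Python. The function name `chi_square` and the example table are hypothetical; the expected values are computed as (row total x column total) / grand total.

```python
def chi_square(observed):
    """Sum of (O - E)^2 / E over every cell of the contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    total = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total  # expected value
            total += (o - e) ** 2 / e  # square the difference, divide by E
    return total  # add up all results

# Hypothetical 2x2 table of observed frequencies.
statistic = chi_square([[30, 10], [20, 40]])
print(round(statistic, 2))
```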
What value should Chi-square always be?
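0 or greater (it can never be negative). 0 means the observed frequencies match the expected frequencies exactly; a value above 0 suggests an association may exist
- Check whether it is statistically significant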
How do you check statistical significance?
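You need to know the degrees of freedom (df), then compare the chi-square value with the critical value for that df
df = (R-1)(C-1)
R = number of rows
C = number of columns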
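A small sketch of the df calculation, plus a p-value for the 2x2 case only. The function names are hypothetical. The p-value shortcut is an assumption that holds only when df = 1, where chi-square is the square of a standard normal value; larger tables need a chi-square table or a statistics package. The critical value 3.841 is the standard cutoff for df = 1 at alpha = .05.

```python
import math

def degrees_of_freedom(n_rows, n_cols):
    """df = (R - 1)(C - 1)."""
    return (n_rows - 1) * (n_cols - 1)

def p_value_df1(chi2):
    """Upper-tail p-value for a chi-square statistic with df = 1 ONLY."""
    return math.erfc(math.sqrt(chi2 / 2))

df = degrees_of_freedom(2, 2)
chi2 = 50 / 3  # hypothetical chi-square value from a 2x2 table
critical_value = 3.841  # df = 1, alpha = .05

print(df, chi2 > critical_value, p_value_df1(chi2) < 0.001)
```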
What is an effect size?
Strength of association
What are the two effect size measures for chi-square?
Phi (φ) and Cramer’s V
- both give values between 0 and 1
- both take sample size into account
When should you report phi?
2x2 contingency tables
When should you report Cramer’s V?
For anything other than 2x2 contingency tables
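For reference, a minimal sketch of the usual formulas for both measures: phi = sqrt(χ² / N) and V = sqrt(χ² / (N x min(R-1, C-1))). The function names and example values are hypothetical. For a 2x2 table the two formulas give the same number.

```python
import math

def phi(chi2, n):
    """Phi for a 2x2 table: sqrt(chi-square / N)."""
    return math.sqrt(chi2 / n)

def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V: sqrt(chi-square / (N * min(R - 1, C - 1)))."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))

# Hypothetical values: chi-square of 50/3 from a 2x2 table with N = 100.
print(round(phi(50 / 3, 100), 2))
# Hypothetical values: chi-square of 20 from a 3x3 table with N = 100.
print(round(cramers_v(20, 100, 3, 3), 2))
```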
How should you interpret effect size? (Phi and Cramer’s V)
- Check the value against standardised cutoffs to judge the magnitude of the effect
- The value should be equal to or greater than the cutoff given in the table to fall into that category
- Values answer the question: what is the association between the 2 variables as a percentage of their maximum possible variation?
How should you report chi-square?
e.g. χ² (1, N = 180) = 30.13, p < .001, φ = .4
χ² = chi-square statistic
1 = df (degrees of freedom)
N = sample size
p = significance value
φ = Phi OR V = Cramer’s V (effect size)
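A small formatting sketch for this reporting style. The helper names and all input values are hypothetical, and the rounding choices (two decimals for the statistic and effect size, three for p, no leading zero on values that cannot exceed 1) are an assumption about the expected style.

```python
def drop_leading_zero(value, decimals):
    """Format a value that cannot exceed 1 without its leading zero."""
    return f"{value:.{decimals}f}".lstrip("0")

def format_chi_square(chi2, df, n, p, effect_size, effect_symbol="φ"):
    """Build a report string: statistic, df, N, p-value and effect size."""
    p_text = "p < .001" if p < 0.001 else f"p = {drop_leading_zero(p, 3)}"
    effect_text = f"{effect_symbol} = {drop_leading_zero(effect_size, 2)}"
    return f"χ² ({df}, N = {n}) = {chi2:.2f}, {p_text}, {effect_text}"

# Hypothetical 2x2 result, reported with phi.
report_2x2 = format_chi_square(16.67, 1, 100, 0.00004, 0.41)
# Hypothetical 2x3 result, reported with Cramer's V.
report_2x3 = format_chi_square(7.5, 2, 120, 0.024, 0.25, effect_symbol="V")
print(report_2x2)
print(report_2x3)
```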
What assumptions should be met before running chi-square?
- Both variables are categorical
- Categories are mutually exclusive
- No cell in the contingency table should have an expected frequency lower than 1
- At least 80% of cells should have an expected frequency of 5 or more
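The two expected-frequency assumptions can be checked with a short helper. This is a sketch: the function name and both example tables are hypothetical, and it leaves the decision about the 80% threshold to the reader by returning the share of cells with an expected frequency of 5 or more.

```python
def check_expected_counts(expected):
    """Return (no cell below 1?, share of cells with expected count >= 5)."""
    cells = [e for row in expected for e in row]
    none_below_1 = min(cells) >= 1
    share_at_least_5 = sum(e >= 5 for e in cells) / len(cells)
    return none_below_1, share_at_least_5

# Hypothetical expected frequencies for a well-populated 2x2 table.
expected_ok = [[20.0, 20.0], [30.0, 30.0]]
# Hypothetical expected frequencies for a small-sample 2x2 table.
expected_sparse = [[2.4, 5.6], [3.6, 8.4]]

print(check_expected_counts(expected_ok))
print(check_expected_counts(expected_sparse))
```

In the sparse example only half of the cells reach an expected frequency of 5, so the 80% assumption is not met.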
What does it mean for categories to be mutually exclusive?
Participants can’t be in more than one category of each variable
What if the expected count is less than 5? (4th assumption is not met)
- If the table is 2x2, read significance from Fisher’s exact test, which calculates the exact probability and so corrects for the fact that one or more cells has an expected count of less than 5.
- If the assumption is not met for larger tables, consider pooling categories, using the exact p-value for Pearson’s chi-square, or using other analyses such as the likelihood ratio
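To show what "exact probability" means for a 2x2 table, here is a minimal pure-Python sketch of a two-sided Fisher's exact test. The function name and example tables are hypothetical. It assumes the common convention of summing the probabilities of every table (with the same row and column totals) that is no more likely than the observed one; statistics packages may offer other conventions.

```python
from math import comb

def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact p-value for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # holding the row and column totals fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over all possible tables that are no more likely than the observed
    # one (small tolerance guards against floating-point ties).
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )

# Hypothetical small-sample tables.
print(round(fisher_exact_two_sided([[3, 1], [1, 3]]), 4))
print(round(fisher_exact_two_sided([[1, 7], [5, 7]]), 4))
```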