11 Chi Squared Flashcards
When performing a chi-square goodness-of-fit test, the degrees of freedom is...
The number of categories - 1
When performing a chi-square test for independence or homogeneity, the degrees of freedom is...
(number of rows - 1)*(number of columns - 1)
The formula for calculating the chi-square test statistic is
(observed value - expected value)^2/expected value.
Compute this for each category (or each cell of a two-way table) and add the results.
Each individual term in the sum is called a component.
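As a rough illustration of the formula above, here is a minimal Python sketch. The function name and the observed counts are made up for this example; only the (observed - expected)^2/expected rule comes from the card.

```python
def chi_square_components(observed, expected):
    """Return one (O - E)^2 / E component per category."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]

# Hypothetical counts for 60 rolls of a die (invented for illustration).
observed = [8, 12, 9, 11, 6, 14]
expected = [10] * 6  # 60 rolls / 6 faces

components = chi_square_components(observed, expected)
chi_square = sum(components)  # the test statistic is the sum of the components

print(components)
print(chi_square)
```

With these invented counts the components are roughly 0.4, 0.4, 0.1, 0.1, 1.6 and 1.6, which add to a chi-square statistic of about 4.2.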
Goodness of fit expected counts
n * (the claimed proportion p for that category)
If the claim is an even distribution, use n/(number of categories) for each one. Ex: if you do 60 dice rolls, you’d expect 60/6 = 10 for each number.
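The expected-count rule for goodness of fit can be sketched in a few lines of Python. The function name and the second set of proportions are hypothetical; the dice example mirrors the one on the card.

```python
def gof_expected_counts(n, claimed_proportions):
    """Expected count for each category: n times its claimed proportion."""
    return [n * p for p in claimed_proportions]

# Evenly distributed claim: 60 rolls of a fair die.
fair_die = gof_expected_counts(60, [1 / 6] * 6)

# Uneven claim (made-up proportions): 200 individuals, claimed 50% / 30% / 20%.
uneven = gof_expected_counts(200, [0.5, 0.3, 0.2])

print(fair_die)
print(uneven)
```

The fair-die case gives about 10 per face, and the uneven case gives about 100, 60 and 40. Floating-point arithmetic may show tiny rounding differences.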
Expected values for test for independence and homogeneity
(Row total)*(column total)/(table total).
Do this for each cell of the table. Shortcut: enter the observed table in matrix A on your calculator and run a chi-square test; it will put the expected counts in matrix B for you.
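For anyone checking the calculator's output by hand, the (row total)*(column total)/(table total) rule can be sketched in plain Python. The function name and the 2x2 table are assumptions made up for this example.

```python
def expected_counts(table):
    """Expected count for each cell: row total * column total / table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]  # zip(*table) walks the columns
    table_total = sum(row_totals)
    return [[r * c / table_total for c in col_totals] for r in row_totals]

# Hypothetical observed two-way table (2 rows, 2 columns).
observed = [
    [30, 10],
    [20, 40],
]

expected = expected_counts(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)  # (rows - 1)*(columns - 1)

print(expected)  # [[20.0, 20.0], [30.0, 30.0]]
print(df)        # 1
```

Here the row totals are 40 and 60, the column totals are 50 and 50, and the table total is 100, so the top-left expected count is 40*50/100 = 20.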
How do you tell if it’s homogeneity or association/independence?
If separate samples were taken from multiple populations (or groups), use homogeneity. If there was ultimately only one sample, with each individual then classified on two variables, it’s independence/association.
How do you tell if it’s chi-square GOF instead of one of the others?
You only have a one-way table and a claimed distribution to compare it to. For the other chi-square tests, you need a two-way table to put into a matrix.
How do you tell if it’s some kind of chi-square test (as opposed to a z or t test)?
You are given a two-way table or counts/percents in multiple categories. Also, the wording will likely use the word “distributed” instead of true proportion/mean.
What conditions need to be checked for chi-square tests?
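Random: the data come from a random sample or a randomized experiment. Independent: check the 10% condition when sampling without replacement. Large Counts: show that all the expected counts are at least 5.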
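The two numerical conditions can be checked mechanically. The sketch below is only an illustration: both function names and all of the numbers are hypothetical, and the Random condition still has to be judged from how the data were collected.

```python
def large_counts_ok(expected_counts, minimum=5):
    """True when every expected count is at least `minimum`."""
    return all(count >= minimum for count in expected_counts)

def ten_percent_ok(sample_size, population_size):
    """True when the sample is at most 10% of the population."""
    return sample_size <= 0.10 * population_size

# One-way example: 60 rolls of a fair die give six expected counts of 10.
die_ok = large_counts_ok([10] * 6)

# Two-way example: flatten the table of expected counts before checking.
expected_table = [[20.0, 20.0], [30.0, 30.0]]  # hypothetical expected counts
table_ok = large_counts_ok(c for row in expected_table for c in row)

# A made-up case that fails: one expected count is below 5.
small_ok = large_counts_ok([12.5, 4.2, 8.3])

# Hypothetical survey: 100 people sampled from a population of 5,000.
sample_ok = ten_percent_ok(100, 5000)

print(die_ok, table_ok, small_ok, sample_ok)  # True True False True
```

Note that the check uses the expected counts, not the observed counts.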
How do you word the hypotheses for a chi-square GOF test?
H0: The distribution of (context including population) is the same as claimed.
Ha: The distribution of (context including population) is NOT the same as claimed.
Note: sometimes “the same as claimed” can be written as: “evenly distributed”
How do you word the hypotheses for a chi-square test for homogeneity?
H0: The distribution of (context) is the same for each of (these populations)
Ha: The distribution of (context) is NOT the same for each of (these populations)
How do you word the hypotheses for a chi-square test for association/independence?
H0: (the two categories/variables) are independent among the (population)
Ha: (the two categories/variables) are not independent among the (population)
OR
H0: There is no association between (the two categories/variables) in the (population)
Ha: There is an association between (the two categories/variables) in the (population)
How does increasing df change a chi-square curve?
The median/center of the graph moves to the right as df increases (the mean of a chi-square distribution equals its df), and the curve spreads out. This means that a given chi-square statistic becomes less unusual at higher df. Ex: chi-square = 6 with df = 2 is rarer than chi-square = 6 with df = 4.
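To put numbers on the example above, here is a small Python sketch of the right-tail area. It relies on a closed-form shortcut that is valid only for even df (an assumption of this sketch, not a general method), and the function name is made up.

```python
import math

def chi_square_tail_even_df(x, df):
    """P(chi-square > x) for positive EVEN df only, via the closed-form sum
    exp(-x/2) * sum of (x/2)^i / i! for i = 0 .. df/2 - 1."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this shortcut only works for positive even df")
    half = x / 2
    return math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(df // 2)
    )

p_df2 = chi_square_tail_even_df(6, 2)  # about 0.0498
p_df4 = chi_square_tail_even_df(6, 4)  # about 0.1991

print(round(p_df2, 4), round(p_df4, 4))
```

The same statistic of 6 has a tail area of about 0.05 with df = 2 but about 0.20 with df = 4, so it is much less surprising at the higher df.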
What do chi-square components tell you about the data?
The largest components belong to the categories (or cells) where the observed count differed most from the expected count, relative to the expected count. You have to compare observed with expected to tell whether there were more or fewer than expected, since chi-square components are never negative.
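One way to read components is to rank them and note the direction of each difference. This is a minimal sketch with invented die-roll counts; the function name and labels are hypothetical.

```python
def describe_components(labels, observed, expected):
    """Return (label, component, direction) tuples, largest component first."""
    rows = []
    for label, o, e in zip(labels, observed, expected):
        component = (o - e) ** 2 / e  # never negative, so direction is lost
        if o > e:
            direction = "more"
        elif o < e:
            direction = "fewer"
        else:
            direction = "same"
        rows.append((label, component, direction))
    return sorted(rows, key=lambda row: row[1], reverse=True)

# Hypothetical counts for 60 rolls of a die (invented for illustration).
faces = [1, 2, 3, 4, 5, 6]
observed = [7, 12, 9, 11, 5, 16]
expected = [10] * 6

ranked = describe_components(faces, observed, expected)
for label, component, direction in ranked:
    print(f"face {label}: component {component:.1f} ({direction} than expected)")
```

With these made-up counts, face 6 has the largest component (about 3.6, more than expected), followed by face 5 (about 2.5, fewer than expected).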
When given a side-by-side or segmented bar graph, how do you “describe what you see” (OR what does this graph tell you about the relationship between…)?
Assuming the bars are somewhat different: “there seems to be an association between these variables (use context). Specifically…(highlight whichever bar(s) are the most different)”