Topic 12 Flashcards
Why might we use a chi-square test instead of a t-test?
t-tests compare means of quantitative data, whereas chi-square tests work on qualitative (categorical) data summarised as counts. A chi-square test can also handle a variable with more than two categories.
What are the different functions of the Chi square tests?
(3 different ways)
Goodness of fit
Homogeneity
Independence
How can χ2 tests be used for goodness of fit?
A goodness-of-fit test checks whether the observed frequencies match the expected frequencies.
It tests a hypothesis about the distribution (model) of a qualitative variable in a population.
E.g. do eye colours follow this distribution: brown 45%, blue 27%, hazel 18%, green 10%, etc.? These percentages give the expected values, which are then compared with the observed values.
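A possible R sketch of the eye-colour check above. The observed counts are made up for illustration, and it assumes the four listed colours are the only categories (so the proportions sum to 1):

```r
# Hypothetical observed counts for 200 people (invented for this example)
observed <- c(brown = 95, blue = 50, hazel = 35, green = 20)

# Model proportions from the card; assumed to be the complete set of categories
model <- c(brown = 0.45, blue = 0.27, hazel = 0.18, green = 0.10)

# Goodness-of-fit test: p = gives the hypothesised proportions
result <- chisq.test(x = observed, p = model)

result$expected   # expected counts under the model
result$statistic  # chi-square test statistic
result$parameter  # degrees of freedom (number of categories - 1)
result$p.value    # upper-tail area
```

A large p-value here would mean the data give no evidence against the model.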
How can χ2 tests be used to test for homogeneity?
Tests a hypothesis about the distribution of a qualitative variable in several populations
How can χ2 tests be used to test for independence?
Tests a hypothesis about the relationship between two qualitative variables in a population, i.e. whether they are independent or whether there is an association between them.
E.g. is there an association between parent eye colour (qualitative) and child eye colour (qualitative)?
How is the test statistic for all of the tests above calculated?
χ2 (test statistic) = sum over all categories of ( [observed frequency - expected frequency]^2 / expected frequency )
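The formula above can be worked through by hand in R. This is only a sketch: the observed counts and the model proportions are hypothetical values chosen for illustration.

```r
# Hypothetical observed counts (invented for this example)
observed <- c(brown = 95, blue = 50, hazel = 35, green = 20)

# Assumed model proportions; they must sum to 1
model <- c(brown = 0.45, blue = 0.27, hazel = 0.18, green = 0.10)

# Expected frequency = total sample size * model proportion
expected <- sum(observed) * model

# Test statistic: sum of (observed - expected)^2 / expected
chi_sq <- sum((observed - expected)^2 / expected)
chi_sq
```

Each category contributes one term to the sum, so categories where observed and expected are far apart push the statistic up.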
What are the hypotheses for the chi-square test for goodness of fit?
H0 - the model fits the data (the observed frequencies are consistent with the expected frequencies).
H1 - the model doesn't fit the data.
What are the assumptions involved in chi-square tests for goodness of fit?
No category has an expected value of 0, and no more than 20% of the expected values are less than 5 (Cochran's rule).
What is Cochran's rule?
No more than 20% of the expected values are less than 5. In other words, we want at least 80% of the categories to have an expected value of 5 or more.
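One way to check these conditions in R is sketched below. `check_expected_counts` is a hypothetical helper written for this card, not a built-in function, and the expected values passed to it are made up.

```r
# Hypothetical helper: TRUE when no expected value is 0 and
# no more than 20% of the expected values are below 5
check_expected_counts <- function(expected) {
  none_zero <- all(expected > 0)
  share_small <- mean(expected < 5)  # proportion of expected values below 5
  none_zero && share_small <= 0.20
}

check_expected_counts(c(90, 54, 36, 20))   # TRUE: every expected value is large
check_expected_counts(c(40, 30, 4, 3, 3))  # FALSE: 3 of 5 (60%) are below 5
```

With `chisq.test`, the expected values to check are available as `result$expected`.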
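How do we calculate the number of degrees of freedom for a chi-square test?
Goodness of fit: n - 1 (n = number of categories).
Independence or homogeneity: (r - 1)(c - 1) for a contingency table with r rows and c columns.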
How do we find the p-value from a chi-square test?
We find the upper-tail area under the χ2 curve with n - 1 degrees of freedom, where n = number of categories.
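In R, the upper-tail area can be obtained with `pchisq`. A minimal sketch, using a hypothetical test statistic and number of categories:

```r
chi_sq <- 0.60       # hypothetical test statistic
n_categories <- 4    # hypothetical number of categories

# lower.tail = FALSE returns the area to the right of chi_sq
p_value <- pchisq(chi_sq, df = n_categories - 1, lower.tail = FALSE)
p_value
```

Only the upper tail is used because large values of the statistic are the ones that indicate a poor fit.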
How would we use a chi-square test to test for independence between two variables?
We summarise the counts for the two qualitative variables in a contingency table, which can be visualised with a mosaic plot, and then run the chi-square test on that table.
What is the R code for the chi-square test?
chisq.test(dataset)
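A fuller sketch of how this might look for a test of independence, assuming `dataset` is a contingency table of counts. The table below is hypothetical and its counts are invented.

```r
# Hypothetical 2 x 2 contingency table of parent vs child eye colour
eye <- matrix(c(30, 10,
                12, 28),
              nrow = 2, byrow = TRUE,
              dimnames = list(parent = c("brown", "blue"),
                              child  = c("brown", "blue")))

# Mosaic plot of the table (tile area reflects the counts)
mosaicplot(eye, main = "Parent vs child eye colour (hypothetical)")

# correct = FALSE turns off the continuity correction that R applies
# to 2 x 2 tables by default, so the statistic matches the plain formula
result <- chisq.test(eye, correct = FALSE)

result$expected   # expected counts under independence
result$parameter  # degrees of freedom: (rows - 1) * (columns - 1)
result$p.value
```

For raw data with one row per individual, `table(x, y)` can be used to build the contingency table before calling `chisq.test`.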
What are the hypotheses involved with the chi-square test for independence?
H0 - the variables are independent
H1 - the variables aren't independent
What are the assumptions involved with the chi-square test for independence?
Same as for the chi-square test for goodness of fit: no expected value is 0, and no more than 20% of the expected values are less than 5 (Cochran's rule).