Chi-Square: Goodness of Fit Flashcards
Define nominal, ordinal, interval, and ratio scales of measurement with examples.
- Nominal (or categorical): these are categories with no rank. e.g. hair colour, religion.
- Ordinal: these are categories that are ranked or ordered, e.g. grades, Likert scale. Only comparisons (>, <) apply to these.
- Interval: quantitative measures without a true zero. They are usually scales constructed to quantify a property, e.g. temperature in °C, IQ, pH. Can perform +, -.
- Ratio: these are quantitative measures with a true zero, e.g. age, length, time, weight. Can perform +, -, ×, /.
What are the two different ways of classifying variables that determines the type of statistical test we use?
- Discrete (nominal and ordinal): these can only have certain values within a range.
- Continuous (interval and ratio): these can have any value in a range.
Refer back to the first PowerPoint for the inferential test decision.
What is the difference between discrete and continuous variables?
- Discrete variables have distinct and separate values, often represented by integers, and there are gaps between possible values.
- Continuous variables form a continuum with no gaps between values, can take any value within a range, and are often represented by real numbers.
What are classifications and when are they most useful to a researcher?
- Classifications are a form of measurement.
- They are of interest to a researcher when they are exhaustive and mutually exclusive.
- Statistical analyses require that data be classified into one of four scales of measurement (nominal, ordinal, interval, and ratio).
What type of data is the chi-square test primarily used for?
Nominal data, although it can also be applied to ordinal data.
What does it mean for events to be exhaustive and mutually exclusive?
- Exhaustive means that the categories cover every member of the population, so each observation falls into some category.
- Mutually exclusive means that no member of the population can belong to more than one category.
What is a contingency table?
A table used to present data classified with respect to two or more categorical variables.
- They also help us understand the two types of chi-square test.
What are the two kinds of chi-square test?
- Goodness of fit: unidimensional (one variable). Used to test a single dimension of data.
- Test of contingency: multidimensional (two variables). Useful to test whether two variables are associated (are they contingent on each other?).
What is the rule about values in the cells of a contingency table?
The values should be absolute frequencies and not proportions or percentages
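A minimal sketch of why counts matter, using made-up numbers: the same 55% / 45% split gives a very different chi-square statistic depending on how many subjects were observed, and that information is lost once the cells hold percentages. The function name `chi_square` and both samples are hypothetical illustrations.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all cells (absolute frequencies)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: the same 55% / 45% split in two sample sizes,
# tested against an expected 50/50 split.
small = chi_square([55, 45], [50, 50])        # N = 100
large = chi_square([550, 450], [500, 500])    # N = 1000

print(small, large)  # 1.0 10.0
```

With df = 1 and α = 0.05 the tabled critical value is 3.841, so only the larger sample would lead to rejecting H0, even though the percentages are identical.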
What is the main purpose of the chi-square test?
The chi-square test analyses frequency data to determine if there is a significant difference between observed and expected frequencies.
What are the three assumptions underlying the chi-square test? (MEI)
- Mutually exclusive classification.
- Exhaustive categories
- Independence of observations (each count should be independent of another)
What three questions does the chi-square test ask?
- Do the observed values differ significantly from the expected values?
- Are the data (O) a good fit to the model (E)?
- Does the data fit the expected pattern?
How do you find the expected values in a contingency table?
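- In a goodness-of-fit test the expected values often come from an equal split (e.g. 50/50 for two categories). In a contingency table each cell has its own expected value, given by a formula:
Expected frequency = (row total × column total) / grand total of all subjects, using the totals of the row and column that contain the cell.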
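The contingency-table formula can be sketched in a few lines of Python. The function name `expected_frequencies` and the 2x2 table are hypothetical, chosen only to illustrate the arithmetic.

```python
def expected_frequencies(observed):
    """Expected count for every cell of a contingency table.

    observed: list of rows, each a list of absolute frequencies.
    Each cell's expected value is row total * column total / grand total.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Hypothetical 2x2 table of counts (rows = groups, columns = outcomes)
observed = [[30, 20],
            [10, 40]]

expected = expected_frequencies(observed)
print(expected)  # [[20.0, 30.0], [20.0, 30.0]]
```

Here both row totals are 50, the column totals are 40 and 60, and the grand total is 100, so the top-left cell is 50 × 40 / 100 = 20.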
What are the 6 steps for decision making in inferential statistics?
- Set up the research hypothesis (H1)
- Set up the null hypothesis (H0)
- Choose a significance level (α)
- Calculate the sample statistic
- Calculate the probability from the hypothetical sampling distribution (p-value)
- Decide whether the result is consistent with the hypothetical sampling distribution. If it is unlikely (p < α), reject the null hypothesis; it is then reasonable to accept the research hypothesis.
*Note: if you’re using one of those tables where you find the critical χ² value using df and α, you can reject the null hypothesis if the chi-square statistic is greater than the critical value.
What are the initial steps involved in the goodness of fit chi-square test?
- Data: observed distribution on a single dimension
- Assumptions: Mutually exclusive and exhaustive classification, independent observations
- The theoretically predicted population distribution: H0 is that the observed distribution = the expected distribution.
- Compare O and E: calculate chi-square, df = k - 1 where k is the number of groups, calculate p.
- Decide: if p < 0.05, reject H0; the observed distribution is therefore different from the population distribution.
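The steps above can be sketched in Python. This is an illustration under stated assumptions, not a definitive implementation: it decides with a tabled critical value at α = 0.05 instead of computing p, so that it needs only the standard library; the function name `goodness_of_fit`, the lookup table `CRITICAL_05`, and the die-roll counts are all hypothetical.

```python
# Critical chi-square values at alpha = 0.05 from a standard table (df: value).
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

def goodness_of_fit(observed, expected_props=None):
    """Return (chi_square, df, reject_H0) for one categorical variable.

    observed: absolute frequencies, one per category.
    expected_props: proportions predicted under H0 (defaults to equal).
    """
    n = sum(observed)
    k = len(observed)
    if expected_props is None:
        expected_props = [1 / k] * k          # H0: all categories equally likely
    expected = [n * p for p in expected_props]
    chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = k - 1
    reject = chi_sq > CRITICAL_05[df]         # critical-value decision rule
    return chi_sq, df, reject

# Hypothetical data: 120 rolls of a die, counts for faces 1 to 6.
rolls = [25, 17, 15, 23, 24, 16]
chi_sq, df, reject = goodness_of_fit(rolls)
print(round(chi_sq, 3), df, reject)  # 5.0 5 False
```

Each face is expected 20 times, the statistic is about 5.0 with df = 5, and because that is below 11.070 H0 is retained for these made-up counts.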
What does it mean if the chi-squared statistic is greater than the critical value?
The null hypothesis is rejected. The difference between the observed and expected frequencies is unlikely to have occurred by chance alone.
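What is the formula for calculating the chi-square statistic?
χ² = Σ (O − E)² / E, summed over all categories (cells), where O is the observed frequency and E is the expected frequency.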
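As a worked illustration with hypothetical numbers (not from the course material): a coin is tossed 100 times and lands heads 60 times and tails 40 times, with 50 of each expected under H0.

```latex
\chi^2 = \sum \frac{(O - E)^2}{E}
       = \frac{(60 - 50)^2}{50} + \frac{(40 - 50)^2}{50}
       = 2 + 2 = 4
```

With df = k - 1 = 1 and a significance level of 0.05, the tabled critical value is 3.841; since 4 > 3.841, H0 would be rejected for these made-up counts.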
What is the null hypothesis for a goodness of fit test?
The observed distribution does not differ from the expected distribution.
What are the two ways of determining significance?
- Calculate the p-value
- Find the critical value.
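A small sketch showing that the two routes lead to the same decision. It assumes the simplest case, df = 1, where the upper-tail probability of the chi-square distribution has the closed form erfc(sqrt(χ²/2)); other df values need a statistics library or a table. The names `p_value_df1` and `decide` are hypothetical.

```python
import math

ALPHA = 0.05
CRITICAL_DF1 = 3.841  # tabled critical value for df = 1, alpha = 0.05

def p_value_df1(chi_sq):
    """Upper-tail probability of a chi-square statistic with df = 1 only."""
    return math.erfc(math.sqrt(chi_sq / 2))

def decide(chi_sq):
    """Return (reject by p-value, reject by critical value) for df = 1."""
    by_p = p_value_df1(chi_sq) < ALPHA
    by_critical = chi_sq > CRITICAL_DF1
    return by_p, by_critical

# Hypothetical statistics: one above and one below the critical value.
print(decide(4.0))  # (True, True)
print(decide(1.0))  # (False, False)
```

For a statistic of 4.0 the p-value is roughly 0.046, which is below 0.05, and 4.0 also exceeds 3.841, so both methods reject H0.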