Chi-Square Statistics Flashcards
Data that are not continuous. For example: dead or alive, head injury or no head injury, cancer or no cancer, etc.
Categorical Variables
There is no mean, median, or normal distribution (only a mode) for
Categorical data
Take on values that are names or labels
Categorical Variables
Different categorical variables can be associated with
-Ex: Is there a difference between college students and medical students in the number of hours of sleep per week?
Each other
Categories may be inherent in the data or created by the researchers from
Continuous data
We may want to change the data into categories if the categories are more clinically meaningful, or if the data are
Non-normally distributed
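As a rough illustration of turning continuous data into categories, the sketch below bins made-up blood pressure readings with a hypothetical cutoff of 140. The readings, the cutoff, and the labels are assumptions chosen for the example, not values from these cards.

```python
from collections import Counter


def categorize(value, cutoff):
    """Label a continuous measurement as 'high' or 'normal' using one cutoff."""
    return "high" if value >= cutoff else "normal"


# Made-up systolic blood pressure readings (assumption, for illustration only)
systolic = [118, 142, 135, 160, 121]
CUTOFF = 140  # hypothetical cutoff, not taken from the flashcards

categories = [categorize(v, CUTOFF) for v in systolic]
counts = Counter(categories)  # frequency within each category
print(counts)
```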
What is the conventional data presentation for the associations between categorical variables?
Contingency tables
In a contingency table, all data is independent, meaning that each person fits into only
1 box
What should we always do for contingency tables?
Put totals outside of each row/column
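A minimal sketch of adding row and column totals to the margins of a contingency table. The 2x2 counts are invented for illustration, and `with_totals` is a hypothetical helper name.

```python
def with_totals(observed):
    """Return a copy of the table with a totals column and a totals row appended."""
    rows = [row + [sum(row)] for row in observed]    # row totals on the right
    column_totals = [sum(col) for col in zip(*rows)]  # column totals, including the grand total
    return rows + [column_totals]


# Invented counts: rows = exposed / not exposed, columns = disease / no disease
observed = [[20, 30],
            [10, 40]]
table = with_totals(observed)
for row in table:
    print(row)
```

The bottom-right entry of the returned table is the grand total.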
The appropriate statistic to use for categorical data
Chi-Square test
In the Chi-square test, we first want to establish categories and then determine the
Frequency within each category
Once we know the frequency within each category, we want to
Formulate a model
The last thing we want to do in our Chi-square test is compare the observed to the expected to see if the categories are
Independent
The Chi squared test is written as
χ^2
Measures the observed frequencies and compares them to the expected frequencies
Chi-squared test
How do we calculate the expected values for each box on the contingency table?
Expected value = (row total x column total) / grand total
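The expected-value formula can be sketched in a few lines. The counts below are invented for illustration, and `expected_counts` is a hypothetical helper name.

```python
def expected_counts(observed):
    """Expected count for each box = (row total x column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in column_totals] for r in row_totals]


# Invented counts: rows = exposed / not exposed, columns = disease / no disease
observed = [[20, 30],
            [10, 40]]
expected = expected_counts(observed)
print(expected)  # [[15.0, 35.0], [15.0, 35.0]]
```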
For the chi-squared test, if our calculated value of χ^2 is GREATER than the critical value, we
Reject the null hypothesis
For the chi-squared test, if our calculated value of χ^2 is LESS than the critical value, we
Cannot reject the null hypothesis
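A sketch of the decision rule, assuming the usual statistic χ^2 = sum of (observed - expected)^2 / expected over all boxes. The counts are invented. The critical value 3.841 applies to a 2x2 table (1 degree of freedom) at the 0.05 significance level; other table sizes or significance levels need a different critical value.

```python
def chi_square_statistic(observed):
    """Sum (observed - expected)^2 / expected over every box of the table."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            exp = row_totals[i] * column_totals[j] / grand_total
            statistic += (obs - exp) ** 2 / exp
    return statistic


# Critical value for df = (2 - 1) * (2 - 1) = 1 at alpha = 0.05
CRITICAL_VALUE = 3.841

observed = [[20, 30],
            [10, 40]]  # invented counts
statistic = chi_square_statistic(observed)
reject_null = statistic > CRITICAL_VALUE
print(round(statistic, 3), "reject" if reject_null else "cannot reject")
```

With these invented counts the statistic is about 4.76, which exceeds 3.841, so the null hypothesis of independence is rejected.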
The Chi-squared test is not valid for a 2x2 contingency table with very small samples. In this case, we use a
Fisher Exact Test
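The card only names the test, so the following is a sketch under stated assumptions: with the row and column totals held fixed, each possible 2x2 table has a hypergeometric probability, and one common two-sided p-value sums the probabilities of all tables that are no more likely than the observed one. The counts are invented, and `fisher_exact_two_sided` is a hypothetical helper name.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def table_probability(x):
        # Probability of x in the top-left box when all margins are fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = table_probability(a)
    lowest = max(0, col1 - row2)   # smallest possible top-left count
    highest = min(row1, col1)      # largest possible top-left count
    # Small tolerance so equally likely tables are not lost to rounding
    return sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


p_value = fisher_exact_two_sided(3, 1, 1, 3)  # invented small-sample table
print(round(p_value, 4))
```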
In Chi-squared tests, we make the assumptions that the data are frequency data, there is an adequate sample size, and the measures are
Independent of each other
The study of disease occurrence in human populations
Epidemiology
Epidemiology also uses
Contingency tables
Follow two groups of people, one exposed to a factor and one not exposed, to see who develops the disease
Cohort studies
Look at people with and without the disease and determine whether or not they were exposed
-like in the bone mineral example
Case-control studies
Measures the odds of getting a disease, given an exposure, and compares that to the odds of getting the disease without it
Case-control study
For a case-control study, we use the
Odds ratio (OR)
When using the odds ratio, OR = 1 means
No difference in odds of exposure
When using the odds ratio, OR > 1 means
The odds of getting the disease are increased w/ exposure
When using the odds ratio, OR < 1 means
The odds of getting the disease decrease w/ exposure
If the 95% confidence interval for the odds ratio contains 1, then there is
No statistically significant effect of the exposure
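A sketch of the odds ratio and an approximate 95% confidence interval. The interval uses the common log-based approximation, exp(ln(OR) ± 1.96 × sqrt(1/a + 1/b + 1/c + 1/d)), which is an assumption of this example since the cards do not say how the interval is computed. The counts are invented, and `odds_ratio_with_ci` is a hypothetical helper name.

```python
from math import exp, log, sqrt


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate CI for a 2x2 table.

    a = exposed with disease,    b = exposed without disease
    c = unexposed with disease,  d = unexposed without disease
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of ln(OR)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


odds_ratio, lower, upper = odds_ratio_with_ci(20, 30, 10, 40)  # invented counts
ci_contains_one = lower <= 1 <= upper
print(round(odds_ratio, 2), round(lower, 2), round(upper, 2), ci_contains_one)
```

With these invented counts the odds ratio is about 2.67 and the interval lies entirely above 1, so the exposure is associated with increased odds of disease.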
Used when the data are normally distributed, there are more than 2 groups, and each person can fall into only one of the groups (e.g., married, divorced, single)
One way ANOVA
Used when the data are normally distributed, and you want to take repeated measurements on a single group (e.g., taking blood pressure measurements at various times of the day)
Repeated Measures ANOVA