Lecture 6 - Analysis of Qualitative Data Flashcards
In this lecture, we are moving from continuous data to ____ data where we CANNOT use t-tests to evaluate the data.
count
What is Analysis of Qualitative Data ?
Methods for evaluating qualitative frequency data (nominal or ordinal data), either for “goodness of fit” (with a hypothesized underlying distribution) or for “association” between multiple measures.
Describe the Mendelian theory
25% = Homozygous Dominant
50% = Heterozygous (expresses the dominant gene)
25% = Homozygous Recessive
*So 75% of people will express the dominant gene (unaffected) and 25% of people will express the recessive gene (affected)
So the expected values would be ?
0.75 x n = Unaffected
0.25 x n = Affected
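A minimal sketch of the 3:1 calculation above. The function name and the sample size of 200 are made up for illustration.

```python
def mendelian_expected_counts(n):
    """Expected counts under the 75% unaffected / 25% affected split."""
    return {"unaffected": 0.75 * n, "affected": 0.25 * n}

# n = 200 is a hypothetical sample size, not a value from the lecture.
print(mendelian_expected_counts(200))  # {'unaffected': 150.0, 'affected': 50.0}
```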
How do you determine the goodness of fit?
You use Chi Square
Chi-square statistic has ____ values only
positive
Therefore only one parameter is needed to specify any chi-squared distribution. What parameter ?
degrees of freedom
For goodness of fit/chi square, what is the formula for degrees of freedom ?
df = number of categories - 1
What is required if df = 1 ?
Yates' correction (this is the -0.5 thing: subtract 0.5 from each |observed - expected| before squaring)
What is not required if df > 1 ?
Yates' correction is not required! (So you DO NOT subtract 0.5)
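The goodness-of-fit cards above can be sketched in a few lines. This assumes the usual statistic, the sum of (|O - E| - correction)^2 / E over categories, with the 0.5 correction applied only when df = 1. The function name and the data are hypothetical.

```python
def chi_square_gof(observed, expected):
    """Return (chi-square statistic, df) for a goodness-of-fit test.

    df = number of categories - 1. Yates' correction (subtract 0.5 from
    |O - E|) is applied only when df = 1.
    """
    df = len(observed) - 1
    correction = 0.5 if df == 1 else 0.0
    stat = sum(
        max(abs(o - e) - correction, 0.0) ** 2 / e
        for o, e in zip(observed, expected)
    )
    return stat, df

# Made-up data: 200 offspring tested against the 3:1 ratio (df = 1, corrected).
stat, df = chi_square_gof([160, 40], [150.0, 50.0])
print(round(stat, 3), df)  # 2.407 1

# Made-up three-category data (df = 2, no correction).
print(chi_square_gof([30, 50, 20], [25.0, 50.0, 25.0]))  # (2.0, 2)
```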
For a chi square test, we should do an ____ tail probability test.
upper
If the X^2 value < Chi square critical value, ?
Fail to reject (accept) Ho - no significant difference
If the X^2 value > Chi square critical value, ?
Reject Ho - there is a significant difference
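A rough sketch of the decision rule for the df = 1 case only. It assumes alpha = 0.05, for which the standard table critical value is 3.841. The upper-tail probability uses the identity P(X^2 > x) = erfc(sqrt(x / 2)), which holds only for df = 1. Function names are hypothetical.

```python
import math

CRITICAL_DF1_ALPHA_05 = 3.841  # table value for df = 1, alpha = 0.05

def upper_tail_p_df1(chi2):
    """Upper-tail probability of a chi-square value, valid for df = 1 only."""
    return math.erfc(math.sqrt(chi2 / 2.0))

def decide(chi2, critical=CRITICAL_DF1_ALPHA_05):
    """Compare the computed X^2 with the critical value."""
    return "reject Ho" if chi2 > critical else "do not reject Ho"

print(decide(2.407))  # do not reject Ho
print(decide(5.0))    # reject Ho
print(round(upper_tail_p_df1(3.841), 3))  # 0.05
```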
Assumptions of the Goodness of Fit test ?
1) Random sampling: Not essential for calculation of chi-square, but may obviously impact the interpretation (bias)
2) Observations are independent: (A paired variant of the test is available for non-independent data)
3) Categories must be mutually exclusive
4) Expected frequencies: Every cell must have an expected frequency of at least 1, and no more than 20% of cells may have an expected frequency <5, for the resulting theoretical distribution to be reasonably accurate
5) The distribution of chi-square: Depends on the number of treatments being compared and the number of outcomes. This dependency is quantified in a degrees of freedom parameter v equal to the number of rows in the table minus 1, times the number of columns in the table minus 1
v = (r-1)(c-1)
6) The Yates Correction: For v=1 contingency tables, computing chi square using the standard formula leads to p-values smaller than they ought to be; results are biased towards Ha. The Yates correction is used when v = 1.
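Assumptions 4 and 5 lend themselves to a quick check in code. This is a sketch only: the function names and the example tables of expected frequencies are made up.

```python
def contingency_df(n_rows, n_cols):
    """Degrees of freedom v = (r - 1)(c - 1) for an r x c table."""
    return (n_rows - 1) * (n_cols - 1)

def expected_frequencies_ok(expected):
    """Check assumption 4 on a table (list of rows) of expected frequencies.

    Every cell must be at least 1, and no more than 20% of cells may be < 5.
    """
    cells = [e for row in expected for e in row]
    if min(cells) < 1:
        return False
    n_small = sum(1 for e in cells if e < 5)
    return n_small / len(cells) <= 0.20

print(contingency_df(2, 2))  # 1, so Yates' correction applies
print(expected_frequencies_ok([[12.0, 18.0], [28.0, 42.0]]))  # True
print(expected_frequencies_ok([[3.0, 7.0], [9.0, 21.0]]))  # False: 25% of cells < 5
```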
How do you calculate expected frequency of a 2x2 contingency table ?
EF = (Row total x Column total) / Grand total
EF = Expected Frequency
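A small sketch of the expected-frequency formula applied to every cell of a table. The function name and the observed counts are hypothetical.

```python
def expected_frequencies(table):
    """EF for each cell = (row total x column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Made-up 2x2 table of observed counts.
observed = [[10, 20], [30, 40]]
print(expected_frequencies(observed))  # [[12.0, 18.0], [28.0, 42.0]]
```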