Reading Quiz 14 Flashcards
three chi-square procedures
- chi-square goodness of fit test
- chi-square test for homogeneity of populations
- chi-square test of association/independence
chi-square test for goodness of fit
tests the null hypothesis that a categorical variable has a specific distribution
aka X^2
chi-square test for homogeneity of populations
tests the null hypothesis that the distribution of a particular categorical variable is the same for all of the populations
chi-square test of association/independence
tests the null hypothesis that there is no relationship between two categorical variables
expected count
the expected count for each category is obtained by multiplying the sample size by the hypothesized proportion for that category
chi-square statistic
X^2 = ∑ ((observed count - expected count)^2) / expected count aka ∑((O-E)^2)/E
where the sum is over the k variable categories
the chi-square test compares the value of the statistic
X^2 with critical values from the chi-square distribution with k-1 degrees of freedom, where k = the number of categories
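A minimal sketch of the goodness-of-fit arithmetic from the cards above. The counts are made up for illustration (60 rolls of a die, with Ho saying every face is equally likely) and are not from the reading.

```python
# Hypothetical data: 60 rolls of a die; Ho says each face has proportion 1/6.
observed = [8, 9, 13, 7, 12, 11]          # illustrative counts only
hypothesized = [1 / 6] * 6                # proportions stated in Ho

n = sum(observed)                         # sample size
expected = [n * p for p in hypothesized]  # expected count = n * hypothesized proportion

# X^2 = sum over the k categories of (O - E)^2 / E
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                    # k - 1 degrees of freedom

print(expected, round(chi_square, 3), df)
```

Each face has an expected count of about 10, so the statistic just adds up how far each observed count sits from 10, scaled by 10.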
Ho and Ha for chi-square GOF
Ho: the population proportions equal the hypothesized values (provide them)
Ha: at least one of the population proportions differs from its hypothesized value
p-value is the
area under the density curve to the right of X^2
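As a rough sketch, the right-tail area can be computed with only the Python standard library using the closed-form series for the chi-square upper tail at integer degrees of freedom. The helper name `chi_square_p_value` is hypothetical; in practice a table, a calculator, or SciPy's `scipy.stats.chi2.sf` gives the same area.

```python
import math

def chi_square_p_value(x2, df):
    """Area to the right of x2 under the chi-square density with integer df >= 1."""
    if x2 <= 0:
        return 1.0
    half = x2 / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + 2*phi(sqrt(x)) * sum_{r=1}^{(df-1)/2} x^(r-1/2) / (1*3*...*(2r-1))
    phi = math.exp(-half) / math.sqrt(2 * math.pi)  # standard normal density at sqrt(x)
    term = math.sqrt(x2)                            # the r = 1 term
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x2 / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(half)) + 2 * phi * total

# Familiar 5% critical values should give p-values close to 0.05.
for x2, df in [(3.841, 1), (5.991, 2), (7.815, 3)]:
    print(df, round(chi_square_p_value(x2, df), 4))
```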
large values of X^2 are evidence
against Ho
the chi-square distribution is an approximation to the distribution of
the statistic X^2
can safely use the approximation (aka conditions) when the sample is
an SRS from the population and when all expected counts are at least 1 and no more than 20% of all expected counts are less than 5 (state the expected counts!)
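The expected-count part of the conditions can be checked mechanically. A small sketch follows; the function name `expected_counts_ok` is hypothetical, and the SRS requirement depends on how the data were collected, so it cannot be checked in code.

```python
def expected_counts_ok(expected):
    """True when every expected count is at least 1 and no more than 20% are below 5."""
    all_at_least_one = all(e >= 1 for e in expected)
    share_below_five = sum(e < 5 for e in expected) / len(expected)
    return all_at_least_one and share_below_five <= 0.20

print(expected_counts_ok([10, 10, 10, 10, 10, 10]))  # every count is large
print(expected_counts_ok([4, 10, 10, 10, 10]))       # 1 of 5 (20%) is below 5
print(expected_counts_ok([4, 4, 10, 10, 10]))        # 2 of 5 (40%) are below 5
print(expected_counts_ok([0.5, 10, 10]))             # one count is below 1
```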
if the chi-square test gives a statistically significant result (a small p-value), you should do a
follow-up analysis that compares the observed counts with the expected counts and that looks for the largest components of the chi-square statistic
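One way to sketch the follow-up analysis is to list each category's component (O - E)^2 / E from largest to smallest. The data below are invented for illustration (100 observations, four categories labeled A to D, Ho saying equal proportions).

```python
# Hypothetical data: 100 observations across four categories; Ho says equal proportions.
categories = ["A", "B", "C", "D"]
observed = [30, 14, 34, 22]
expected = [25.0, 25.0, 25.0, 25.0]

# Each category's component of X^2 is (O - E)^2 / E.
components = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

# Rank categories from largest to smallest component and note the direction of the deviation.
ranked = sorted(zip(categories, observed, expected, components),
                key=lambda row: row[3], reverse=True)
for name, o, e, comp in ranked:
    direction = "above" if o > e else "at or below"
    print(f"{name}: observed {o} is {direction} expected {e}, component {comp:.2f}")
```

The categories at the top of the list contribute most to X^2, so they are where the observed counts depart most from what Ho predicts.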
two-way tables
first compute percents or proportions that describe the relationship of interest
then turn to formal inference
two different methods of generating data for two-way tables lead to the
chi-square test for homogeneity of populations and the chi-square test of association/independence
chi-square test for homogeneity of populations
independent SRSs are drawn from each of several populations
each observation is classified according to a categorical variable of interest
the null hypothesis is that the distribution of the categorical variable is the same for all of the populations
one common use of the chi-square test for homogeneity of populations is to compare several
population proportions
the null hypothesis is that all of the population proportions are equal
the alternative hypothesis is that they are not all equal but allows any other relationship among the population proportions
chi-square test of association/independence
a single SRS is drawn from a single population
observations are classified according to two categorical variables
null hypothesis is that there is no relationship between the row variable and the column variable
expected count
the expected count in any cell of a two-way table when Ho is true is
expected count = (row total * column total) / n
where n = sample size
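A short sketch of the expected-count formula applied to every cell of a two-way table. The 3x2 table of counts is hypothetical.

```python
# Hypothetical 3x2 table of observed counts (rows are groups, columns are outcomes).
table = [[20, 30],
         [30, 20],
         [10, 40]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)  # sample size

# expected count = (row total * column total) / n, computed for each cell
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print(row)
```

Because every row total here is 50, each row gets the same expected counts; with unequal row totals the expected counts would scale with the row.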
chi-square statistic
X^2 = ∑ ((O-E)^2) / E
where the sum is over all r*c cells
the chi-square test compares the value of the statistic X^2
with critical values from the chi-square distribution with (r-1)(c-1) degrees of freedom
r = the number of rows
c = the number of columns
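Putting the two-way pieces together, the sketch below computes X^2 over all r*c cells along with (r-1)(c-1) degrees of freedom. The function name `two_way_chi_square` and the table of counts are hypothetical.

```python
def two_way_chi_square(table):
    """Return (X^2, df) for a two-way table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # (row total * column total) / n
            chi_square += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(col_totals) - 1)         # (r - 1)(c - 1)
    return chi_square, df

# Hypothetical 3x2 table of observed counts.
chi_square, df = two_way_chi_square([[20, 30], [30, 20], [10, 40]])
print(round(chi_square, 3), df)
```

The same computation serves both the homogeneity test and the association/independence test; only the data collection and the wording of the hypotheses differ.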
p-value is the
area under the density curve to the right of X^2
larger values of X^2 are evidence against Ho
the chi-square distribution is an approximation to the distribution of
the statistic X^2
can safely use this approximation (aka the conditions) when all expected cell counts
are at least 1 and no more than 20% of all expected cell counts are less than 5