Reading Quiz 14 Flashcards by Kate Lester

three chi-square procedures

chi-square goodness of fit test
chi-square test for homogeneity of populations
chi-square test of association/independence

chi-square test for goodness of fit

tests the null hypothesis that a categorical variable has a specific distribution
aka X^2

chi-square test for homogeneity of populations

tests the null hypothesis that the distribution of a particular categorical variable is the same for all of the populations

chi-square test of association/independence

tests the null hypothesis that there is no relationship between two categorical variables

expected count

the expected count for each category is the hypothesized proportion for that category multiplied by the sample size
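
A minimal Python sketch of this rule. The function name `expected_counts` is an illustrative assumption; the proportions and sample size are borrowed from the color example in the review questions later in this deck.

```python
def expected_counts(hypothesized_proportions, sample_size):
    """Expected count per category = hypothesized proportion * sample size."""
    total = sum(hypothesized_proportions)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("hypothesized proportions should sum to 1")
    return [p * sample_size for p in hypothesized_proportions]


# 20% black, 50% white, 30% green, with a sample of 200 objects.
counts = expected_counts([0.20, 0.50, 0.30], 200)
print(counts)
```

The expected counts do not need to be whole numbers, so the function returns floats rather than rounding.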

chi-square statistic

X^2 = ∑ ((observed count - expected count)^2) / expected count aka ∑((O-E)^2)/E
where sum is over k variable categories
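
The formula can be sketched in Python as follows. The function name and the counts are made up for illustration.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all k categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need the same number of categories")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up counts for k = 3 categories.
x2 = chi_square_statistic([18, 22, 20], [20, 20, 20])
print(x2)  # 4/20 + 4/20 + 0/20
```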

chi-square test compares the value of the statistic

X^2 with critical values from the chi-square distribution with k-1 degrees of freedom, where k = the number of categories

Ho and Ha for chi-square GOF

Ho: the population proportions equal the hypothesized values (provide them)
Ha: at least one of the population proportions differs from its hypothesized value

p-value is the

area under the density curve to the right of X^2
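
One way to compute that right-tail area without a table, using only the Python standard library. This is a sketch that assumes a whole-number degrees of freedom of at least 1; the function name is hypothetical. It starts from the closed forms for 1 and 2 degrees of freedom and steps up two degrees of freedom at a time with the standard recurrence for the upper incomplete gamma function.

```python
import math


def chi_square_p_value(x2, df):
    """Area under the chi-square density (df degrees of freedom) to the right of x2."""
    if df < 1 or int(df) != df:
        raise ValueError("df should be a positive whole number")
    if x2 <= 0:
        return 1.0
    half = x2 / 2.0
    if df % 2 == 0:
        tail, a = math.exp(-half), 1.0              # exact tail for df = 2
    else:
        tail, a = math.erfc(math.sqrt(half)), 0.5   # exact tail for df = 1
    # Each pass raises the degrees of freedom by 2 (a -> a + 1, where a = df / 2).
    while 2 * a < df:
        tail += half ** a * math.exp(-half) / math.gamma(a + 1)
        a += 1
    return tail


print(chi_square_p_value(5.991, 2))  # close to 0.05, since 5.991 is the 5% critical value for df = 2
```

A statistics library would normally be used instead; the point here is only that the p-value is a single upper-tail area.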

large values of X^2 are evidence

against Ho

the chi-square distribution is an approximation to the distribution of

the statistic X^2

can safely use the approximation (aka conditions) when the sample is

an SRS from the population and when all expected counts are at least 1 and no more than 20% of all expected counts are less than 5 (state the expected counts!)
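
The count part of these conditions can be checked mechanically. A small sketch with a hypothetical function name; it cannot check the SRS requirement, which depends on how the data were collected.

```python
def expected_counts_ok(expected):
    """Rule of thumb: every expected count >= 1, and at most 20% of them < 5."""
    if not expected:
        raise ValueError("need at least one expected count")
    all_at_least_one = all(e >= 1 for e in expected)
    share_below_five = sum(1 for e in expected if e < 5) / len(expected)
    return all_at_least_one and share_below_five <= 0.20


print(expected_counts_ok([40, 100, 60]))    # all counts are large
print(expected_counts_ok([4, 10, 10, 10]))  # 1 of 4 counts (25%) is below 5
```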

if the chi-square test finds a statistically significant p-value, you are technically supposed to do a

follow-up analysis that compares the observed counts with the expected counts and that looks for the largest components of the chi-square statistic
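
A sketch of such a follow-up, listing each category's component (O - E)^2 / E from largest to smallest. The labels, counts, and function name are made up for illustration.

```python
def largest_contributors(labels, observed, expected):
    """Return (label, observed, expected, component) rows, largest component first."""
    rows = [
        (label, o, e, (o - e) ** 2 / e)
        for label, o, e in zip(labels, observed, expected)
    ]
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Made-up example with three categories.
ranked = largest_contributors(["A", "B", "C"], [30, 50, 20], [20, 50, 30])
for label, o, e, component in ranked:
    direction = "above" if o > e else "below" if o < e else "equal to"
    print(f"{label}: observed {o} is {direction} expected {e}, component {component:.2f}")
```

Reporting the direction of each deviation alongside its size helps when describing which categories drive a significant result.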

two-way tables

first compute percents or proportions that describe the relationship of interest
then turn to formal inference

two different methods of generating data for two-way tables lead to the

chi-square test for homogeneity of populations and the chi-square test of association/independence

chi-square test for homogeneity of populations

independent SRSs are drawn from each of several populations
each observation is classified according to a categorical variable of interest
null hypothesis is that distribution of categorical variable is same for all of the populations

one common use of the chi-square test for homogeneity of populations is to compare several

population proportions
the null hypothesis is that all of the population proportions are equal
the alternative hypothesis is that they are not all equal but allows any other relationship among the population proportions

chi-square test of association/independence

a single SRS is drawn from a single population
observations are classified according to two categorical variables
null hypothesis is that there is no relationship between the row variable and the column variable

expected count

the expected count in any cell of a two-way table when Ho is true is
expected count = (row total * column total) / n
where n = sample size
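
A short sketch of the cell-by-cell calculation, with the table stored as a list of rows. The function name and the counts are illustrative assumptions.

```python
def expected_table(observed):
    """Expected count for each cell = (row total * column total) / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Made-up 2 x 2 table of observed counts.
table = [[20, 30],
         [30, 20]]
expected = expected_table(table)
print(expected)  # every row and column total is 50 and n = 100
```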

chi-square statistic

X^2 = ∑((O-E)^2)/E

where sum is over all r*c cells
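
For a two-way table the same sum runs over every cell. A sketch, assuming the table is a list of rows of observed counts; the names and data are hypothetical.

```python
def two_way_chi_square(observed):
    """Sum (O - E)^2 / E over all r * c cells, with E = row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            statistic += (o - e) ** 2 / e
    return statistic


# Made-up 2 x 2 table: each cell differs from its expected count of 25 by 5.
x2 = two_way_chi_square([[20, 30], [30, 20]])
print(x2)
```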

the chi-square test compares the value of the statistic X^2

with critical values from the chi-square distribution with (r-1)(c-1) degrees of freedom
r = the number of rows
c= number of columns
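
A sketch of that comparison at the 5% level. The small lookup table holds the standard upper-tail critical values for 1 to 6 degrees of freedom; the function names and the choice of significance level are assumptions for illustration.

```python
# Upper-tail 5% critical values of the chi-square distribution, indexed by df.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070, 6: 12.592}


def degrees_of_freedom(n_rows, n_cols):
    """(r - 1)(c - 1) for an r x c two-way table."""
    return (n_rows - 1) * (n_cols - 1)


def rejects_at_05(x2, n_rows, n_cols):
    """True when X^2 exceeds the 5% critical value for this table's df."""
    df = degrees_of_freedom(n_rows, n_cols)
    if df not in CRITICAL_05:
        raise ValueError("this sketch only tabulates df from 1 to 6")
    return x2 > CRITICAL_05[df]


print(degrees_of_freedom(3, 2))  # (3-1)*(2-1)
print(rejects_at_05(4.0, 2, 2))  # 4.0 against 3.841 (df = 1)
print(rejects_at_05(4.0, 3, 2))  # 4.0 against 5.991 (df = 2)
```

The same value of X^2 can be significant in a small table and not in a larger one, because the critical value grows with the degrees of freedom.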

p-value is the

area under the density curve to the right of X^2

larger values of X^2 are evidence against Ho

the chi-square distribution is an approximation to the distribution of

the statistic X^2

can safely use this approximation (aka the conditions) when all expected cell counts

are at least 1 and no more than 20% of all expected cell counts are less than 5

for an independence/association test the sample must be gathered

by an SRS from the population

for homogeneity all of the samples

must be independent SRSs from their respective populations

Suppose that you are dealing with a situation where there are several possible outcomes, not just 2 (success and failure). You are interested in seeing whether the proportion of outcomes falling into each of a certain set of categories is consistent with a certain hypothesized population distribution. What is the name of the test you use?

chi-square test for goodness of fit

Suppose that your hypothesized population distribution for the percent of objects that are certain colors is 20% black, 50% white, and 30% green. Suppose you draw a sample of 200, to test this hypothesis. What are the "expected" values that you use when you do the chi-square goodness of fit test?

40, 100, 60

In testing the hypothesis from the previous question (20% black, 50% white, 30% green; sample of 200), suppose your observed counts are 45, 90, and 65. What does chi-square equal for this goodness of fit test? Please write a numerical expression without calculating the result.

A. chi-square = (45-40)^2/40 + (90-100)^2/100 + (65-60)^2/60

Is there just one chi-square distribution, or a family of distributions, with one distribution for each number of degrees of freedom?

A. A family, with one distribution for each number of degrees of freedom.

How do you find the number of degrees of freedom for a chi-square goodness of fit test? For example, how many degrees of freedom would there be if you were looking at the proportion of blacks, whites, and greens as in the color example above?

A. The number of degrees of freedom is one less than the number of categories in the distribution; for example, when there are blacks, whites, and greens, the number of degrees of freedom is 3-1=2.
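
For readers who want to check their work on the color example, here is a sketch that carries the numbers through to a p-value. It relies on the fact that with exactly 2 degrees of freedom the right-tail area has the closed form exp(-X^2 / 2); that shortcut does not apply to other degrees of freedom.

```python
import math

proportions = [0.20, 0.50, 0.30]   # hypothesized: black, white, green
n = 200                            # sample size
observed = [45, 90, 65]

expected = [p * n for p in proportions]
x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Closed-form right-tail area, valid only because df == 2 here.
p_value = math.exp(-x2 / 2)

print(f"X^2 = {x2:.3f}, df = {df}, p-value = {p_value:.3f}")
```

The statistic comes out at about 2.04 with a p-value of about 0.36, so these observed counts are quite consistent with the hypothesized distribution.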

When you look up in a table or a calculator the P-value associated with a certain chi-square, what is that the probability of?

A. The probability of obtaining results as extreme as, or more extreme than, the ones you obtained, if the hypothesized distribution is true. (Extreme means deviant from what is expected.)

Is the chi-square distribution symmetrical? If not, in which direction is it skewed?

skewed to the right

When you are doing a chi-square test for goodness of fit, what are the hypothesis H0 and the alternative hypothesis Ha?

A. The H0 is that the population percents are equal to the set of hypothesized percents. The Ha is that the population percents do not equal that set of hypothesized percents.

What are the rule of thumb conditions for the use of the chi-square goodness of fit test?

A. All individual expected counts are at least 1 and no more than 20% of the expected counts are less than 5.

If a chi-square goodness of fit test yields a significant result, what should you inspect before you interpret the results?

A. You see which observed counts deviated the most from the expected ones -- in other words, you see which cells contributed the most to the chi-square that was calculated. You take these observations into account when interpreting your results.

Do two-way tables describe relationships between two categorical variables or between two continuous variables?

categorical

When there are multiple comparisons that can be made, what two steps are often carried out?

A. First an overall test for evidence of any differences among the parameters being compared, and then a follow-up analysis to decide which parameters differ and to estimate how large the differences are.

When doing a chi-square test to compare several proportions, the first step is to set up the table with the numbers in it being (proportions of success and number of trials, or counts of the number of cases falling into each category).

A. Counts of the number of cases falling into each category.

When there are two categorical variables being displayed in an r by c table (with r rows and c columns), each of the r x c possible categories into which the observations may fall is called a _____ of the table.

cell

When we are comparing the proportion of successes for three treatment conditions, what null hypothesis would we use?

A. That the proportion of successes is the same among all three conditions, i.e. that p1 = p2 = p3.

When comparing the proportion of successes for three treatment conditions, what would be the alternative hypothesis?

A. That not all the proportions are equal.

In testing Ho via chi-square with a two-way table, we compare the observed counts with the expected counts. Evidence against Ho consists of observed and expected counts that are far from each other or close to each other?

far from each other

How do you compute the expected count in a certain cell of a two-way table?

A. The expected count is the (row total * column total)/table total.

The calculation of the expected count for a cell of a two-way table assumes which relationship between the row and column variables: disjoint or independent?

independent

When you want to test the statistical significance of the deviation of observed from expected counts, in a two-way table, using chi-square, how do you compute the chi-square statistic?

A. chi-square is the summation of the (observed count - expected count)^2/expected count. The summation is over all r * c cells of the table.

Large values of chi-square are evidence for, or against Ho? Why?

A. Against. This is because chi-square grows as the observed counts deviate further from the counts that would be expected under Ho.

How many degrees of freedom do you have in a chi-square test with an r * c two-way table?

A. (r-1)(c-1)

True or False: when doing chi-square tests, the p-value is always the area under the distribution curve that is to the right of the observed chi-square, and never the area to the left.

A. True. For the chi-square distribution, the farther you go to the right, the more you have deviated from the null hypothesis. The value most consistent with the null hypothesis is 0, which is the left end of the distribution's domain. To get the probability of results as deviant as, or more deviant than, the obtained results, you look at the area under the curve to the right of the obtained results. (This includes the probability exactly at the obtained results, but since the chi-square distribution is continuous, the distinction between "above" and "at or above" is not meaningful.)

What cell counts are required for doing a chi-square test for homogeneity of populations?

A. The same as for tests of goodness of fit: all expected counts are 1 or greater, and no more than 20% of the expected counts are less than 5.

In the special case of a two-by-two table (r=2 and c=2), how many expected cell counts need to be 5 or greater in order to do a chi-square test?

all four of them

How many degrees of freedom would be used for a 3 by 2 table?

A. (3-1)*(2-1) = 2

After having done an overall test rejecting the hypothesis that all the proportions are equal, what should be done?

A. A follow-up analysis that asks which cells most contribute to the deviation from expectations under the null hypothesis. You can do this informally by observation; there are more formal methods that do significance tests and confidence intervals for the individual proportions.

True or False: the chi-square tests the hypothesis that "the row and column variables are not related to each other," even when it is difficult to conceive of the groups defined by the rows and columns as different populations, i.e. when you are dealing with the relation of some variables in one population.

true

True or False: for a chi-square test of association/independence of variables, you compute the expected counts just as in the other situations: the row total * column total/ table total.

true

True or False: converting table entries to percents is not necessary for the computation of chi-square, but it does help to shed light on the association among the variables.

true

For a chi-square test of association/independence of variables, what is the null hypothesis?

A. That the variables are independent, or that there is no association between them.

True or False: the distinction between tests of homogeneity of populations and tests of association/independence is that in the first, there is a sample from each of two or more populations, and in the second, there is a single sample from a single population.

A. True. (However, distinguishing whether there is one or more than one population involved in a study can be a debatable procedure. If you collect a sample of people, some of whom are wealthy and some of whom are poor, can you argue that you have sampled some individuals from the population of poor people and some from the population of rich people? Or have you drawn from one population of people, who simply differ in one variable? Fortunately, the chi-square test is done in the same way regardless of the outcome of such a debate.)

When there is a two-by-two table, and you wish to compare two proportions, how will a two-sided z test for equality of proportions and a chi-square test compare with respect to the p values that result?

A. The same p values will result.
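
A sketch of why the p-values agree: for a two-by-two table the chi-square statistic equals the square of the z statistic. This assumes the pooled two-proportion z test with no continuity correction; the success counts and function names are made up for illustration.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: p1 = p2 (no continuity correction)."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se


def chi_square_2x2(x1, n1, x2, n2):
    """Chi-square statistic for the table of successes and failures in two groups."""
    observed = [[x1, n1 - x1], [x2, n2 - x2]]
    row_totals = [n1, n2]
    col_totals = [x1 + x2, (n1 - x1) + (n2 - x2)]
    n = n1 + n2
    return sum(
        (observed[i][j] - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i in range(2)
        for j in range(2)
    )


# Made-up data: 20 successes out of 50 versus 30 successes out of 50.
z = two_proportion_z(20, 50, 30, 50)
chi2 = chi_square_2x2(20, 50, 30, 50)

p_z = math.erfc(abs(z) / math.sqrt(2))    # two-sided normal p-value
p_chi2 = math.erfc(math.sqrt(chi2 / 2))   # right-tail chi-square p-value, df = 1

print(z ** 2, chi2)
print(p_z, p_chi2)
```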

If there is a two-by-two table and you wish to compare two proportions, which test is usually recommended, between a z test and a chi-square, and why?

A. The z test, because it is tied to a confidence interval for the difference in proportions and it allows a one-sided test if desired.

(60 cards)