Chi-Squared test Flashcards by oisin mcelwain

What is meant by predicting categorical outcome variables?

Predicting which category an entity falls into.

What is used to measure categorical values numerically?

Frequencies: the number of entities that fall into each category.

What is the Chi-squared test used for?

Testing whether there is a relationship between two categorical variables.

What does the Chi-Squared test compare to assess this?

It is comparing the observed frequencies with the expected frequencies.

What formula does the chi-squared test use?

$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$, summed over every cell of the contingency table.
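
As a minimal sketch of that formula, the function below sums the squared differences over paired lists of cell frequencies. The name `chi_squared` and the counts are hypothetical, chosen only for illustration.

```python
def chi_squared(observed, expected):
    """Pearson chi-squared statistic for paired lists of cell frequencies."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    # Sum (observed - expected)^2 / expected over every cell.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical 2 x 2 table, flattened row by row (illustrative counts only).
observed = [10, 20, 30, 40]
expected = [12, 18, 28, 42]
print(round(chi_squared(observed, expected), 4))  # 0.7937
```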

How do you calculate the expected score?

expected = (row total × column total) / n, where n is the total number of observations.
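
A short sketch of this calculation for a whole contingency table, assuming the table is given as a list of rows. The function name `expected_frequencies` and the counts are hypothetical.

```python
def expected_frequencies(table):
    """Expected count for each cell: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)  # total number of observations
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical 2 x 2 table of observed counts (illustrative only).
table = [[10, 20], [30, 40]]
print(expected_frequencies(table))  # [[12.0, 18.0], [28.0, 42.0]]
```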

How do you calculate the degrees of freedom for the chi square?

df = (r − 1)(c − 1), where r is the number of rows and c is the number of columns in the contingency table.
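
For example, a 2 x 2 table has 1 degree of freedom and a 3 x 4 table has 6. A tiny sketch, assuming a rectangular table stored as a list of rows (the function name is hypothetical):

```python
def degrees_of_freedom(table):
    """(rows - 1) * (columns - 1) for a rectangular contingency table."""
    rows = len(table)
    cols = len(table[0])
    return (rows - 1) * (cols - 1)


print(degrees_of_freedom([[10, 20], [30, 40]]))  # 1
print(degrees_of_freedom([[1, 2, 3, 4]] * 3))    # 6
```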

In order to use the chi squared distribution with the chi-squared test, what is required?

The expected value in each cell must be greater than 5.

If the expected value is not greater than 5, what can be done?

Fisher’s exact test can be used.
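
To illustrate, here is a rough sketch of a two-sided Fisher's exact test for a 2 x 2 table, written with the standard library only. It assumes one common two-sided convention: add up the hypergeometric probabilities of every table with the same margins that is no more likely than the observed one. The function name and counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance so ties with the observed table are included.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )


# Hypothetical small-sample table (illustrative counts only).
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))  # 0.4857
```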

What is an alternative to the chi-squared statistic?

Likelihood ratio statistic

What does the Likelihood ratio statistic utilise?

It compares the probability of obtaining the observed data under a model fitted to those data with the probability of obtaining the same data under the null hypothesis.
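
For a contingency table this statistic is usually written as 2 × Σ observed × ln(observed / expected). A minimal sketch under that assumption, with a hypothetical function name and illustrative counts:

```python
from math import log


def likelihood_ratio(observed, expected):
    """Likelihood ratio statistic: 2 * sum(O * ln(O / E)) over all cells."""
    # Cells with an observed count of 0 contribute nothing to the sum.
    return 2 * sum(o * log(o / e) for o, e in zip(observed, expected) if o > 0)


# Hypothetical 2 x 2 table, flattened row by row (illustrative only).
observed = [10, 20, 30, 40]
expected = [12, 18, 28, 42]
print(likelihood_ratio(observed, expected))
```

With these counts the result is close to, but not identical to, the Pearson chi-squared value; the two statistics tend to agree more closely as samples get larger.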

What distribution does the Likelihood ratio use?

The chi-squared distribution.

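What error does the chi-squared statistic tend to make, when does it occur, and how can it be corrected?

In 2 x 2 tables the chi-squared statistic tends to produce significance values that are too small, so it tends to make a Type I error. This can be corrected by using Yates' continuity correction, which subtracts 0.5 from the absolute value of each (observed − expected) difference before squaring it.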
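
A sketch of Yates' correction, which shrinks each absolute difference between observed and expected by 0.5 before squaring. As an added assumption, the adjusted difference is floored at 0 so that the correction never exceeds the difference itself. The function name and counts are hypothetical.

```python
def chi_squared_yates(observed, expected):
    """Chi-squared with Yates' continuity correction (intended for 2 x 2 tables)."""
    total = 0.0
    for o, e in zip(observed, expected):
        # Shrink the absolute difference by 0.5, but not below zero.
        adjusted = max(abs(o - e) - 0.5, 0.0)
        total += adjusted ** 2 / e
    return total


# Hypothetical 2 x 2 table, flattened row by row (illustrative only).
observed = [10, 20, 30, 40]
expected = [12, 18, 28, 42]
print(round(chi_squared_yates(observed, expected), 4))  # 0.4464
```

The corrected value is smaller than the uncorrected statistic for the same table, which makes the test more conservative.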

What assumptions does the chi-sq test carry? (3)

(1) Independence of cases: each person, item or entity must contribute to only one cell of the contingency table. (2) In 2 x 2 tables, no expected value should be below 5. (3) In larger tables, no more than 20% of the expected values should be below 5, and all expected values should be greater than 1.
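
The expected-value rules can be checked mechanically. The sketch below is one possible reading of them, assuming the expected values are given as a list of rows; the function name and example values are hypothetical. Independence of cases cannot be checked from the table alone.

```python
def expected_counts_ok(expected):
    """Check the expected-value rules for a table of expected frequencies."""
    cells = [e for row in expected for e in row]
    is_2x2 = len(expected) == 2 and all(len(row) == 2 for row in expected)
    if is_2x2:
        # 2 x 2 tables: no expected value below 5.
        return all(e >= 5 for e in cells)
    # Larger tables: at most 20% of cells below 5, and every cell above 1.
    below_5 = sum(1 for e in cells if e < 5)
    return below_5 / len(cells) <= 0.20 and all(e > 1 for e in cells)


# Hypothetical expected values (illustrative only).
print(expected_counts_ok([[12, 18], [28, 42]]))           # True
print(expected_counts_ok([[3, 4], [8, 12], [9, 14]]))     # False
```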

What is the result of not meeting this expected values assumption?

It leads to a reduction in the power of the test.

What is meant by the residual?

The residual is the difference between the observed frequency and the expected frequency in a cell (observed − expected).

How is the standardised residual calculated?

(observed − expected) / sqrt(expected)
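
A small sketch that applies this to every cell, assuming the observed and expected tables are lists of rows with the same shape. The function name and counts are hypothetical.

```python
from math import sqrt


def standardised_residuals(observed, expected):
    """(observed - expected) / sqrt(expected) for each cell of the table."""
    return [
        [(o - e) / sqrt(e) for o, e in zip(o_row, e_row)]
        for o_row, e_row in zip(observed, expected)
    ]


# Hypothetical 2 x 2 table and its expected values (illustrative only).
observed = [[10, 20], [30, 40]]
expected = [[12, 18], [28, 42]]
for row in standardised_residuals(observed, expected):
    print([round(z, 3) for z in row])
```

A negative value means the cell holds fewer cases than expected, and a positive value means it holds more.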

How are individual standardised residuals directly related to the test statistic?

The chi-squared statistic is the sum of the squared standardised residuals.
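
The relationship follows directly from the definitions. Writing $O_{ij}$ and $E_{ij}$ for the observed and expected frequencies in cell $(i, j)$ and $z_{ij}$ for the standardised residual (this notation is introduced here only for illustration):

```latex
z_{ij} = \frac{O_{ij} - E_{ij}}{\sqrt{E_{ij}}}
\quad\Longrightarrow\quad
\sum_{i,j} z_{ij}^{2}
  = \sum_{i,j} \frac{(O_{ij} - E_{ij})^{2}}{E_{ij}}
  = \chi^{2}
```

So each cell's squared standardised residual is that cell's contribution to the test statistic.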

What is used to give an effect size for the chi-squared test?

Cramér's V.
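
Cramér's V is commonly computed as the square root of chi-squared divided by n × (k − 1), where k is the smaller of the number of rows and columns. A minimal sketch under that definition, with a hypothetical function name and illustrative inputs:

```python
from math import sqrt


def cramers_v(chi2, n, rows, cols):
    """Cramér's V from a chi-squared value, sample size and table shape."""
    k = min(rows, cols)
    return sqrt(chi2 / (n * (k - 1)))


# Hypothetical values: chi-squared of about 0.794 from a 2 x 2 table, n = 100.
print(round(cramers_v(200 / 252, 100, 2, 2), 3))  # 0.089
```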

In odds tables, what is usually used as the effect size?

The odds ratio. The odds of an event are the number of times it occurred divided by the number of times it did not occur.

What is the actual odds ratio?

The ratio of two odds: the odds of event A divided by the odds of event B.
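
A brief worked sketch, assuming the counts of occurrences and non-occurrences are known for each event. The function names and counts are hypothetical.

```python
def odds(occurred, did_not_occur):
    """Odds: times the event occurred / times it did not occur."""
    return occurred / did_not_occur


def odds_ratio(a_yes, a_no, b_yes, b_no):
    """Odds of event A divided by the odds of event B."""
    return odds(a_yes, a_no) / odds(b_yes, b_no)


# Hypothetical counts: A occurred 30 times and not 10; B occurred 20 and not 40.
print(odds(30, 10))                # 3.0
print(odds(20, 40))                # 0.5
print(odds_ratio(30, 10, 20, 40))  # 6.0
```

An odds ratio of 6 here would mean the odds of A are six times the odds of B.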

When is the phi test accurate?

For 2 x 2 contingency tables, where it measures the strength of association on a scale from 0 to 1.
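
For a 2 x 2 table [[a, b], [c, d]], phi can be computed directly from the four cell counts. The sketch below takes the absolute value, an assumption made here so that the result stays on the 0 to 1 scale; it then equals the square root of chi-squared divided by n. The function name and counts are hypothetical.

```python
from math import sqrt


def phi_coefficient(a, b, c, d):
    """Phi for the 2 x 2 table [[a, b], [c, d]], reported as a magnitude."""
    numerator = a * d - b * c
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return abs(numerator) / denominator


# Hypothetical tables (illustrative counts only).
print(round(phi_coefficient(10, 20, 30, 40), 3))  # 0.089
print(phi_coefficient(10, 0, 0, 10))              # 1.0
```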

What should be used instead of phi for tables larger than 2 x 2?

Contingency coefficient

What shortcoming does the contingency coefficient have, and what attempts to correct it?

It ranges from 0 to 1 but seldom reaches its upper limit; Cramér's V corrects for this.
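
The upper-limit problem is easy to see numerically. The contingency coefficient is commonly defined as the square root of chi-squared / (chi-squared + n). In a 2 x 2 table with a perfect association, chi-squared equals n, so the coefficient stops at about 0.707 rather than 1. A sketch under that definition (the function name is hypothetical):

```python
from math import sqrt


def contingency_coefficient(chi2, n):
    """Pearson's contingency coefficient: sqrt(chi2 / (chi2 + n))."""
    return sqrt(chi2 / (chi2 + n))


# Perfect association in a 2 x 2 table with n = 100: chi-squared equals n.
print(round(contingency_coefficient(100, 100), 3))  # 0.707
```

For the same table, Cramér's V would be sqrt(100 / (100 × 1)) = 1, which is the rescaling the card describes.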

If expected values are below 5 what are the recommended options if you have more than 2 variables? (4)

(1) Collapse the data across one of the variables (preferably the one least likely to have an effect); (2) collapse levels of one of the variables; (3) collect more data; (4) accept the loss of power.

(25 cards)