Chi-square tests Flashcards

Chi-square test of association and goodness of fit, binomial and sign tests

1
Q

What are hypotheses?

A
Testable statements (not questions) which predict a relationship between variables
Variables need to be named precisely and consistently
2
Q

What are two assumptions made by parametric tests, and what happens when these are violated?

A

Assume normal distribution and at least interval level data

Violation leads to erroneous interpretation of data

3
Q

When would you use a non-parametric test?

A

When the data are not normally distributed, or when the data are categorical (nominal/ordinal) rather than at least interval level

4
Q

Why should non-parametric tests be used with care?

A

They are less powerful than parametric tests and can fail to detect real differences, especially when the sample size is small. A larger sample is needed to detect smaller effects and demonstrate significance

5
Q

What type of test is chi-square? What assumptions are made when using it?

A

Non-parametric
Categorical data (coded as numbers), frequencies in each category
Assumes categories are mutually exclusive and observations are independent, i.e. each participant contributes to only one cell
Expected frequencies should be at least 5 in each cell of the contingency table

6
Q

How can the chi-square test of association/independence be used?

A

To investigate whether two categorical variables are associated, e.g. is gender associated with smoking frequency?
Compare observed frequencies to expected frequencies predicted from the null hypothesis

7
Q

What is the calculation used to find expected frequencies?

A

Expected frequency = (column frequency total × row frequency total) / total sample size
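
A minimal sketch of this calculation in plain Python. The function name `expected_frequencies` and the 2x2 counts are hypothetical, chosen only to illustrate the formula.

```python
def expected_frequencies(observed):
    """Expected count for each cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Hypothetical counts: rows = two gender groups, columns = smoker / non-smoker
observed = [[30, 70],
            [20, 80]]
expected = expected_frequencies(observed)
print(expected)  # [[25.0, 75.0], [25.0, 75.0]]
```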

8
Q

What is the formula for the chi-square calculation?

A

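chi-square = SUM((observed frequency - expected frequency)² / expected frequency), summed over every cell
A bigger chi-square value represents greater divergence from the null, i.e. stronger evidence of an association (the value also grows with sample size, so it is not itself a measure of effect size)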
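
As a rough illustration, the statistic can be computed cell by cell. The helper name `chi_square` and the counts are assumptions made for this example, not part of the original card.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over matching cells, given as flat lists."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical 2x2 table flattened row by row, with its expected counts
observed = [30, 70, 20, 80]
expected = [25.0, 75.0, 25.0, 75.0]
stat = chi_square(observed, expected)
print(round(stat, 3))  # 2.667
```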

9
Q

In order to assess significance of our chi-square value we need two things: our degrees of freedom and our alpha value, and we can use these to find the CRITICAL VALUE which our obtained value needs to exceed to be significant. How do you calculate degrees of freedom for association tests?

A

df = (R-1) x (C-1) where R = rows and C = columns
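
A small sketch of the df calculation and the comparison against a critical value. The lookup table holds the standard published chi-square critical values for alpha = 0.05; the function names are hypothetical.

```python
# Standard chi-square critical values at alpha = 0.05, keyed by df
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def association_df(n_rows, n_cols):
    """Degrees of freedom for a test of association: (R - 1) * (C - 1)."""
    return (n_rows - 1) * (n_cols - 1)

def is_significant(stat, df, table=CRITICAL_05):
    """True if the obtained chi-square exceeds the critical value for df."""
    return stat > table[df]

df = association_df(2, 2)
print(df)                                         # 1
print(is_significant(2.667, df))                  # False (2.667 < 3.841)
print(is_significant(6.0, association_df(2, 3)))  # True (6.0 > 5.991)
```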

10
Q

When would we use the Chi-Square Goodness of Fit test?

A

On UNRELATED data i.e. where every participant yields data for one single category and we are comparing different levels of ONE VARIABLE
Can be used to compare PROPORTIONS of a population distribution e.g. if there is a gender bias in the computing department

We essentially want to see whether data depart significantly from a theoretical distribution, often one that has more than just two theoretical values i.e. not just 50:50

11
Q

How do we calculate the goodness of fit?

A

We once again use observed and expected frequencies
Observed is the number of participants counted in each individual category, e.g. each gender category
Expected is the frequencies predicted by the null hypothesis, which vary depending on the specific null:
If we are testing whether the genders are equal, expected would be 50:50
If we are testing whether the ratio in one department is representative of the population ratio, we would have to know the population proportions and multiply the sample size by those proportions
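
The two kinds of null hypothesis can be sketched as two small helpers. The function names, the sample size of 40 and the 25:75 population split are all hypothetical values for illustration.

```python
def expected_equal(n, k):
    """Null of equal frequencies: N / k cases expected in each of k categories."""
    return [n / k] * k

def expected_from_proportions(n, proportions):
    """Null of known population proportions: N * p expected in each category."""
    return [n * p for p in proportions]

n = 40  # hypothetical department size
print(expected_equal(n, 2))                        # [20.0, 20.0]
print(expected_from_proportions(n, [0.25, 0.75]))  # [10.0, 30.0]
```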

12
Q

What is the Goodness of Fit formula?

A

Same as for association: chi-square = SUM(((observed frequencies - expected frequencies) squared)/expected frequencies)

13
Q

Why do we divide by expected frequency?

A

So that each squared difference is scaled relative to the size of its expected frequency, e.g. a deviation of 5 matters far more when 10 were expected than when 1000 were expected

14
Q

How do we calculate the degrees of freedom for goodness of fit tests?

A

df = C - 1 where C is the number of categories
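
Putting the pieces together, a goodness of fit test might look like the sketch below. The counts are invented, the null is assumed to be equal frequencies, and 3.841 is the standard critical value for df = 1 at alpha = 0.05.

```python
# Hypothetical counts for two categories in one department
observed = [28, 12]

# Null hypothesis assumed here: equal frequencies in every category
expected = [sum(observed) / len(observed)] * len(observed)

stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1  # C - 1

print(stat, df)      # 6.4 1
print(stat > 3.841)  # True: exceeds the critical value for df = 1
```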

15
Q

In a goodness of fit test, what happens if we have more than 2 categories?

A

We cannot determine exactly which category difference made the chi-square value significant, so we simply say that the obtained frequencies differed from those expected under the null hypothesis

16
Q

When would we use a Binomial test?

A

When a measurement classifies items into one of TWO categories and we are dealing with data in the form of frequencies (similar to chi-square, but chi-square can have more than two categories)

17
Q

What are two assumptions made by the Binomial test?

A

Independence of observations, and at least 10 observations in each category

18
Q

What kind of question would be answered by a Binomial test?

A

“Does the proportion of coin tosses falling into group A (heads) differ from the proportion falling into group B (tails)?”
Probabilities associated with each group can be considered p and q, and we say that, assuming the null hypothesis to be true, p=p(heads)=0.5 and q=p(tails)=0.5
If the observed proportions are 0.8 and 0.2 and the test gives a p-value <0.05, we reject the null hypothesis and can say that the coin is biased
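
A sketch of an exact two-sided binomial test for the coin example, assuming 20 tosses with 16 heads (observed proportion 0.8). The function name and the "sum every outcome no more likely than the observed one" definition of two-sidedness are choices made for this illustration.

```python
from math import comb

def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided p-value: total probability of all outcomes
    that are no more likely than the observed count k under the null."""
    pmf = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small tolerance so equally likely outcomes are not lost to rounding
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-9)))

# Hypothetical data: 16 heads in 20 tosses, null hypothesis p = 0.5
p_value = binomial_two_sided_p(16, 20)
print(round(p_value, 4))  # 0.0118
print(p_value < 0.05)     # True: reject the null of a fair coin
```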

19
Q

When can the Sign Test be used?

A

To compare two conditions in a REPEATED MEASURES design
Doesn’t measure how much change has occurred but does demonstrate DIRECTION e.g. better/worse, increase/decrease
Calculation procedure is the same as for the binomial test
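
A minimal sign test sketch on invented before/after scores. It assumes the common convention of dropping tied pairs, then applies a two-sided binomial calculation with p = 0.5 to the counts of increases and decreases.

```python
from math import comb

def sign_test(before, after):
    """Return (increases, decreases, two-sided p) for paired scores."""
    diffs = [a - b for b, a in zip(before, after)]
    pos = sum(d > 0 for d in diffs)
    neg = sum(d < 0 for d in diffs)
    n = pos + neg                 # ties are dropped (assumed convention)
    k = min(pos, neg)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return pos, neg, min(1.0, 2 * tail)

# Hypothetical repeated-measures scores for 10 participants
before = [5, 6, 4, 7, 5, 6, 3, 5, 6, 4]
after = [7, 8, 5, 9, 6, 5, 5, 5, 8, 6]
pos, neg, p_value = sign_test(before, after)
print(pos, neg)           # 8 1
print(round(p_value, 4))  # 0.0391
```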

20
Q

How do you calculate expected frequencies in a one-row chi-square analysis?

A

N/k where N is the total number of cases and k is the number of cells to average across (this applies when the null hypothesis predicts equal frequencies in every cell)

21
Q

How can effect size be estimated for 2x2 chi-square analyses?

A

Using the phi coefficient or Cramér’s V
Cohen produced some effect size conventions for Cramér’s V that depend on degrees of freedom, e.g. if df=1, a small effect would be 0.1, medium would be 0.3 and large would be 0.5
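
A sketch of the usual Cramér's V formula, V = sqrt(chi-square / (N × (min(R, C) - 1))), which for a 2x2 table equals the absolute value of phi. The chi-square value and sample size below are hypothetical.

```python
from math import sqrt

def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V from a chi-square value, total N and table dimensions."""
    return sqrt(chi2 / (n * (min(n_rows, n_cols) - 1)))

# Hypothetical 2x2 result: chi-square = 8/3 on N = 200 cases
v = cramers_v(8 / 3, 200, 2, 2)
print(round(v, 3))  # 0.115, a small effect by the df=1 conventions
```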

22
Q

When can a goodness of fit test sometimes be used?

A

To estimate whether a large sample approximates to a normal distribution (a requirement for some significance tests)
This involves calculation of z-scores
We can see whether observed frequencies depart from the ideal frequencies to a significant degree

23
Q

What can you do if you have unavoidably low frequencies in a chi-square?

A

Use Fisher’s Exact Test which calculates probability straight from frequencies
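
For a 2x2 table, the exact probability can be sketched with the hypergeometric distribution. The function name, the counts, and the two-sided rule (sum all tables no more likely than the observed one) are assumptions made for this example.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability of x in the top-left cell with all margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical small-sample table with low expected frequencies
p_value = fisher_exact_2x2(3, 1, 1, 3)
print(round(p_value, 4))  # 0.4857
```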

24
Q

How could you avoid low expected frequencies?

A

1) Avoid low samples for one category e.g. be wary of gathering data on left-handers
2) Avoid low samples overall - try to aim for above 20
3) Obtain significance at p less than or equal to 0.01 - decision to reject the null at this level is far more secure, far less chance of type 1 errors

25
Q

What must always be done when interpreting chi-square analyses?

A

ALWAYS use two-tailed values for chi-square even if the hypothesis was directional