Lecture 14 - Analysing Categorical Data Flashcards

Question 1

Q

What is analysis of categorical data?

Answer

A

We sometimes want to predict which category someone falls into
E.g. traitor or faithful
We can create a contingency table and perform a chi-square test on the data
Do people fall into a category more often than we expect them to?

Question 2

Q

What are contingency tables?

Answer

A

A table of frequencies for how often an observation occurs in a category (how many people chose which option)
Categories must be mutually exclusive (no overlap) and exhaustive (every observation fits into one of the categories)
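As a minimal sketch of how such a table can be tallied, the Python below counts one (country, role) pair per person. The observations and variable names are hypothetical and only illustrate the idea.

```python
from collections import Counter

# Hypothetical observations: one (country, role) pair per person.
observations = [
    ("UK", "Faithful"), ("UK", "Faithful"), ("UK", "Traitor"),
    ("USA", "Traitor"), ("USA", "Traitor"), ("USA", "Faithful"),
]

# Each person is counted exactly once, so the cells do not overlap.
cell_counts = Counter(observations)

rows = sorted({country for country, _ in observations})
cols = sorted({role for _, role in observations})

# Build the table as a list of rows, with one frequency per cell.
table = [[cell_counts[(r, c)] for c in cols] for r in rows]

print(rows, cols)
for row in table:
    print(row)
```

Because every person appears in exactly one cell, the cell counts add up to the sample size.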

Question 3

Q

What is a Chi-square test?

Answer

A

Devised by Karl Pearson in 1900, also known as Pearson’s chi-square
Compares how often observations actually fall into each category with how often they would be expected to by chance

Question 4

Q

What is the null hypothesis in a chi-square test?

Answer

A

The observed frequencies do not differ from those expected by chance

Question 5

Q

What is the alternative hypothesis in a chi-square test?

Answer

A

The frequencies observed reflect real differences in categories

Question 6

Q

What are the assumptions of a chi-square test?

Answer

A

Independence - each person can only contribute to one cell of a contingency table
Expected frequencies - all expected counts should be greater than 1 and no more than 20% of expected counts should be less than 5
If violated, power is reduced
The terms ‘values’, ‘frequencies’ and ‘counts’ are used interchangeably
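The expected-frequency rule can be checked mechanically. This is a small sketch using the thresholds stated on this card; the function name `expected_counts_ok` is made up for illustration.

```python
def expected_counts_ok(expected):
    """Check the expected-frequency assumption for a flat list of expected counts."""
    # All expected counts should be greater than 1.
    all_above_one = all(e > 1 for e in expected)
    # No more than 20% of expected counts should be less than 5.
    share_below_five = sum(e < 5 for e in expected) / len(expected)
    return all_above_one and share_below_five <= 0.20

print(expected_counts_ok([10, 10, 10, 10]))  # every cell is comfortably large
print(expected_counts_ok([4, 10, 10, 10]))   # 1 of 4 cells (25%) is below 5
```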

Question 7

Q

What happens when expected frequencies are violated?

Answer

A

Results in a loss of power
Several options:
Use an ‘Exact’ test instead (e.g. Fisher’s or MLR)
Collapse/remove data across one variable
Collapse levels of one variable
Collect more data
Accept the loss of power
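For the first option, Fisher's exact test on a 2 x 2 table can be sketched with the standard library alone. This is an illustrative implementation under one common two-sided convention (summing the probabilities of all tables no more likely than the observed one), not necessarily the exact procedure SPSS uses. The function name and the example counts are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is no more likely than the observed one
    # (the small tolerance guards against floating-point ties).
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-7))

# Hypothetical small-sample table where expected counts fall below 5.
print(fisher_exact_two_sided(3, 1, 1, 3))
```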

Question 8

Q

How do you calculate a chi-square test by hand for one IV?

Answer

A

Three steps:
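(1) Calculate expected frequencies (if every category is equally likely by chance: total N / number of categories)
(2) Calculate the chi-square value from the observed and expected frequencies: the sum across categories of (observed - expected)² / expected
(3) Compare the chi-square value against a critical values table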
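The three steps can be sketched in Python as follows. The counts (39 Traitors, 21 Faithfuls) are hypothetical, and the sketch assumes each category is equally likely by chance; 3.841 is the tabled critical value for 1 degree of freedom at alpha = .05.

```python
def chi_square_one_way(observed):
    """Chi-square for one categorical variable, assuming equal expected counts."""
    n = sum(observed)
    k = len(observed)
    # Step 1: expected frequency per category if choices were down to chance.
    expected = [n / k] * k
    # Step 2: sum of (observed - expected)^2 / expected over the categories.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Degrees of freedom: number of categories minus one.
    df = k - 1
    return chi2, df, expected

# Hypothetical counts: 39 Traitors and 21 Faithfuls.
chi2, df, expected = chi_square_one_way([39, 21])

# Step 3: compare with the tabled critical value for df = 1 at alpha = .05.
CRITICAL_DF1_ALPHA_05 = 3.841
print(round(chi2, 2), df, chi2 > CRITICAL_DF1_ALPHA_05)
```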

Question 9

Q

How do you interpret chi-square critical values tables?

Answer

A

To interpret the table we need to know our degrees of freedom, and our desired alpha value
Degrees of freedom = number of categories - 1
Reject H0 when χ² (observed) > χ² (critical)
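A table lookup of this kind might be sketched as below. Only a small excerpt of the standard chi-square critical values is included (df 1 to 4, alpha .05 and .01); the names `CRITICAL` and `reject_null` are hypothetical.

```python
# Excerpt of a chi-square critical values table: CRITICAL[alpha][df].
CRITICAL = {
    0.05: {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488},
    0.01: {1: 6.635, 2: 9.210, 3: 11.345, 4: 13.277},
}

def reject_null(chi2_observed, df, alpha=0.05):
    """Reject H0 when the observed chi-square exceeds the critical value."""
    return chi2_observed > CRITICAL[alpha][df]

# An observed value of 5.4 with df = 1 is significant at .05 but not at .01.
print(reject_null(5.4, df=1, alpha=0.05))
print(reject_null(5.4, df=1, alpha=0.01))
```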

Question 10

Q

How do you calculate a chi-square test by hand for two IVs?

Answer

A

With two IVs, the difference lies in how the expected frequencies are calculated
Each cell of the contingency table needs its own expected frequency: (row total × column total) / grand total
Degrees of freedom = (number of rows - 1) × (number of columns - 1)
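A sketch of the two-IV calculation, with each cell's expected frequency taken as (row total × column total) / grand total. The table values are hypothetical, and the function does not apply any continuity correction.

```python
def chi_square_two_way(table):
    """Pearson chi-square for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected frequency of each cell: row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # Sum (observed - expected)^2 / expected over every cell.
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    # Degrees of freedom: (rows - 1) * (columns - 1).
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, expected

# Hypothetical counts: rows are UK / USA, columns are Faithful / Traitor.
chi2_2way, df_2way, expected_2way = chi_square_two_way([[20, 10], [10, 20]])
print(round(chi2_2way, 2), df_2way, expected_2way)
```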

Question 11

Q

How do you conduct a chi-square test for one IV in SPSS?

Answer

A

For one IV
Analyse -> non-parametric tests -> legacy dialogs -> chi-square

Question 12

Q

How do you conduct a chi-square test for two IVs in SPSS?

Answer

A

For two IVs
Analyse -> descriptive statistics -> crosstabs (need to click ‘statistics’ to ask for Chi-square test)

Question 13

Q

How do you report the chi-square test for one IV?

Answer

A

E.g. “The number of people choosing to be Traitors or Faithfuls can be seen in Table/Figure ‘X’. This distribution is significantly different to chance (χ²(1) = 5.4, p = .02).”

Question 14

Q

How do you report the chi-square test for two IVs?

Answer

A

E.g. “There was a significant association between the role people preferred and where they were from (χ²(1) = 5.44, p = .02, Cramér’s V = .301). Whilst people from the UK preferred to be a Faithful, people from the USA preferred to be a Traitor.”
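The effect size in a report like this, Cramér's V, is the square root of χ² / (N × min(rows - 1, columns - 1)). A minimal sketch follows; the inputs are hypothetical and are not the data behind the example report.

```python
import math

def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V effect size for a chi-square test on an n_rows x n_cols table."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))

# Hypothetical 2 x 2 table with 60 people and a chi-square value of 20/3.
v = cramers_v(20 / 3, n=60, n_rows=2, n_cols=2)
print(round(v, 3))
```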

Question 15

Q

What is a binomial test?

Answer

A

Compares observed and expected frequencies for variables with only two levels
E.g. Are there more participants in our sample from the USA than what we would expect by chance?
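An exact binomial test can be sketched with the standard library. The version below uses one common two-sided convention (summing the probabilities of all outcomes no more likely than the observed one), which may differ from what SPSS reports when the chance proportion is not .5. The function name and the counts are hypothetical.

```python
from math import comb

def binomial_test_two_sided(successes, n, p=0.5):
    """Exact two-sided binomial test against a chance proportion p."""
    # Probability of every possible count from 0 to n under the null.
    pmf = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[successes]
    # Sum the outcomes that are no more likely than the observed count
    # (the small tolerance guards against floating-point ties).
    return sum(q for q in pmf if q <= observed * (1 + 1e-7))

# Hypothetical sample: 9 of 10 participants are from the USA, chance = .5.
p_value = binomial_test_two_sided(9, 10)
print(round(p_value, 4))
```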

Question 16

Q

How do you conduct a binomial test in SPSS?

Answer

A

AnalyseNon-parametric testsLegacy dialogsBinomial

Question 17

Q

When should binomial tests be performed compared to chi-square?

Answer

A

Binomial tests should be performed on a single variable with two levels; chi-square tests should be performed when a variable has more than two levels, or when looking at the association between two variables