lecture 16 - chi-square and contingency tables Flashcards

1
Q

Chi-squared and contingency tables

A

mainly used when you have 2 nominal variables
- Contingency tables - test the independence of two nominal variables
- Chi-square can also be used in the “Goodness-of-fit Test” (not covered on the current course) - tests differences between the categories of a single nominal variable

2
Q

A two-by-two contingency table

A

Two lecturers (Jones & Brown) each offered two courses in one year. If the popularity of the lecturers is independent of the popularity of the courses then:
A - any difference in the popularity of the courses should be the same for both lecturers, i.e. J1 - J2 = B1 - B2.
B - any difference in the popularity of the lecturers should be the same for both courses, i.e. J1 - B1 = J2 - B2.

1st nominal variable is the lecturer
2nd nominal variable is the course

we are asking if the variables are independent, i.e. what happens on one variable doesn't affect what happens on the other. if there is a difference in the popularity of the courses and that difference is the same for both lecturers, the variables are independent

3
Q

examples of no contingency between two nominal variables

A

1 - no matter which course, Jones is more popular - but the variables are independent, as the difference in popularity is the same for both courses
2 - Jones and Brown have the same popularity; course 2 is more popular than course 1. The variables are still independent, as the difference is the same for both lecturers

4
Q

examples of a contingency between two nominal variables - not independent

A

1 - the popularity of the courses reverses as you swap between lecturers, e.g. each lecturer has a speciality in a different course while having the same overall popularity

5
Q

A two-by-two contingency table

A

are these variables independent of each other?

Two lecturers (Jones & Brown) each offered two courses in one year. The numbers of students signing up to the two courses seemed to be neither a function of the popularity of Jones or Brown, nor of the content of the course, but of which course Jones and Brown taught.

100 on course 1
200 on course 2

we measure their popularity by asking how many people sign up to go to each lecturer's lectures on each course, assuming every student can only sign up to one course

6
Q

Does the popularity of Jones and Brown
depend on which course they teach?

A

course 2 - Jones and Brown are equally popular

course 1 - Brown has more people than Jones

the variables are not entirely independent: the difference in the popularity of the lecturers is not exactly the same for both courses

we need to know what would have happened by chance

7
Q

Observed frequencies and generating expected frequencies

A

H0 (null hypothesis) - attendance does not depend on who gives the course.

Observed frequencies:

             Jones   Brown   Row total
Course 1      20      80     100 (1/3 of total)
Course 2     100     100     200 (2/3 of total)
Column total 120     180     300

8
Q

if H0 is true

A

1/3 of Jones’ total students should attend course 1
2/3 of Jones’ total students should attend course 2
1/3 of Brown’s total students should attend course 1
2/3 of Brown’s total students should attend course 2

9
Q

expected frequencies

A

             Jones            Brown             Row total
Course 1     1/3 x 120 = 40   1/3 x 180 = 60    100 (1/3)
Course 2     2/3 x 120 = 80   2/3 x 180 = 120   200 (2/3)
Column total 120              180               300

These are the frequencies we would expect if the popularity of Jones and Brown is independent of the popularity of course 1 and course 2, i.e. what we would expect by chance if the null hypothesis were true.
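
A minimal Python sketch (my own illustration, not from the lecture) that reproduces these expected frequencies:

```python
# Expected frequencies under H0: each lecturer's students should split
# across the two courses in the same 1/3 : 2/3 proportions as the
# course (row) totals.
jones_total, brown_total = 120, 180              # column totals
course1_total, course2_total, n = 100, 200, 300  # row totals and grand total

expected = {
    ("Jones", "course 1"): course1_total * jones_total / n,   # 40.0
    ("Jones", "course 2"): course2_total * jones_total / n,   # 80.0
    ("Brown", "course 1"): course1_total * brown_total / n,   # 60.0
    ("Brown", "course 2"): course2_total * brown_total / n,   # 120.0
}
print(expected)
```

The same arithmetic is stated as a general rule on a later card (E = Row Total × Column Total / Overall Total).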

10
Q

chi-squared

A

Clearly observed & expected differ. We could evaluate this as (O - E), but squaring (O - E) gives a “sensible” measure of departure - otherwise Σ(O - E) = 0 in all cases.
The “importance” of (O - E) also depends on how large it is relative to E, so we also need to scale (O - E)^2 by the size of the expected frequency. This is more informative than the absolute difference.

11
Q

increase sample

A

with a larger sample (e.g. 1003 students) the differences go away, so we need to scale our measure of difference from chance as a function of how big the expected frequencies are overall

12
Q

the chi-square (χ^2) statistic

A

χ^2 = ∑ (O - E)^2 / E

O has to be a whole number

E, the expected value for that cell, doesn't have to be a whole number

  • O = Observed frequency
  • E = Expected frequency
  • (O - E)
    • Measures departure of O from E.
  • (O - E)^2
    • Prevents departures above & below cancelling out.
  • Dividing by E
    • Scales the degree of departure (O - E) by the size of the expected frequency.

how to compute is in notes
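
A hedged Python sketch (my own, not the lecture's) that applies this formula to the Jones/Brown observed and expected frequencies from the earlier cards:

```python
# A direct translation of chi^2 = sum of (O - E)^2 / E over all cells.
observed = [20, 100, 80, 100]        # J1, J2, B1, B2
expected = [40.0, 80.0, 60.0, 120.0]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_sq, 2))  # 25.0
```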

13
Q

Testing significance

A
  • Degrees-of-freedom
    (No. of rows - 1) × (No. of columns - 1)
    df = (2 - 1) × (2 - 1) = 1
  • Critical value = 3.84
    “Lecturer popularity depends on the course they teach
    (χ^2(1) = 25, p < .05).”

the bigger the difference from chance, the bigger chi-squared will be; if the null hypothesis is true we would expect a small value of chi-squared

we check whether the chi-squared value is larger than the critical value - if it is, we reject the null hypothesis

if the null hypothesis is true, you expect the difference on one variable to be the same at each level of the other variable
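
A hedged sketch of this decision rule in Python; using scipy for the critical value is my substitution for the lecture's printed tables:

```python
# Look up the df = 1 critical value (alpha = .05) and make the decision.
from scipy.stats import chi2

df = (2 - 1) * (2 - 1)          # (rows - 1) x (columns - 1) = 1
critical = chi2.ppf(0.95, df)   # ~3.84
chi_sq = 25.0                   # from the previous card

print(round(critical, 2))       # 3.84
print(chi_sq > critical)        # True, so reject H0
```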

14
Q

Assumptions of the χ^2 test - health warning

A
  • Each observation must be independent of the others.
    • E.g. one observation each from a number of subjects OR a number of observations from a single subject.
  • Data must be frequencies (i.e. counts of observations), NOT percentages or proportions.
  • Expected frequencies should be 5 or more.
    • Not a hard-and-fast rule, but tables of critical values become inaccurate when expected frequencies are low - especially when the total number of observations is low (less than 20) and/or df = 1. The lower your expected values are, the less accurate your p value is.
    • Best to be careful when such conditions arise and consider an alternative test (or a larger sample).
    • You may see references to the “Yates Correction” for when df = 1. It is covered in the text, but don’t worry too much: it just makes the test a little more conservative, and it doesn’t correct the problem.
15
Q

General rule for expected frequencies

A
  • The expected frequency for each cell is given by:
    E = Row Total × Column Total / Overall Total
    Note: you can have as many categories per factor as you like - this works for any table.

Use the row total for the row the cell is in and the column total for the column the cell is in - picture in notes.
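
A hedged Python sketch (my own) of this rule as a reusable function for any r × c table:

```python
# E = row total x column total / overall total, computed for every cell.
def expected_frequencies(observed):
    n = sum(sum(row) for row in observed)
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    return [[r * c / n for c in col_totals] for r in row_totals]

# The Jones/Brown table, with courses as rows and lecturers as columns:
print(expected_frequencies([[20, 80], [100, 100]]))
# [[40.0, 60.0], [80.0, 120.0]]
```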

16
Q

Latane and Dabbs (1975)

A

In a study of helping behaviour in bystanders, Latane and Dabbs (1975) wanted to know whether the help given to a stranger depended on the gender of the bystander. They instructed confederates to walk into a lift and drop coins just after the lift started. The gender of the passenger in the lift was noted, along with the frequency with which they helped to pick up the coins.

female
help - 300
no help - 1003

male
help - 370
no help - 950

graph table and calculation in notes
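
The worked calculation is in the notes; as a cross-check, here is a hedged scipy sketch run on the counts above (correction=False requests the plain Pearson statistic):

```python
from scipy.stats import chi2_contingency

observed = [[300, 1003],   # female: help, no help
            [370, 950]]    # male:   help, no help

chi_sq, p, df, expected = chi2_contingency(observed, correction=False)
print(round(chi_sq, 2), df, p < .05)
```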

17
Q

Latane and Dabbs - revisited

A

In a study of predicted helping behaviour in bystanders, the Class of 2019 wanted to know whether predictions of help given to a stranger depended on the gender of the bystander. They were given a scenario in which a person walked into a lift and dropped coins just after the lift started, and subjects were asked if they would help. The gender of the passenger in the lift was noted, along with the frequency with which they predicted they would help to pick up the coins.

female
help - 160
no help - 4

male
help - 20
no help - 5
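
A hedged observation (my own connection, not stated on the card): with these counts some expected frequencies fall below 5, which is exactly the situation the health-warning card describes. A scipy sketch to see this:

```python
from scipy.stats import chi2_contingency

observed = [[160, 4],   # female: help, no help
            [20, 5]]    # male:   help, no help

chi_sq, p, df, expected = chi2_contingency(observed, correction=False)
print(expected)   # the male/no-help cell is expected to be well below 5
```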

18
Q

Beyond the 2 x 2 table

A

A researcher wanted to know whether a genetically modified mouse (thought to be highly anxious) would behave differently to normal mice in a maze. The mice were trained to run from a start box to a goal. On test they had three options, one was the training route, a second was a shortcut, and the third was a dead end.

calculation in notes

19
Q

Testing significance

A
  • Degrees-of-freedom
    (No. of rows - 1) × (No. of columns - 1)
    df = (2 - 1) × (3 - 1) = 2
  • Critical value = 5.99
  • Observed value = 20
    “Strain of mouse affects maze behaviour,
    (χ^2(2) = 20, p < .05).”

the observed value exceeds the critical value, so the result is significant

20
Q

Contingency revisited

A

Two lecturers (Jones & Brown) each offered two courses in one year. If the popularity of the lecturers is independent of the popularity of the courses then:
A - any difference in the popularity of the courses should be the same for both lecturers, i.e. J1 - J2 = B1 - B2.
B - any difference in the popularity of the lecturers should be the same for both courses, i.e. J1 - B1 = J2 - B2.

Note: A & B are mathematically equivalent.
Take J1 - J2 = B1 - B2.
Add J2 to both sides: J1 = J2 + B1 - B2.
Subtract B1 from both sides: J1 - B1 = J2 - B2.
Which is the same as B.

21
Q

associations between two categorical variables

A

With categorical variables we can’t use the mean or any similar statistic because the mean of a categorical variable is meaningless: the numeric values you attach to different categories are arbitrary, and the mean of those numeric values will depend on how many members each category has. Therefore, when we’ve measured only categorical variables, we analyse the number of things that fall into each combination of categories (i.e., the frequencies)

to test an association we can use the chi-square test - compare the frequencies you observe in certain categories to the frequencies you might expect to get in those categories by chance

When we predict a continuous outcome from categorical predictors (e.g., the linear model) the model we use is group means, but we can’t work with means when we have a categorical outcome variable (see above), so we work with frequencies instead. We use ‘expected frequencies’.

A simple way to estimate the expected frequencies would be to say ‘We’ve got 200 cats in total, and four categories, so the expected value is 200/4 = 50’. This approach would be fine if, for example, we had the same number of cats that had affection as a reward as we did cats that had food as a reward, but we didn’t: 38 got food and 162 got affection as a reward. Likewise, there are not equal numbers that could and couldn’t dance.

To adjust for these inequalities, we calculate expected frequencies for each cell in the table using the column and row totals for that cell. By doing so we factor in the total number of observations that could have contributed to that cell. The following equation, in which n is the total number of observations (in this case 200), shows this process:

model_ij = E_ij = (row total_i × column total_j) / n
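
A small illustrative sketch of this equation. The reward totals (38 food, 162 affection) and n = 200 are from the text; the dance totals used below are hypothetical stand-ins:

```python
# Reward totals are from the text; the dance totals (76 could, 124
# couldn't) are assumed purely for illustration.
n = 200
row_totals = {"food": 38, "affection": 162}
col_totals = {"danced": 76, "did not dance": 124}   # hypothetical

for reward, r in row_totals.items():
    for dance, c in col_totals.items():
        print(reward, dance, r * c / n)   # E = row x column / n
```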

22
Q

Fisher's exact test

A

The chi-square statistic has a sampling distribution that is only approximately a chi-square distribution. The larger the sample is, the better this approximation becomes, and in large samples the approximation is good enough to not worry about the fact that it is an approximation. In small samples, the approximation is not good enough, making significance tests of the chi-square statistic inaccurate. This is why you’ll often read about the chi-square test needing expected frequencies in each cell to be greater than 5 (see Section 19.5). When the expected frequencies are greater than 5, the sampling distribution is probably close enough to a chi-square distribution for us not to worry. However, when the expected frequencies are too low, it probably means that the sampling distribution of the test statistic is too deviant from a chi-square distribution to be accurate.

Fisher came up with a solution to this problem called Fisher’s exact test (Fisher, 1922). It’s not a test as such, it’s a way to compute the exact probability of the chi-square statistic in small samples. This procedure is normally used on 2 × 2 contingency tables (i.e., two variables each with two options) and with small samples. It can be used on larger contingency tables and with large samples, but there’s no point because it was designed to overcome the problem of small samples, and in larger contingency tables it becomes computationally intensive and your computer might have a meltdown.
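
A hedged sketch of running Fisher's exact test via scipy; the 2 × 2 counts below are made up purely for illustration:

```python
# Fisher's exact test computes an exact p value for a small 2 x 2 table.
from scipy.stats import fisher_exact

observed = [[8, 2],    # hypothetical counts
            [1, 5]]
odds_ratio, p = fisher_exact(observed)
print(round(p, 4))
```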

23
Q

the likelihood ratio

A

An alternative to Pearson’s chi-square is the likelihood ratio statistic, which is based on maximum-likelihood theory. The general idea behind this theory is that you collect some data and create a model for which the probability of obtaining the observed set of data is maximized, then you compare this model to the probability of obtaining those data under the null hypothesis. The resulting statistic is based on comparing observed frequencies with those predicted by the model. The computation is

Lχ^2 = 2 ∑ observed_ij ln(observed_ij / model_ij)

where i and j are the rows and columns of the contingency table and ln is the natural logarithm
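
A minimal sketch (my own) of this computation, reusing the Jones/Brown observed and expected frequencies from the earlier cards:

```python
# Likelihood ratio statistic: 2 * sum of O * ln(O / E) over all cells.
from math import log

observed = [20, 100, 80, 100]
expected = [40.0, 80.0, 60.0, 120.0]

l_chi_sq = 2 * sum(o * log(o / e) for o, e in zip(observed, expected))
print(round(l_chi_sq, 2))   # ~26.47, the same ballpark as Pearson's 25
```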

24
Q

Yates's correction

A

When you have a 2 × 2 contingency table (i.e., two categorical variables each with two categories) then Pearson’s chi-square tends to produce significance values that are too small (it tends to make a Type I error). Yates suggested a correction to the Pearson formula (usually referred to as Yates’s continuity correction). The basic idea is that when you calculate the deviation from the model (observed_ij − model_ij in equation (19.2)) you subtract 0.5 from the absolute value of this deviation before you square it. Put simply, you calculate the deviation, ignore whether it is positive or negative, subtract 0.5 from it and then square it. With Yates’s correction applied, Pearson’s equation becomes

χ^2 = ∑ (|observed_ij - model_ij| - 0.5)^2 / model_ij

Note that the correction lowers the value of the chi-square statistic and, therefore, makes it less significant. There is a fair bit of evidence that this adjustment overcorrects and produces chi-square values that are too small. Howell (2012) provides an excellent discussion, if you’re interested; all I will say is that although the correction is worth knowing about, it’s probably best ignored.
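
A hedged sketch (my own) applying the corrected formula to the Jones/Brown table, to see how the correction lowers the statistic:

```python
# Yates's continuity correction: subtract 0.5 from |O - E| before squaring.
observed = [20, 100, 80, 100]
expected = [40.0, 80.0, 60.0, 120.0]

chi_sq_yates = sum((abs(o - e) - 0.5) ** 2 / e
                   for o, e in zip(observed, expected))
print(round(chi_sq_yates, 2))   # 23.77 - lower than the uncorrected 25,
                                # hence more conservative
```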