W10 Chi-square Flashcards

1
Q

One-way chi-square

A
  • Chi-square is a test used on frequency data
  • Chi-square allows us to test research questions with categorical data
  • In the one-way chi-square we can ask research questions like
  • > Are people/things distributed evenly across the categories of the variable?
  • > We can also test other distributions (we will come back to this)
2
Q

Chi-square test

A
  • chi-square analyses involve working out the expected frequencies (the frequencies you should get if the null hypothesis is true) and comparing them with the observed frequencies (the frequencies you actually get)
  • chi-square tells you how well the expected and observed frequencies match … for this reason it is often referred to as a test for “goodness of fit” (do the observed frequencies “fit” the expected frequencies?)
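A minimal sketch of this expected-vs-observed comparison in Python, assuming scipy is available; the drink-choice scenario and counts are invented for illustration:

```python
# Hypothetical example: 60 people each pick one of three drinks.
# H0: choices are evenly distributed across the three categories.
from scipy.stats import chisquare

observed = [28, 20, 12]        # observed frequencies (invented data)

# With no f_exp argument, chisquare assumes equiprobable expected
# frequencies: 60 / 3 = 20 per category.
result = chisquare(observed)
print(result.statistic, result.pvalue)   # how well observed "fit" expected
```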
3
Q

Calculation of χ²

A
  • look up image
  • look up example
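In place of the missing image, the standard formula is χ² = Σ (fo − fe)² / fe, summed over the categories. A minimal worked sketch with invented counts:

```python
# χ² = Σ (fo − fe)² / fe
# Invented example: 60 observations across 3 categories, with the
# null expecting an even split (fe = 20 per category).
observed = [28, 20, 12]
expected = [20, 20, 20]

chi_sq = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
print(chi_sq)   # 3.2 + 0.0 + 3.2 = 6.4
```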

4
Q

Expected frequencies for one-way chi-square

A

what do we expect? what is the null?

  • uniformly distributed -> equiprobable distribution
  • distributed in accord with theory
  • distributed in accord with previously observed frequencies
  • normally distributed
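A sketch of how the first two nulls translate into concrete expected frequencies; the counts and the theoretical proportions are invented, and f_exp is scipy's argument for supplying your own expected frequencies:

```python
from scipy.stats import chisquare

observed = [50, 30, 20]      # invented frequency data, n = 100
n = sum(observed)
k = len(observed)

# Null 1: equiprobable distribution -> fe = n / k in every category
fe_equal = [n / k] * k                            # [33.3, 33.3, 33.3]

# Null 2: distributed in accord with theory (or with previously
# observed frequencies), e.g. hypothesised proportions 40% / 35% / 25%
fe_theory = [n * p for p in (0.40, 0.35, 0.25)]   # [40, 35, 25]

print(chisquare(observed, f_exp=fe_equal))
print(chisquare(observed, f_exp=fe_theory))
```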
5
Q

Comparing obtained statistic against chi-square distribution (null)

A

H0: in the population, the distribution across categories matches the expected frequencies
-> therefore any discrepancy is due to chance

  • how big can the discrepancy be before we reject the possibility of H0 being true?
  • use the standard cutoff of .05
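A sketch of that decision rule, assuming scipy for the chi-square distribution; the obtained value and df below are invented:

```python
from scipy.stats import chi2

alpha = 0.05
df = 2                    # e.g. k - 1 for a one-way design with 3 categories
chi_sq_obt = 6.4          # invented obtained value

# Critical value: the cutoff that leaves .05 in the upper tail
chi_sq_crit = chi2.ppf(1 - alpha, df)   # ≈ 5.99 for df = 2
print(chi_sq_crit)

# Reject H0 if the discrepancy is too large to attribute to chance
print(chi_sq_obt > chi_sq_crit)
```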
6
Q

Sampling distribution of χ²

A
  • χ² is a family of distributions dependent on df
  • however, df depends on the number of cells, not the number of participants
  • > df = k - 1
  • χ² values are always positive
  • can increase χ²obt by increasing sample size
  • > e.g. could double the χ² value by doubling the sample size, but χ²crit remains the same
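A small sketch of the last point, assuming scipy: doubling every observed count doubles χ²obt, while χ²crit stays the same because df depends on the number of cells, not participants (counts invented):

```python
from scipy.stats import chisquare, chi2

observed = [28, 20, 12]
doubled = [2 * f for f in observed]     # same proportions, twice the sample

chi_obt = chisquare(observed).statistic          # 6.4 with these counts
chi_obt_doubled = chisquare(doubled).statistic   # 12.8, exactly double

df = len(observed) - 1                  # df = k - 1, unchanged by sample size
print(chi_obt, chi_obt_doubled)
print(chi2.ppf(0.95, df))               # χ²crit identical in both cases
```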
7
Q

Two-way chi-square / chi-square test of independence

A
  • Relations between variables can also be tested with the chi-square test
  • > the formula for chi-square is the same, but there is a new formula for fe
  • The null hypothesis is that the two variables are distributed independently
  • > referred to as the χ² test of independence
  • > also called the χ² contingency test
  • > one variable must not depend on the other
  • > two variables are considered to be independent when the frequency distribution for one variable has the same shape for all levels of the second variable
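A minimal sketch of the two-way test, assuming scipy.stats.chi2_contingency; the 2 × 2 attendance-by-outcome table is invented for illustration:

```python
from scipy.stats import chi2_contingency

# Rows: perfect attendance, less-than-perfect attendance
# Columns: passed, failed                       (invented counts)
table = [[45, 15],
         [25, 15]]

# Note: scipy applies Yates' continuity correction to 2 x 2 tables by default
chi_sq, p, df, expected = chi2_contingency(table)
print(chi_sq, p, df)     # df = (rows - 1)(columns - 1) = 1
print(expected)          # the fe values computed under independence
```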
8
Q

The logic of the test of independence

A
  • By ignoring the frequencies within the body of the table and just focusing on the marginals, we are viewing the data from the perspective of the null (H0)
  • > if H0 is true, then the marginals are all we need to understand the variables, since they would be distributed independently
  • > How do we obtain the necessary expected frequencies?
  • > we use the frequencies expected under independence
9
Q

Expected frequencies for two-way chi-square

A
  • if the variables are independent, the relative proportions of one variable should be mirrored at each level of the other variable
  • > example for attendance (less than perfect, perfect) and course outcome (passed, failed)
  • > if overall 70% passed and 30% failed, then under independence 70% of those with perfect attendance would pass and 30% would fail, and 70% of those with less than perfect attendance would pass and 30% would fail
10
Q

Calculating fe for a two-way table

A

For each cell: fe = (column total x row total) / grand total

  • we have essentially duplicated the table and replaced the obtained frequencies (i.e. the actual data) with the expected frequencies that represent the way the data ‘ought to look’ under the null (i.e. when H0 is true)
  • > then calculate χ² using the previous formula and determine the df
  • > in the two-way chi-square, df = (rows - 1)(columns - 1)
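A worked sketch of the fe formula and the df on an invented 2 × 2 attendance-by-outcome table (same counts as the sketch above):

```python
# Invented observed frequencies:
#                        passed   failed   row total
# perfect attendance       45       15        60
# less than perfect        25       15        40
# column total             70       30       100 (grand total)

row_totals = [60, 40]
col_totals = [70, 30]
grand_total = 100

# fe = (column total x row total) / grand total, for each cell
fe = [[(r * c) / grand_total for c in col_totals] for r in row_totals]
print(fe)   # [[42.0, 18.0], [28.0, 12.0]]

df = (len(row_totals) - 1) * (len(col_totals) - 1)
print(df)   # (2 - 1)(2 - 1) = 1
```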
11
Q

Assumptions of χ²

A
  • independence of observations
  • > a participant must fall in one and only one category - they cannot be counted twice
  • > thus you cannot use chi-square on a repeated-measures design
  • size of expected frequencies
  • inclusion of non-occurrences
  • > computations must be based on all participants in the sample
12
Q

Assumptions of χ² - size of expected frequencies

A
  • all expected cell frequencies should be at least 5; if they aren’t, you shouldn’t use a χ² test
  • small expected frequencies produce only a few possible values of the obtained χ², yet we compare it to a continuous distribution
  • the greater the degrees of freedom, the more lenient this requirement
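A tiny sketch of checking this assumption on the expected frequencies computed earlier (values invented):

```python
# fe values for the invented 2 x 2 table from the earlier sketch
expected = [[42.0, 18.0], [28.0, 12.0]]

# The assumption: every expected cell frequency should be at least 5
ok = all(cell >= 5 for row in expected for cell in row)
print(ok)   # True, so a χ² test is reasonable for these counts
```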
13
Q

Limitations of chi-square

A
  • restricted to at most 2 variables
  • > larger designs are routinely analysed using multivariate generalisations of these approaches that are largely based on log-linear models and are best covered in their own course
  • > just as multiple regression (MR) is to simple regression, log-linear models are to chi-square
14
Q

Effect size

A
  • a sig χ² does not indicate the strength of the relationship
  • if your sample size is large enough, sometimes very small effects will be detected as sig
  • but we can convert χ² to a ‘measure of association’ that tells you how big the effect is
15
Q

Effect size: Phi

A
  • phi is used for the 2 × 2 chi-square
  • interpreted like a Pearson r
  • correlation between two variables, each of which is a dichotomy
  • if chi-square is sig, so is phi
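A minimal sketch of that conversion, assuming the usual formula phi = √(χ² / N) for a 2 × 2 table; the χ² value and sample size are invented:

```python
from math import sqrt

chi_sq = 6.4      # invented significant χ² from a 2 x 2 table
n = 100           # total sample size

phi = sqrt(chi_sq / n)   # phi = sqrt(χ² / N)
print(phi)               # ≈ 0.25, interpreted like a Pearson r
```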
16
Q

Measure of association: phi

A
  • Although phi is a measure of association between two variables, rather than effect size, these concepts are related
  • > so for example, if phi = .50, you could say that there is a moderate level of association between the two variables
  • > or for example, if phi = .65, you could say there is a moderate to high level of association between the two variables
17
Q

Variance and phi

A
  • in the ‘Symmetric measures’ table, the phi value is squared to give a variance value
  • this value can be interpreted as the proportion of variance the two variables share (analogous to r², since phi is interpreted as a Pearson r)