week 1 SCM (chi-squared) Flashcards

1
Q

why do we analyse frequencies when looking at categorical variables?

A
  • the numerical values you attach to different categories are arbitrary
  • this means that the mean of a categorical variable is meaningless
  • because of this, we analyse frequencies of each category
2
Q

what are the rows and columns of a contingency table?

A
  • the columns are the conditions (i.v)
  • the rows are the categories of the measure (d.v)
3
Q

what is the general idea of the chi-squared test?

A

it compares the frequencies you observe in certain categories to the frequencies that you might expect to get in those categories by chance

4
Q

what is the chi-squared equation?

A
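
χ² = Σ (O - E)² / E

for every cell, square the difference between the observed frequency (O) and the expected frequency (E), divide by the expected frequency, then add up the results
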
5
Q

what is the equation for expected values, used in the chi squared equation?

A
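
E = (row total * column total) / grand total

where the row total and column total are those of the cell in question and the grand total is the total number of observations
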
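A minimal Python sketch of the calculation, assuming the standard Pearson formula (the sum over cells of (O - E)² / E, with E = row total * column total / grand total). The function names and the counts in `observed` are illustrative assumptions and do not come from these cards.

```python
def expected_frequencies(observed):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_squared(observed):
    """Pearson chi squared: sum over all cells of (O - E)^2 / E."""
    expected = expected_frequencies(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


def degrees_of_freedom(observed):
    """(number of rows - 1) * (number of columns - 1)."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


# Illustrative counts (an assumption, not data from these cards).
# Rows are the categories of the measure, columns are the conditions.
observed = [
    [28, 48],
    [10, 114],
]

print(expected_frequencies(observed))
print(round(chi_squared(observed), 2), degrees_of_freedom(observed))
```

The three helpers are kept separate so the expected frequencies can be inspected before the statistic is trusted.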
6
Q

what is the degrees of freedom formula for chi squared tests?

A

(number of rows - 1) * (number of columns - 1)

7
Q

What is the degrees of freedom for a contingency table with two rows and two columns?

A

1

this is because df = (r-1) * (c-1), where r is the number of rows and c is the number of columns

so

(2-1) * (2-1) = 1

8
Q

what theory is the likelihood ratio statistic based on?

A

Maximum likelihood theory

you build a model for which the probability of obtaining the observed set of data is maximised

this model is then compared to the probability of obtaining those data under the null hypothesis

the resulting statistic therefore compares the observed frequencies with the frequencies predicted by the model

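The cards do not give the formula for the likelihood ratio statistic. A minimal Python sketch, assuming the usual form 2 * Σ O * ln(O / E) with the same expected frequencies as the chi squared test, might look like this. The function name and the counts are illustrative assumptions.

```python
import math


def likelihood_ratio(observed):
    """Likelihood ratio statistic: 2 * sum over cells of O * ln(O / E)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            if o > 0:  # a cell with O = 0 contributes nothing to the sum
                total += o * math.log(o / e)
    return 2 * total


# Illustrative counts (an assumption, not data from these cards).
observed = [
    [28, 48],
    [10, 114],
]

print(round(likelihood_ratio(observed), 2))
```

With a reasonably large sample like this one, the result lands close to the Pearson chi squared value for the same table.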
9
Q

when would we use a likelihood ratio statistic over a chi squared?

A

When the samples are small

10
Q

what happens to the chi squared distribution as the degrees of freedom increases?

A

the peak of the curve moves to the right and the distribution spreads out

11
Q

what does greater degrees of freedom mean in terms of how high the chi squared value has to be

A

the more degrees of freedom there are, the higher the chi squared value must be to reach statistical significance (the critical value increases with the degrees of freedom)

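To see why a larger chi squared value is needed as the degrees of freedom increase, here is a small Python sketch that computes the upper-tail probability of a chi squared distribution using only the standard library. It assumes whole-number degrees of freedom and uses the closed-form series for the incomplete gamma function. The function name `chi2_p_value` and the test value 5.0 are illustrative assumptions.

```python
import math


def chi2_p_value(x, df):
    """P(X >= x) for a chi squared distribution with integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum_{k=0}^{df/2 - 1} y^k / k!
        terms = df // 2
        return math.exp(-y) * sum(y ** k / math.factorial(k) for k in range(terms))
    # Odd df: erfc(sqrt(y)) + exp(-y) * sum_{k=1}^{(df-1)/2} y^(k - 1/2) / gamma(k + 1/2)
    terms = (df - 1) // 2
    series = sum(y ** (k - 0.5) / math.gamma(k + 0.5) for k in range(1, terms + 1))
    return math.erfc(math.sqrt(y)) + math.exp(-y) * series


# The same chi squared value becomes less significant as df grows.
for df in (1, 2, 3, 4):
    print(df, round(chi2_p_value(5.0, df), 3))
```

In this sketch a value of 5.0 falls below p = .05 with 1 degree of freedom but not with 2 or more.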
12
Q

what is a problem with the chi squared test?

A
  • the sampling distribution of the test statistic has an approximate chi squared distribution
  • the larger the sample the better the approximation becomes
  • however, in small samples the approximation is not good enough, which makes the chi squared significance test inaccurate
13
Q

what sample size is required for a chi squared?

A
  • the expected frequency in each cell must be at least 5 for the chi squared significance test to be accurate
14
Q

what is the degrees of freedom of the likelihood ratio?

A

the same as chi squared: (rows-1) * (columns-1)

15
Q

what is a type 1 error and a type 2 error?

A

TYPE 1 ERROR = rejecting the null hypothesis when it is actually true

TYPE 2 ERROR = failing to reject the null hypothesis when it is actually false

16
Q

what type of error do chi squared tests on 2x2 contingency tables tend to make?

A

type 1 error

this is because it tends to produce significance values that are too small

17
Q

what is Yates's continuity correction?

A
  • an adjustment to the chi squared formula for 2x2 contingency tables, which tend to produce type 1 errors
  • you subtract 0.5 from the absolute value of each deviation (observed - expected) in the numerator before you square it
  • this lowers the value of the chi squared statistic and therefore makes it less significant
  • some argue that this overcorrects and produces chi squared values that are too small
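A minimal Python sketch of the correction, assuming the common form Σ (|O - E| - 0.5)² / E. The function name, the `yates` flag and the counts are illustrative assumptions.

```python
def chi_squared_2x2(observed, yates=False):
    """Pearson chi squared, optionally with Yates's continuity correction."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            deviation = abs(o - e)
            if yates:
                deviation -= 0.5  # shrink each deviation before squaring
            total += deviation ** 2 / e
    return total


# Illustrative counts (an assumption, not data from these cards).
observed = [
    [28, 48],
    [10, 114],
]

print(round(chi_squared_2x2(observed), 2))
print(round(chi_squared_2x2(observed, yates=True), 2))
```

The corrected value is always the smaller of the two whenever every deviation is larger than 0.5, which is why the test becomes more conservative.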
18
Q

what are assumptions of the chi squared test?

A
  1. The chi squared test does NOT rely on assumptions that the data is continuous and normally distributed like other tests do
  2. data must be independent and contribute to only one cell of the table. this means you cannot use chi squared on repeated measures designs
  3. the expected frequencies of each cell should be no less than 5
19
Q

what is a standardized residual?

A

a residual is the observed value - the predicted value

a standardised residual is a residual divided by its standard deviation

20
Q

what two things can we use to break down the chi squared test statistic?

A

standardised residuals (z-scores)

or

effect sizes (

21
Q

for larger contingency tables, what assumptions should we make for the chi squared test?

A
  • no cell should have an expected frequency below 1
  • up to 20% of expected frequencies can be below 5 but it will result in a loss of statistical power
  • if you find yourself in this situation, consider using Fisher's exact test
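The rules of thumb above can be checked before running the test. This is a minimal Python sketch under the assumption that the thresholds are exactly those on the card (no expected frequency below 1, and no more than 20% of cells below 5). The function name and the counts are illustrative assumptions.

```python
def check_expected_frequencies(observed):
    """Summarise the expected frequencies of a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    expected = [r * c / n for r in row_totals for c in col_totals]
    share_below_5 = sum(e < 5 for e in expected) / len(expected)
    return {
        "min_expected": min(expected),
        "share_below_5": share_below_5,
        # Both rules of thumb for larger tables must hold.
        "ok": min(expected) >= 1 and share_below_5 <= 0.20,
    }


# Illustrative 3x2 table (an assumption, not data from these cards).
observed = [
    [2, 18],
    [10, 30],
    [12, 28],
]

print(check_expected_frequencies(observed))
```

Here one cell in six has an expected frequency below 5, so the table passes the check, although the card notes that power is reduced in this situation.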
22
Q

what is an odds ratio?

A

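the ratio of two odds: the odds of an outcome occurring in one category divided by the odds of the same outcome occurring in the other category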
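As a worked example, here is a short Python sketch for a 2x2 table, assuming the odds ratio is the odds of the outcome in one condition divided by the odds of the same outcome in the other condition. The function name, the table layout and the counts are illustrative assumptions.

```python
def odds_ratio(observed):
    """Odds ratio for a 2x2 table.

    Row 0 holds the counts with the outcome, row 1 the counts without it.
    Column 0 is the first condition, column 1 the second condition.
    """
    odds_first = observed[0][0] / observed[1][0]
    odds_second = observed[0][1] / observed[1][1]
    return odds_first / odds_second


# Illustrative counts (an assumption, not data from these cards).
observed = [
    [28, 48],
    [10, 114],
]

print(round(odds_ratio(observed), 2))
```

An odds ratio of 1 would mean the odds of the outcome are the same in both conditions.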

23
Q

what is a significant standardized residual?

A

values outside of +/- 1.96 (significant at p < .05)

24
Q

what technique would we use to analyse larger contingency tables with 3 or more variables?

A

log linear analysis

25
Q

What are parametric statistics?

A

Parametric statistics, such as r and t, rest on estimates of population parameters (x̄ for μ and s for σ) and require assumptions about population distributions (in most cases normality) for their probability calculations to be correct.

26
Q

What are the two main uses of chi squared?

A

Goodness of fit (involving a single categorical variable)

Test for independence (involving two categorical variables)

27
Q

LINK FOR HELPFUL WORKSHEET TO UNDERSTAND CHI SQUARED

A

https://www.cedu.niu.edu/~walker/statistics/Chi%20Square%202.pdf

28
Q

What is the difference between residuals, standardised residuals and adjusted standardised residuals?

A
  • The residual is O - E (observed value - expected value)
  • The standardised residual is (O - E) / square root of E. Standardised residuals behave approximately like z-scores (mean of 0 and standard deviation of 1). If the standardised residual for a cell is beyond about +/- 2 then that cell can be seen as a major contributor to the overall chi-square value
  • The adjusted standardised residuals are standardised residuals that are adjusted for the row and column totals
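The three kinds of residual can be compared side by side. This is a minimal Python sketch, assuming the usual adjusted form (O - E) / sqrt(E * (1 - row total / N) * (1 - column total / N)), which the cards do not spell out. The function name and the counts are illustrative assumptions.

```python
import math


def residual_tables(observed):
    """Return raw, standardised and adjusted standardised residuals."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    raw, standardised, adjusted = [], [], []
    for i, row in enumerate(observed):
        raw_row, std_row, adj_row = [], [], []
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            # Variance term that accounts for the row and column totals.
            spread = e * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            raw_row.append(o - e)
            std_row.append((o - e) / math.sqrt(e))
            adj_row.append((o - e) / math.sqrt(spread))
        raw.append(raw_row)
        standardised.append(std_row)
        adjusted.append(adj_row)
    return raw, standardised, adjusted


# Illustrative counts (an assumption, not data from these cards).
observed = [
    [28, 48],
    [10, 114],
]

raw, standardised, adjusted = residual_tables(observed)
for name, table in (("raw", raw), ("standardised", standardised), ("adjusted", adjusted)):
    print(name, [[round(value, 2) for value in row] for row in table])
```

Squaring the standardised residuals and adding them up gives the Pearson chi squared value, which is why they are useful for seeing which cells contribute most.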