Non-Parametric Tests Flashcards

1
Q

Problems with parametric tests

A

Strong assumptions, e.g. normality of the data, or N large enough to invoke the CLT.

2
Q

What are non-parametric tests?

A

Tests that are valid over a wide range of distributions and can be carried out with far fewer assumptions about the random variable.

3
Q

What is the simplest non-parametric test called?

A

The sign test

4
Q

What type of data does the sign test analyse?

A

Matched pairs

5
Q

Briefly describe the process of setting up the sign test

A

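Assign + if the 1st value > the 2nd
Assign - if the 1st value < the 2nd
Treat the sign for each individual as a Bernoulli trial
Under H0, + and - are equally likely, so p = 0.5. Repeated Bernoulli trials = binomial
W = count of one sign (e.g. the + signs), W ~ B(n, 0.5). The p-value is the probability of a W at least as extreme as the one we observe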
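
The set-up steps can be sketched in a few lines of Python. This is only an illustration: the function name `sign_counts` and the before/after scores are hypothetical, and zero differences are dropped from n, as a later card in this deck describes.

```python
def sign_counts(first, second):
    """Return (w, n): the number of '+' signs and the number of non-zero differences."""
    if len(first) != len(second):
        raise ValueError("matched pairs need two samples of equal length")
    plus = sum(1 for a, b in zip(first, second) if a > b)   # 1st value > 2nd
    minus = sum(1 for a, b in zip(first, second) if a < b)  # 1st value < 2nd
    return plus, plus + minus  # ties contribute to neither count


# Hypothetical before/after scores for 10 individuals (one pair is tied).
before = [12, 15, 9, 20, 14, 11, 18, 16, 13, 10]
after = [10, 15, 8, 17, 15, 9, 14, 12, 11, 7]

w, n = sign_counts(before, after)
print(w, n)  # 8 9
```

Under H0, W is then treated as a B(9, 0.5) random variable.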

6
Q

For sign test how do we calculate the p value if the test is two sided?

A

We work out the tail probability using the binomial, e.g. with n = 9: P(W <= 1) = 9C0 (0.5)^9 + 9C1 (0.5)^9 = 10/512 ≈ 0.02
P value for a 2 sided test = 2 x 0.02 = 0.04
0.04 < 0.05 (alpha), therefore we reject H0
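
The arithmetic on this card can be checked with a short sketch using only the standard library. The function name `sign_test_p_value` is hypothetical, and the one-sided branch assumes the alternative points in the direction of the smaller tail.

```python
from math import comb


def sign_test_p_value(w, n, two_sided=True):
    """Exact sign-test p-value for w '+' signs out of n non-zero pairs."""
    k = min(w, n - w)  # work with the smaller tail
    tail = sum(comb(n, i) for i in range(k + 1)) * 0.5 ** n
    # Doubling the tail gives the two-sided p-value (capped at 1).
    return min(1.0, 2 * tail) if two_sided else tail


p = sign_test_p_value(1, 9)
print(round(p, 4))  # 0.0391
print(p < 0.05)     # True
```

The exact two-sided value is 20/512 ≈ 0.039, which the card rounds to 0.04.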

7
Q

For a sign test if n>25, how do we work out the probability?

A

Invoke the CLT, so the sample proportion W/n is approx normal with mean = 0.5 and variance = 0.5^2 / n.

Z = (W/n - 0.5) / sqrt(0.5^2 / n)
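
A minimal sketch of the large-sample calculation follows. The counts (30 plus signs out of 40 pairs) are hypothetical, no continuity correction is applied, and the normal CDF is built from `math.erf` so that nothing outside the standard library is needed.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def sign_test_z(w, n):
    """Z = (W/n - 0.5) / sqrt(0.5^2 / n)."""
    return (w / n - 0.5) / sqrt(0.5 ** 2 / n)


z = sign_test_z(30, 40)                      # hypothetical counts, n > 25
p_two_sided = 2 * (1 - normal_cdf(abs(z)))   # double the upper-tail area
print(round(z, 3))  # 3.162
```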

8
Q

Problem with sign test

A

Ignores magnitude: it treats a large negative difference the same as a small negative difference. Collapsing everything to 0 or 1 throws away a lot of information, which gives a low powered test when n is small. We are more likely to make a type 2 error (failing to reject H0 when it’s false), so we often find an insignificant test statistic.

9
Q

What do we do with zero differences in the sign test?

A

We discard them and reduce n by 1 for each one discarded.

10
Q

The sign test & sign rank test are only applicable for…

A

Matched pairs

11
Q

How does the sign rank test differ from sign test?

A

It accounts for magnitude of the difference as well as sign

12
Q

Describe sign rank test

A

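Work out the difference for each pair
Rank the absolute differences in ascending order of magnitude
If two values have the same magnitude, assign the average rank
Sum the ranks of the positive differences (R+) and of the negative differences (R-) separately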
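
The ranking procedure can be sketched as below. The helper names and the data are hypothetical, and dropping zero differences before ranking is an assumption carried over from the sign test card rather than something this card states.

```python
def average_ranks(values):
    """Rank values in ascending order (1-based); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = ((start + 1) + (end + 1)) / 2  # mean of the ranks the block occupies
        for pos in range(start, end + 1):
            ranks[order[pos]] = avg
        start = end + 1
    return ranks


def signed_rank_sums(first, second):
    """Return (R+, R-, n) for matched pairs, ignoring zero differences (assumed)."""
    diffs = [a - b for a, b in zip(first, second) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return r_plus, r_minus, len(diffs)


# Hypothetical before/after scores for 10 individuals (one pair is tied).
before = [12, 15, 9, 20, 14, 11, 18, 16, 13, 10]
after = [10, 15, 8, 17, 15, 9, 14, 12, 11, 7]

r_plus, r_minus, n = signed_rank_sums(before, after)
print(r_plus, r_minus, n)  # 43.5 1.5 9
```

A quick sanity check is that R+ and R- always add up to n(n+1)/2, here 45.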

13
Q

What is our test statistic for the sign rank test?

A

T = MIN {R+, R-}

14
Q

Under H0 for the sign rank test, what are E(T) and V(T)?

A

E(T) = n(n+1)/4
V(T) = n(n+1)(2n+1)/24
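
These two formulas can be sanity-checked by brute force for a small n. The sketch below assumes no ties and that, under H0, each rank 1..n is equally likely to carry a + or a - sign; it then computes the mean and variance of the positive rank sum over all 2^n sign patterns. The function name is hypothetical.

```python
from itertools import product


def null_moments_by_enumeration(n):
    """Mean and variance of the positive rank sum over all 2**n sign patterns."""
    totals = [
        sum(rank for rank, positive in zip(range(1, n + 1), signs) if positive)
        for signs in product([False, True], repeat=n)
    ]
    mean = sum(totals) / len(totals)
    var = sum((t - mean) ** 2 for t in totals) / len(totals)
    return mean, var


n = 6
mean, var = null_moments_by_enumeration(n)
print(mean, n * (n + 1) / 4)                # 10.5 10.5
print(var, n * (n + 1) * (2 * n + 1) / 24)  # 22.75 22.75
```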
15
Q

When n<25, how do we work out our CVs for the sign rank test?

A

Use the tables given in the formula sheet. The correct value of alpha depends on whether the test is 1 or 2 sided.

16
Q

When do we reject H0 for the sign rank test? Why?

A

If our test statistic < CV

Because T is the minimum of R+ and R-, a small T is evidence against H0

17
Q

If n>25, what do we do for the sign rank test?

A

Invoke CLT = approx normality

Our test statistic is given by Z = [T - n(n+1)/4] / sqrt[n(n+1)(2n+1)/24]
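
As a rough sketch, the statistic standardises T by subtracting E(T) = n(n+1)/4 and dividing by the square root of V(T) = n(n+1)(2n+1)/24. The values T = 100 and n = 30 below are hypothetical.

```python
from math import sqrt


def signed_rank_z(t, n):
    """Standardise T using its mean and variance under H0."""
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24
    return (t - mean) / sqrt(var)


z = signed_rank_z(100, 30)  # hypothetical T and n, with n > 25
print(round(z, 2))  # -2.73
```

Because T is the smaller rank sum, Z comes out negative whenever T is below its expected value.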

18
Q

Limitation of sign rank test

A

Ignores the spread of the data: if the highest absolute difference is 2, it is given rank n; if the highest is 100, it is still given rank n. Ranking may therefore compress or stretch the data. Less powerful than a parametric test, but more powerful than the sign test.

19
Q

When n>25 for sign rank test, when do we reject?

A

Reject if p value is less than the significance level (same as usual hypothesis testing)

20
Q

When is the Mann-Whitney test applicable?

A

We can use it even if we don’t have matched pairs. Use it with two independent random samples to test for a difference in location (e.g. in means).

21
Q

How do we rank equal magnitudes in the sign rank test?

A

Average rank, e.g. if two numbers are to be ranked 4 & 5, give them both rank (4+5)/2 = 4.5

22
Q

Describe the Mann Whitney test

A

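Rank all n1 + n2 observations together, but keep track of which sample each one came from (e.g. by colour)
Equal values are given an average rank
Sum the ranks of sample 1 to get R1
Work out U (see formula sheet), E(U) & V(U) and then the test statistic
If n>25, the test statistic is approx normal.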
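
The steps can be sketched as follows. The card defers to the formula sheet for U, so the convention used here, U = n1 n2 + n1(n1+1)/2 - R1, is an assumption (the formula sheet may define U from the other sample). E(U) = n1 n2 / 2 and V(U) = n1 n2 (n1 + n2 + 1) / 12 are the standard null moments without a tie correction. The data are hypothetical and far too small for the normal approximation; they only show the arithmetic.

```python
from math import sqrt


def mann_whitney(sample1, sample2):
    """Return (R1, U, Z) for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    pooled = sorted(list(sample1) + list(sample2))
    # Map each distinct value to the average of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = ((i + 1) + (j + 1)) / 2
        i = j + 1
    r1 = sum(rank_of[x] for x in sample1)    # rank sum of sample 1
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1     # assumed convention for U
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 * (n1 + n2 + 1) / 12
    return r1, u, (u - mean_u) / sqrt(var_u)


r1, u, z = mann_whitney([3, 5, 8, 9], [1, 2, 5, 6, 7])
print(r1, u, round(z, 3))  # 24.5 5.5 -1.102
```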

23
Q

When do we reject for Mann Whitney test?

A

If n>25, approx normal = Z test
Double the p value for a 2 sided test
Reject if the p value is less than the significance level

24
Q

When do we use goodness of fit test?

A

Where we have discrete outcomes falling into k categories (it can also be used for continuous data, but the data need to be put into discrete categories first)

25
Q

Describe goodness of fit test

A

Calculate Ei = n pi for each of the K categories
See formula sheet to calculate the test statistic using the simpler version
Under H0 the test statistic follows a chi squared distribution
Reject if test stat > CV given by chi squared
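
A minimal sketch of the calculation follows. The observed counts are hypothetical, and the "simpler version" is assumed here to be the shortcut form sum(Oi^2 / Ei) - n, which is algebraically equal to sum((Oi - Ei)^2 / Ei); the formula sheet may present it differently. The 5% critical value for 3 degrees of freedom (7.815) is taken from standard chi squared tables.

```python
def chi_squared_gof(observed, probs):
    """Return (test statistic, expected counts) for a goodness of fit test."""
    n = sum(observed)
    expected = [n * p for p in probs]  # Ei = n * pi
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected


observed = [18, 22, 30, 30]        # hypothetical counts, n = 100
probs = [0.25, 0.25, 0.25, 0.25]   # H0: all four outcomes equally likely

stat, expected = chi_squared_gof(observed, probs)
shortcut = sum(o ** 2 / e for o, e in zip(observed, expected)) - sum(observed)

CV_5PCT_DF3 = 7.815  # chi squared critical value, alpha = 0.05, DOF = 4 - 1
reject = stat > CV_5PCT_DF3
print(round(stat, 2), round(shortcut, 2), reject)  # 4.32 4.32 False
```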

26
Q

When do we reject H0 for goodness of fit test?

A

If the test statistic is greater than the CV given by chi squared distribution

27
Q

What distribution do we get the CV from for a goodness of fit test?

A

Chi squared

28
Q

DOF for CV for goodness of fit test =

A

DOF = K - 1 where K is the number of different categories

29
Q

What is a condition for the goodness of fit test to be appropriate? How can we solve it?

A

Ei should not be < 5 for any category; if it is, aggregate two categories.

30
Q

What is H0 usually for goodness of fit test?

A

H0: all outcomes are equally likely
So pi = 1/k
Ei = n/k for each category

31
Q

What data are contingency tables used for?

A

Where we have a two way table with K categories in variable A and H categories in variable B, so we have KH cross classifications.

32
Q

Why don’t we use hypothesis testing or ANOVA instead of contingency tables?

A

The standard two-sample hypothesis tests are limited to two groups

ANOVA allows >2 groups but requires the assumption of normality.

33
Q

What are contingency tables another form of?

A

Goodness of fit test but with a two way table rather than one

34
Q

What is H0 for contingency table?

A

H0: variables are not related (independent)
H1: variables are related

35
Q

How do we work out the expected values for contingency tables?

A

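Eij = n pij

Since under H0 the variables are independent, pij = p(i) x p(j), where p(i) and p(j) are estimated from the row and column totals. So Eij = (row i total x column j total) / n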
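
The expected values can be sketched directly from the row and column totals: estimating p(i) as row total / n and p(j) as column total / n gives Eij = (row i total x column j total) / n. The 2 x 3 table below is hypothetical, the statistic is the usual sum of (Oij - Eij)^2 / Eij, and the 5% critical value for 2 degrees of freedom (5.991) is taken from standard chi squared tables.

```python
def expected_counts(table):
    """Eij = (row i total * column j total) / n under independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_squared_stat(table):
    """Sum of (Oij - Eij)^2 / Eij over every cell of the table."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


table = [[10, 20, 30],
         [20, 20, 20]]  # hypothetical 2 x 3 table, n = 120

stat = chi_squared_stat(table)
dof = (len(table) - 1) * (len(table[0]) - 1)  # (r - 1)(c - 1)
CV_5PCT_DF2 = 5.991
print(expected_counts(table))  # [[15.0, 20.0, 25.0], [15.0, 20.0, 25.0]]
print(round(stat, 3), dof, stat > CV_5PCT_DF2)  # 5.333 2 False
```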

36
Q

What distribution do we get CVs from for contingency tables?

A

Chi squared

37
Q

How do we work out degrees of freedom for contingency tables?

A

DOF = (r - 1)(c - 1)

r=Number of rows
c=number of columns

38
Q

When do we reject H0 for contingency tables?

A

If test statistic is greater than CV given by chi squared distribution