Non-Parametric Tests Flashcards by Ella Grant

Problems with parametric tests

They rely on strong assumptions, e.g. normality, or n large enough to invoke the CLT.

What are non-parametric tests?

Tests valid over a wide range of distributions and can be carried out making far fewer assumptions about the random variable.

What is the simplest non-parametric test called?

The sign test

What type of data does the sign test analyse?

Matched pairs

Briefly describe the process of setting up the sign test

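Assign + if 1st value > 2nd
Assign - if 1st value < 2nd
Treat each pair as a Bernoulli trial (+ or -)
Under H0, p = 0.5. Repeated Bernoulli trials = binomial
W ~ B(n, 0.5), where W counts the signs of one kind. The p value is the probability of a W at least as extreme as the one we observe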

For sign test how do we calculate the p value if the test is two sided?

We work out the tail probability using the binomial, e.g. with n = 9: 9C0 (0.5)^9 + 9C1 (0.5)^9 = 0.02
P value for a 2 sided test = 2 x 0.02 = 0.04
0.04 < 0.05 (alpha), therefore we reject H0
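
The calculation on this card can be checked with a short Python sketch. The function name `sign_test_p_value` and the example counts (8 plus signs, 1 minus sign) are illustrative assumptions, not part of the deck; zero differences are assumed to have been discarded already.

```python
from math import comb


def sign_test_p_value(n_plus: int, n_minus: int, two_sided: bool = True) -> float:
    """Exact sign-test p value under H0: p = 0.5 (hypothetical helper).

    Zero differences must already be discarded, so n = n_plus + n_minus.
    """
    n = n_plus + n_minus
    w = min(n_plus, n_minus)  # count of the rarer sign
    # P(W <= w) when W ~ B(n, 0.5)
    tail = sum(comb(n, k) for k in range(w + 1)) * 0.5 ** n
    # Double the tail for a two-sided test, capped at 1
    p = 2 * tail if two_sided else tail
    return min(p, 1.0)


p = sign_test_p_value(n_plus=8, n_minus=1)
print(round(p, 4))  # 0.0391, i.e. about 0.04 as on the card
```

With alpha = 0.05 this gives the same decision as the card: the p value is below alpha, so H0 is rejected.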

For a sign test if n>25, how do we work out the probability?

Invoke the CLT, so W/n (the sample proportion of signs) is approx normal with mean 0.5 and sigma^2 = 0.5^2 / n.

Z = (W/n - 0.5) / sqrt(0.5^2 / n)
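
As a rough illustration of the large-sample version, the Z statistic can be computed as below. The names `sign_test_z` and `two_sided_p`, and the example values w = 10 and n = 36, are assumptions made for this sketch.

```python
from math import sqrt
from statistics import NormalDist


def sign_test_z(w: int, n: int) -> float:
    """Z = (W/n - 0.5) / sqrt(0.5^2 / n), the normal approximation under H0."""
    return (w / n - 0.5) / sqrt(0.5 ** 2 / n)


def two_sided_p(z: float) -> float:
    """Two-sided p value from the standard normal distribution."""
    return 2 * NormalDist().cdf(-abs(z))


z = sign_test_z(w=10, n=36)
print(round(z, 3))  # -2.667
print(two_sided_p(z) < 0.05)  # True, so H0 would be rejected at the 5% level
```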

Problem with sign test

Ignores magnitude: it treats a large negative difference the same as a small negative difference. Collapsing everything to 0 or 1 throws away a lot of information, so the test has low power when n is small. We are more likely to make a type 2 error (failing to reject H0 when it is false), so we often find an insignificant test statistic.

What do we do with zero differences in the sign test?

We discard them and reduce n by 1 for each pair discarded.

The sign test & sign rank test are only applicable for…

Matched pairs

How does the sign rank test differ from sign test?

It accounts for magnitude of the difference as well as sign

Describe sign rank test

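Work out the difference for each matched pair
Rank the absolute differences in ascending order of magnitude
If two values have the same magnitude, assign the average rank
Sum the ranks of the positive differences (R+) and of the negative differences (R-) separately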

What is our test statistic for the sign rank test?

T = MIN {R+, R-}
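
The ranking steps and the statistic T can be sketched in Python as follows. The helper names and the example data are invented for illustration, and dropping zero differences before ranking is an assumption carried over from the sign test card.

```python
def average_ranks(values):
    """Ranks 1..n in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + 1 + j + 1) / 2  # average of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def signed_rank_statistic(first, second):
    """Return (R+, R-, T) with T = min(R+, R-) for matched pairs."""
    diffs = [a - b for a, b in zip(first, second) if a != b]  # assumption: drop zeros
    ranks = average_ranks([abs(d) for d in diffs])
    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return r_plus, r_minus, min(r_plus, r_minus)


print(signed_rank_statistic([10, 12, 9, 15, 8], [8, 13, 9, 11, 10]))  # (6.5, 3.5, 3.5)
```

In this example the differences are 2, -1, 0, 4 and -2; the zero is dropped, and the two differences of magnitude 2 share rank 2.5.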

Under H0 for the sign rank test, what are E(T) and V(T)?

E(T) = n(n+1)/4
V(T) = n(n+1)(2n+1)/24

When n<25, how do we work out our critical values (CVs) for the sign rank test?

Use the tables given in the formula sheet. The correct value of alpha depends on whether the test is 1 or 2 sided.

When do we reject H0 for the sign rank test? Why?

If our test statistic T < CV

Because T is the minimum of R+ and R-, a small T is the evidence against H0

If n>25, what do we do for the sign rank test?

Invoke the CLT: T is approx normal

Our test statistic is given by Z = [T - n(n+1)/4] / sqrt[n(n+1)(2n+1)/24]
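
A minimal sketch of this standardisation, using E(T) = n(n+1)/4 and V(T) = n(n+1)(2n+1)/24 from the earlier card. The function name `signed_rank_z` and the example values T = 120, n = 30 are assumptions for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def signed_rank_z(t: float, n: int) -> float:
    """Standardise T with its mean and variance under H0."""
    mean_t = n * (n + 1) / 4
    var_t = n * (n + 1) * (2 * n + 1) / 24
    return (t - mean_t) / sqrt(var_t)


z = signed_rank_z(t=120, n=30)
p_two_sided = 2 * NormalDist().cdf(-abs(z))
print(round(z, 2))  # -2.31
print(p_two_sided < 0.05)  # True, so H0 would be rejected at the 5% level
```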

Limitation of sign rank test

Ignores the spread of the data: if the highest absolute difference is 2, it is given rank n; if the highest is 100, it is still given rank n. Ranking may therefore compress or stretch the data. Less powerful than a parametric test, but more powerful than the sign test.

When n>25 for sign rank test, when do we reject?

Reject if p value is less than the significance level (same as usual hypothesis testing)

When is the Mann-Whitney test applicable?

We can use it even if we don't have matched pairs. Use it with two independent random samples to test for a difference in location (e.g. in means).

How do we rank equal magnitudes in the sign rank test?

Average rank, e.g. if two numbers are to be ranked 4 & 5, give them both rank (4+5)/2 = 4.5

Describe the Mann Whitney test

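Rank all n1 + n2 observations together, but preserve the colour (i.e. keep track of which sample each observation came from)
Equal values are given an average rank
Sum the ranks of sample 1 to get R1
Work out U (see formula sheet), E(U) & V(U) and then the test statistic
If n>25, approx normal.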
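
The deck defers to the formula sheet for U, E(U) and V(U), so the sketch below assumes the common textbook forms U = n1 n2 + n1(n1+1)/2 - R1, E(U) = n1 n2 / 2 and V(U) = n1 n2 (n1 + n2 + 1) / 12. These may differ from the formula sheet, and the function names and data are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values in ascending order; ties get the average of their ranks."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1  # 1-based position of the first copy
        count = ordered.count(v)
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]


def mann_whitney(sample1, sample2):
    """Return (R1, U, Z) using the assumed textbook formulas."""
    n1, n2 = len(sample1), len(sample2)
    # Rank the pooled data; the first n1 ranks belong to sample 1
    ranks = average_ranks(list(sample1) + list(sample2))
    r1 = sum(ranks[:n1])
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 * (n1 + n2 + 1) / 12
    return r1, u, (u - mean_u) / sqrt(var_u)


r1, u, z = mann_whitney([1, 2, 3], [4, 5, 6])
print(r1, u, round(z, 3))  # 6.0 9.0 1.964
```

The tiny samples here only show the mechanics; the normal approximation is meant for the larger samples the card describes.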

When do we reject for Mann Whitney test?

If n>25, approx normal = Z test
Double the p value for a two sided test
Reject if the p value is less than the significance level

When do we use goodness of fit test?

When outcomes fall into k discrete categories (it can also be used for continuous data, but the data must be grouped into discrete categories first)

Describe goodness of fit test

Calculate Ei = n pi for each of the K categories
See the formula sheet to calculate the test statistic using the simpler version
The test statistic follows a chi squared distribution
Reject if test stat > CV given by chi squared

When do we reject H0 for goodness of fit test?

If the test statistic is greater than the CV given by chi squared distribution

What distribution do we get the CV from for a goodness of fit test?

Chi squared

DOF for CV for goodness of fit test =

DOF = K - 1 where K is the number of different categories

What is a condition for the goodness of fit test to be appropriate? How can we solve it?

Ei should not be <5 for any category; if it is, aggregate that category with another one.

What is H0 usually for goodness of fit test ?

H0 = all outcomes equally likely
So pi = 1/k
Ei = n/k for each category
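
The goodness of fit cards can be illustrated with a short sketch. It assumes the standard Pearson form, the sum of (Oi - Ei)^2 / Ei, which may not be the "simpler version" on the formula sheet. The function name and the die-roll counts are made up, and 11.070 is the tabulated 5% chi squared critical value for 5 degrees of freedom.

```python
def goodness_of_fit(observed, probabilities=None):
    """Return (test statistic, degrees of freedom, expected counts)."""
    n = sum(observed)
    k = len(observed)
    if probabilities is None:
        probabilities = [1 / k] * k  # H0: all outcomes equally likely
    expected = [n * p for p in probabilities]  # Ei = n * pi
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, k - 1, expected


# Hypothetical counts from 60 rolls of a die
stat, dof, expected = goodness_of_fit([8, 12, 10, 9, 11, 10])
CV_5_PERCENT_DOF_5 = 11.070  # from chi squared tables
print(round(stat, 3), dof, stat > CV_5_PERCENT_DOF_5)  # 1.0 5 False
```

Here the test statistic is well below the critical value, so H0 (a fair die) is not rejected. Every Ei is 10, so the Ei >= 5 condition holds without merging categories.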

What data are contingency tables used for?

Where we have a two way table with K categories for variable A and H categories for variable B, so we have KH cross classifications.

Why don't we use hypothesis testing or ANOVA instead of contingency tables?

Hypothesis tests are limited to two groups
ANOVA allows >2 groups but requires the assumption of normality

What are contingency tables another form of?

Goodness of fit test but with a two way table rather than one

What is H0 for contingency table?

H0 = variables are not related
H1 = variables are related

How do we work out the expected values for contingency tables?

Eij = n pij
Since under H0 the variables are independent, pij = p(i) x p(j)

What distribution do we get CVs from for contingency tables?

Chi squared

How do we work out degrees of freedom for contingency tables?

DOF = (r - 1)(c - 1)
r = number of rows
c = number of columns
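
A minimal sketch of the contingency table calculation. It assumes p(i) and p(j) are estimated by (row total)/n and (column total)/n, so Eij = (row total x column total)/n, and that the test statistic is the Pearson sum of (Oij - Eij)^2 / Eij. The function name and the 2 x 2 table are illustrative only.

```python
def contingency_test(table):
    """Return (test statistic, degrees of freedom, expected counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Under H0 (independence): Eij = n * p(i) * p(j) = row_i * col_j / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof, expected


stat, dof, expected = contingency_test([[10, 20], [20, 10]])
print(round(stat, 3), dof)  # 6.667 1
```

With 1 degree of freedom the tabulated 5% critical value is 3.841, so this example table would lead to rejecting H0.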

When do we reject H0 for contingency tables?

If test statistic is greater than CV given by chi squared distribution

Non-Parametric Tests Flashcards

(38 cards)