L8: Nonparametric testing Flashcards
When should you use nonparametric tests?
- When assumptions are violated
- e.g. strong non-normality
- When the variable is ordinal
- e.g. finishing positions (1st, 2nd, 3rd) in Mario Kart
- When unsure about outliers
Parametric vs Nonparametric
Differences?
Look at figure 1.
Important note: Nonparametric tests still assume random samples; they can’t fix non-random sampling.
(Last row isn’t as important because we now have computers that can handle everything.)
What’s the most common solution for analysing weirdly distributed data?
Ranking the data
The nonparametric tests covered here all rely on ranking to overcome distributional problems.
What is ranking?
A way of handling data that allows you to deal with extreme data.
You assign ranks to the data, going from lowest score to the highest score.
This means that you can handle outliers well.
Look at figure 2 to see what I mean
The last column has the same ranking despite very different scores on each variable.
How do you deal with ties when ranking individuals?
Find the mean ranking of the individuals that have tied.
Look at figure 3 for an example.
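If you want to see ranking and tie handling outside of the figures, here is a minimal Python sketch. The function name `average_ranks` and the scores are made up for illustration; they are not from the lecture or JASP.

```python
def average_ranks(scores):
    """Rank scores from lowest (rank 1) to highest; tied scores share the mean of their ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        # Find the run of sorted positions that hold the same score.
        end = start
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        # Positions start..end would get ranks start+1..end+1; tied scores get their mean.
        mean_rank = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


# Hypothetical scores: two people tie on 12, and 250 is an extreme outlier.
print(average_ranks([7, 12, 12, 15, 250]))  # [1.0, 2.5, 2.5, 4.0, 5.0]
# Swapping the outlier for an ordinary score leaves the ranks unchanged.
print(average_ranks([7, 12, 12, 15, 16]))   # [1.0, 2.5, 2.5, 4.0, 5.0]
```

The two tied scores would have taken ranks 2 and 3, so each gets 2.5. The outlier only ever gets the top rank, no matter how extreme it is.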
What is the general procedure for nonparametric tests?
- Assumption: Independent random samples
- Hypothesis
- H0: equal population distributions (implies equal mean ranking)
- HA: Unequal mean ranking (two sided)
- HA: Higher mean ranking for one group (one sided)
- Test statistic is the difference between the groups’ mean rankings or summed rankings
- Standardize test statistic to normal sampling distribution
- Calculate P-value one or two sided
- Conclude to reject H0 if p < alpha
What is the Wilcoxon rank-sum test?
Nonparametric version of the independent two-sample t-test
Also known as the Mann-Whitney U test
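What’s the main expectation of a Wilcoxon rank-sum test?
By ranking all values together and then summing the ranks per group, one would expect under the null hypothesis that the two sums of ranks are approximately equal (for equally sized groups; otherwise it is the mean ranks that should be about equal).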
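A small Python sketch of "rank everything together, then sum per group". The helper `rank_sums` and both groups of scores are hypothetical, just to make the idea concrete.

```python
def rank_sums(group_a, group_b):
    """Rank the pooled scores (ties get the mean rank) and sum the ranks per group."""
    pooled = sorted(group_a + group_b)

    def rank(value):
        first = pooled.index(value) + 1   # first 1-based position of this score
        count = pooled.count(value)       # how many times the score occurs
        return first + (count - 1) / 2    # mean of the positions it occupies

    return sum(rank(v) for v in group_a), sum(rank(v) for v in group_b)


# Hypothetical scores for two groups of four people.
group_a = [3, 5, 8, 9]
group_b = [4, 10, 12, 15]
print(rank_sums(group_a, group_b))  # (13.0, 23.0)
```

The eight ranks add up to 36, so under H0 each group would be expected to land near 18. Here group A sits below that and group B above it.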
The next few flashcards cover how to perform a rank-sum test. It’s kind of confusing, so stay with me.
(flip the card for good news)
JASP does this all for you, so just try and comprehend how the test works, rather than memorise it.
After summing the ranks per group, how do you find W?
If the group sizes are equal, W is the lower of the two rank sums.
If the group sizes are unequal, W is the rank sum of the group with fewer people.
JASP produces a value called U. It is W - W.min.
W.min = the minimum possible value of W.
How do you find W.min?
The minimum value depends on the sample size.
If each group has 10 participants, the lowest possible sum of ranks would be 1 + 2 + 3 + … + 10 = 55.
So, W.min = sum(1:n) = n(n + 1)/2, with n being the size of the group that W comes from.
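Here is a rough Python sketch of picking W, finding W.min, and turning them into U. The function names and the rank sums 90 and 120 are invented for the example. Whether JASP picks the group the same way is not something these cards confirm, so treat the group choice as following the rule on the cards only.

```python
def pick_w(sum_a, n_a, sum_b, n_b):
    """Return (W, n): the rank sum used as W and the size of the group it comes from."""
    if n_a != n_b:
        # Unequal sizes: W is the rank sum of the smaller group.
        return (sum_a, n_a) if n_a < n_b else (sum_b, n_b)
    # Equal sizes: W is the lower of the two rank sums.
    return (sum_a, n_a) if sum_a <= sum_b else (sum_b, n_b)


def w_min(n):
    """Lowest possible rank sum for a group of n people: 1 + 2 + ... + n."""
    return n * (n + 1) // 2


# Hypothetical rank sums for two groups of 10 (together they must add up to 210).
w, n = pick_w(90, 10, 120, 10)
u = w - w_min(n)
print(w, w_min(n), u)  # 90 55 35
```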
How do you find the Mean W under the null hypothesis?
Look at figure 4.
It depends on your sample sizes. Under H0 every group is expected to have the same mean rank, (N + 1)/2 with N the total sample size, so a group of n people has an expected rank sum of n(N + 1)/2.
Besides the formula in figure 4, how else could you find the Mean W?
You’ve already found W.min. If you also find the maximum value of W, you can average the two values to get the mean.
To find the max value of W, do the same as for W.min, but with the highest ranks (11:20 in the case of 2 groups of 10).
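Both routes to the mean W can be checked against each other in a few lines of Python. Figure 4 is not reproduced on these cards, so the closed form below is the standard textbook one, n1(n1 + n2 + 1)/2, and is assumed to match it. The function names are made up.

```python
def mean_w_formula(n1, n2):
    """Expected rank sum under H0 for the group of size n1 (standard closed form)."""
    return n1 * (n1 + n2 + 1) / 2


def mean_w_from_extremes(n1, n2):
    """Average of the lowest and highest rank sums a group of size n1 could get."""
    total = n1 + n2
    lowest = sum(range(1, n1 + 1))                   # ranks 1..n1
    highest = sum(range(total - n1 + 1, total + 1))  # the n1 highest ranks
    return (lowest + highest) / 2


# Two groups of 10: W.min = 55, W.max = 155, so the mean is 105 either way.
print(mean_w_formula(10, 10), mean_w_from_extremes(10, 10))  # 105.0 105.0
```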
Almost there. How do you calculate the standard error of W? (So the standard deviation of W’s sampling distribution.)
Look at Figure 5.
(two points.
1. The 12 on the bottom comes from the variance of the ranks 1 to N, which is (N² − 1)/12; Johnny didn’t say and the book didn’t say, but that is where it comes from
2. You don’t need to know the formula, I’m just going step by step)
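If the formula feels like it fell from the sky, this Python sketch compares it with a brute-force check on a tiny example. Figure 5 is not shown on these cards, so the formula used here is the standard one for data without ties, sqrt(n1 × n2 × (n1 + n2 + 1) / 12), and is assumed to be the same. Function names are hypothetical.

```python
import math
from itertools import combinations
from statistics import pstdev


def se_w(n1, n2):
    """Standard error of the rank sum W under H0, assuming no ties."""
    return math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)


def se_w_brute_force(n1, n2):
    """SD of the rank sum over every possible way of handing n1 of the ranks to one group."""
    ranks = range(1, n1 + n2 + 1)
    all_sums = [sum(chosen) for chosen in combinations(ranks, n1)]
    return pstdev(all_sums)


print(round(se_w(10, 10), 2))                              # 13.23
print(math.isclose(se_w(3, 4), se_w_brute_force(3, 4)))    # True
```

The brute-force version only works for small groups, because the number of possible splits grows very quickly.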
At last, you have the mean W, the SE of W, and the W of your data (being the lowest sum rank score).
What do you do now?
Hint, block 1 stats is coming to haunt you
That’s right! Calculate the Z-score!
Figure 6 shows you how.
It’s the same as a z-score for a mean, but with W in place of the mean: z = (W - mean W) / SE of W.
Pretty nifty, right?
(end my suffering)
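Putting the last three cards together, a minimal Python sketch of the z-score. It assumes the standard no-ties formulas for the mean and SE of W (figures 4 to 6 are not shown here), and the W of 90 for two groups of 10 is a made-up value.

```python
import math


def z_score(w, n1, n2):
    """z for a rank sum w from the group of size n1 (n2 is the other group's size)."""
    mean_w = n1 * (n1 + n2 + 1) / 2                    # expected W under H0
    se_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)     # SE of W under H0, no ties
    return (w - mean_w) / se_w


# Hypothetical result: W = 90 with two groups of 10.
print(round(z_score(90, 10, 10), 3))  # -1.134
```

Because W is the lower rank sum, the z-score comes out negative (or zero).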
What do you do with your prized Z-score?
Calculate the p-value to see if your result is significant.
Make sure to specify whether it’s one sided or two sided, based on your hypothesis.
If it’s significant, you’re done!
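For the curious, the z-to-p step can be sketched in Python with only the standard library. This uses the plain normal approximation with no continuity or tie correction, so JASP's p-value may differ a little. The function names and the `alternative` labels are my own choices, not JASP's.

```python
import math


def normal_cdf(z):
    """P(Z <= z) for a standard normal variable."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def p_value(z, alternative="two-sided"):
    """p-value for a z-score; alternative is 'two-sided', 'less' or 'greater'."""
    if alternative == "two-sided":
        return 2 * normal_cdf(-abs(z))   # both tails
    if alternative == "less":
        return normal_cdf(z)             # lower tail only
    if alternative == "greater":
        return 1 - normal_cdf(z)         # upper tail only
    raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")


print(round(p_value(-1.96), 3))           # 0.05
print(round(p_value(-1.96, "less"), 3))   # 0.025
```

The one-sided p-value is half the two-sided one when the effect is in the predicted direction, which is why the choice has to follow from your hypothesis and not from the data.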
Hol’up, did you seriously forget to calculate the effect size… -.-
Do better.
What’s the name of the effect size you use? How do you calculate it?
(IMPORTANT)
Rank-biserial correlation
Formula 7
(You don’t need to know how to calculate it, but remember its name and what type of number it is)
It expresses your W as a correlation.
Therefore, the closer it is to -1 or 1, the stronger the effect.
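Formula 7 is not shown on these cards, so the Python sketch below uses one common way of getting the rank-biserial correlation: compare every person in one group with every person in the other, then take the proportion of pairs where group A scores higher minus the proportion where it scores lower. This may differ in form or sign convention from formula 7 and from JASP's output. The data are the same made-up scores as a hypothetical example.

```python
def rank_biserial(group_a, group_b):
    """Proportion of (a, b) pairs with a > b minus the proportion with a < b."""
    higher = sum(1 for a in group_a for b in group_b if a > b)
    lower = sum(1 for a in group_a for b in group_b if a < b)
    return (higher - lower) / (len(group_a) * len(group_b))


# Hypothetical scores for two groups of four people.
group_a = [3, 5, 8, 9]
group_b = [4, 10, 12, 15]
print(rank_biserial(group_a, group_b))  # -0.625
```

Out of 16 pairs, group A is higher in 3 and lower in 13, so the correlation is (3 - 13) / 16. A value of 0 would mean the pairs split evenly.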
How can you convert your rank biserial correlation into a percentage of support?
((1 + rbs)/2) x 100%, e.g. an effect size of 0.29 gives 64.5%, so about 65% of the ranked data support the idea that …
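The conversion is simple enough to check in one line of Python. The function name `support_percent` is made up.

```python
def support_percent(rbs):
    """Convert a rank-biserial correlation (-1 to 1) into a percentage of support (0 to 100)."""
    return (1 + rbs) / 2 * 100


print(round(support_percent(0.29), 1))  # 64.5
print(support_percent(0.0))             # 50.0
```

An effect size of 0 maps to 50%, meaning the data are split evenly, and an effect size of 1 maps to 100%.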