L08-L09 Non-Parametric Tests Flashcards
Under what conditions are non-parametric tests used for comparison of data?
Also known as distribution-free tests
Used when assumptions for parametric tests are NOT met.
Place fewer restrictions on underlying distributions of samples
- Do NOT assume normal distributions
- Do NOT assume equal variances
What are the advantages of using non-parametric tests as compared to parametric tests for comparison of data?
1) Do not incorporate all the restrictive assumptions characteristic of parametric tests
2) Use ranks, rather than actual values of observations, thus:
- Can be performed relatively quickly for small samples
- Less sensitive to measurement error and outlying values due to use of ranks
- Suitable for ordinal data, in addition to non-normally-distributed continuous data.
What are the disadvantages of using non-parametric tests?
Less powerful than parametric tests IF the assumptions underlying a parametric test are indeed satisfied
- ~95% as powerful as the analogous parametric tests
- Require a larger sample size than the analogous parametric test if the same statistical power is needed
- e.g. if a parametric test requires 19 observations to achieve a certain statistical power, the non-parametric test requires 20 observations to achieve the same power (19/20 = 95%).
State the purpose behind the hypothesis testing of Wilcoxon rank-sum test (i.e. Mann-Whitney U test).
To test the H0 that the two population medians corresponding to the two random samples are equal.
If H0 is true, there is NO difference between the medians of the two underlying populations.
- Thus, we would expect the ranks to be distributed randomly between two groups.
- Average ranks for each of the samples should be approximately equal.
- If the sample sizes are at least 10, the distribution of W is approximately normal, so the standardized statistic Z is approximately standard normal; otherwise, refer to tables of critical values for small samples.
- Test this hypothesis by calculating the Z statistic and its p-value, comparing the p-value against alpha, and concluding whether there is a statistically significant difference.
State the assumptions when using Wilcoxon rank-sum tests.
1) The samples are random samples of their populations.
2) The two underlying populations are independent.
Between “exact sig.” and “asymptotic sig.”, which p-value should be used to determine if there is a statistically significant result when performing a Mann-Whitney U test?
Exact sig. as far as possible.
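To see why the exact p-value is preferred for small samples, the sketch below computes both p-values by hand. It is an illustration only, not the routine any particular statistics package uses: the function names and the data are hypothetical, the exact p-value enumerates every way of assigning the pooled ranks to the first group, and the asymptotic p-value assumes the usual normal approximation without a tie or continuity correction.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """Rank from smallest to largest; tied values share their average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def exact_and_asymptotic_p(sample_a, sample_b):
    """Two-sided p-values for the rank sum of sample_a (hypothetical helper)."""
    n_a, n_b = len(sample_a), len(sample_b)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    observed = sum(ranks[:n_a])
    mean_w = n_a * (n_a + n_b + 1) / 2
    # Exact: share of all possible rank assignments at least as far from the mean.
    sums = [sum(c) for c in combinations(ranks, n_a)]
    extreme = sum(1 for s in sums if abs(s - mean_w) >= abs(observed - mean_w))
    exact_p = extreme / len(sums)
    # Asymptotic: normal approximation, no tie or continuity correction (assumption).
    sd_w = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    asymptotic_p = 2 * NormalDist().cdf(-abs(observed - mean_w) / sd_w)
    return exact_p, asymptotic_p


# Made-up data: two completely separated samples of three observations each.
exact_p, asymptotic_p = exact_and_asymptotic_p([1, 2, 3], [4, 5, 6])
print(f"exact p = {exact_p:.3f}, asymptotic p = {asymptotic_p:.3f}")
```

With these made-up samples only 2 of the 20 possible rank assignments are as extreme as the observed one, so the exact p-value is 0.10, while the normal approximation gives a value just under 0.05. The two can therefore lead to different conclusions when samples are small.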
E.g. of how to write conclusion of Wilcoxon rank-sum test.
There is no statistically significant difference between the median (IQR) age of the individuals attending the Chinese calligraphy and Taiji classes [57.0 (50.0 - 62.0) vs 60.5 (56.3 - 66.8), p = 0.107] at a significance level of 0.05.
State the purpose behind the hypothesis testing of Kruskal-Wallis test.
To test the H0 that all the population medians corresponding to the random samples are equal.
- H0: All the medians of the underlying populations are the same.
- H1: Not all the medians of the underlying populations are the same OR The medians of at least two of the underlying populations are different.
If H0 is true, all the medians of the underlying populations are the same.
- Thus, we would expect the ranks to be distributed randomly among the groups.
- Average ranks for each of the samples should be approximately equal.
- If n is at least 5 for each sample, the distribution of H approximates a chi-square distribution with (no. of groups - 1) degrees of freedom; otherwise, refer to tables of critical values for small samples.
- Test this hypothesis by calculating the H statistic and its p-value, comparing the p-value against alpha, and concluding whether there is a statistically significant difference.
State the assumptions when using Kruskal-Wallis test.
1) The samples are random samples of their populations.
2) The underlying populations are independent.
E.g. of how to write conclusion of Kruskal-Wallis test.
At a significance level of 0.05, NOT all the median times to sleep after administration of the known sedative (control), the high dose and low dose of the experimental compound are the same.
OR
At a significance level of 0.05, the median times to sleep in AT LEAST TWO of the three populations are different.
If p-value < 0.05, proceed w/ post-hoc analysis (i.e. multiple comparisons procedures) using Wilcoxon rank-sum test for each pairwise comparison to determine where the differences lie.
- At this stage, we only know that a difference exists BUT we do not know where the differences are.
- If p-value >= 0.05, end conclusion here.
How are post-hoc tests conducted in Kruskal-Wallis test?
1) Conduct pairwise comparisons using Wilcoxon rank-sum test, and
2) Use Bonferroni adjustment to determine if post-hoc tests are significant (i.e. compare each pairwise p-value against alpha divided by the number of pairwise comparisons).
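The Bonferroni step can be sketched in a few lines of Python. This is a minimal illustration, assuming the pairwise Wilcoxon rank-sum p-values have already been obtained; the group labels follow the sedative example above, but the p-values and the function name are made up for illustration.

```python
from itertools import combinations


def bonferroni_decisions(pairwise_p, alpha=0.05):
    """Compare each pairwise p-value against alpha / number of comparisons."""
    threshold = alpha / len(pairwise_p)
    decisions = {pair: p < threshold for pair, p in pairwise_p.items()}
    return threshold, decisions


groups = ["control", "high dose", "low dose"]
pairs = list(combinations(groups, 2))  # 3 groups give 3 pairwise comparisons

# Hypothetical p-values from three pairwise Wilcoxon rank-sum tests.
pairwise_p = dict(zip(pairs, [0.004, 0.030, 0.200]))

threshold, decisions = bonferroni_decisions(pairwise_p)
for pair, significant in decisions.items():
    print(pair, "significant" if significant else "not significant")
```

Note that a p-value of 0.030 would be significant at 0.05 on its own, but not against the adjusted threshold of 0.05/3.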
State the purpose behind the hypothesis testing of Wilcoxon signed-rank test.
To test the H0 that the median of the underlying population of differences in values of each pair is zero.
If H0 is true, the median of the underlying population of differences = 0.
- Thus, we would expect the sample to have an approximately equal number of positive & negative ranks AND the sum of positive ranks to be comparable in magnitude to sum of negative ranks.
- If n (the no. of pairs with non-zero differences) is at least 16, the distribution of T is approximately normal, so the standardized statistic Z is approximately standard normal; otherwise, refer to tables of critical values for small samples.
- Test this hypothesis by calculating the Z statistic and its p-value, comparing the p-value against alpha, and concluding whether there is a statistically significant difference.
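The reasoning above can be sketched in Python as follows. This is a sketch under stated assumptions rather than the slides' exact calculation: it takes T as the smaller of the positive and negative rank sums and uses the common textbook normal approximation (mean n(n+1)/4, variance n(n+1)(2n+1)/24) with no tie or continuity correction. The function names and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """Rank from smallest to largest; tied values share their average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def signed_rank_z(first, second):
    """Return (T, Z, two-sided p) for paired observations (hypothetical helper)."""
    # Differences within each pair; pairs with a zero difference are dropped.
    diffs = [a - b for a, b in zip(first, second) if a != b]
    n = len(diffs)
    # Rank the absolute differences, then split the ranks by sign.
    ranks = average_ranks([abs(d) for d in diffs])
    positive = sum(r for r, d in zip(ranks, diffs) if d > 0)
    negative = sum(r for r, d in zip(ranks, diffs) if d < 0)
    t = min(positive, negative)
    # Normal approximation (assumed textbook form, no tie correction).
    mean_t = n * (n + 1) / 4
    sd_t = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (t - mean_t) / sd_t
    p = 2 * NormalDist().cdf(-abs(z))
    return t, z, p


# Made-up paired measurements; the last pair has zero difference and is dropped.
before = [10, 12, 14, 16, 20]
after = [9, 10, 11, 12, 20]
t, z, p = signed_rank_z(before, after)
print(f"T = {t}, Z = {z:.3f}, p = {p:.3f}")
```

The example has far fewer than 16 non-zero pairs, so in practice a table of critical values would be used instead of Z; the small data set is only there to make the ranking easy to follow by hand.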
State the assumptions when using Wilcoxon signed-rank tests.
1) The samples are random samples of their populations.
2) The observations are paired, and the underlying distribution of the differences is symmetric.
Outline how the test statistic for Wilcoxon rank-sum test is computed.
1) Pool observations from both samples (i.e. consider all observations together).
2) Rank all observations from smallest to largest, regardless of sample designation.
- Tied (i.e. identical) observations are assigned the same rank, equal to the average of the ranks they would have been assigned had there been no tie
3) Compute the sum of the ranks for each sample
4) Take the smaller of the two sums of ranks as W, then compute the Z statistic from W (refer to slides for calc.)
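The four steps above can be sketched in Python. Since the exact calculation is on the slides, the Z formula here is an assumption: the common textbook normal approximation with mean n_S(n_S + n_L + 1)/2 and variance n_S n_L (n_S + n_L + 1)/12, where n_S is the size of the sample with the smaller rank sum, and with no tie or continuity correction. The function names and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """Rank from smallest to largest; tied values share their average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def rank_sum_z(sample_a, sample_b):
    """Return (W, Z, two-sided p) for two independent samples (hypothetical helper)."""
    # Steps 1 and 2: pool the observations and rank them together.
    ranks = average_ranks(list(sample_a) + list(sample_b))
    # Step 3: sum the ranks belonging to each sample.
    sum_a = sum(ranks[:len(sample_a)])
    sum_b = sum(ranks[len(sample_a):])
    # Step 4: W is the smaller rank sum; n_s is the size of that sample.
    if sum_a <= sum_b:
        w, n_s, n_l = sum_a, len(sample_a), len(sample_b)
    else:
        w, n_s, n_l = sum_b, len(sample_b), len(sample_a)
    mean_w = n_s * (n_s + n_l + 1) / 2
    sd_w = sqrt(n_s * n_l * (n_s + n_l + 1) / 12)
    z = (w - mean_w) / sd_w
    p = 2 * NormalDist().cdf(-abs(z))
    return w, z, p


# Made-up data, kept tiny so the ranks can be checked by hand.
w, z, p = rank_sum_z([1, 2, 3], [4, 5, 6])
print(f"W = {w}, Z = {z:.3f}")
```

Samples this small would normally be assessed with the exact p-value or a table of critical values; the Z statistic is shown only to illustrate the arithmetic.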
Outline how the test statistic for the Kruskal-Wallis test is computed.
1) Pool observations from all samples (i.e. consider all observations together).
2) Rank all observations from smallest to largest, regardless of sample designation.
- Tied (i.e. identical) observations are assigned the same rank, equal to the average of the ranks they would have been assigned had there been no tie
3) Compute the sum of the ranks for each sample
4) Compute H statistic, BUT unlike Wilcoxon rank-sum test, there is NO need to identify smaller sum of ranks (refer to slides for calc.)
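These steps can be sketched in Python as well. The H formula used is an assumption, since the calculation itself is on the slides: the common textbook form H = 12 / (N(N + 1)) * sum(R_i^2 / n_i) - 3(N + 1), with no correction for ties, where R_i is the rank sum and n_i the size of sample i, and N is the total number of observations. The function names and the data are hypothetical.

```python
def average_ranks(values):
    """Rank from smallest to largest; tied values share their average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def kruskal_wallis_h(*samples):
    """Return (H, degrees of freedom) for any number of independent samples."""
    # Steps 1 and 2: pool all observations and rank them together.
    pooled = [x for sample in samples for x in sample]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    # Step 3: sum the ranks per sample and accumulate R_i^2 / n_i.
    total, start = 0.0, 0
    for sample in samples:
        rank_sum = sum(ranks[start:start + len(sample)])
        total += rank_sum ** 2 / len(sample)
        start += len(sample)
    # Step 4: H uses every rank sum, so no "smaller sum" has to be picked.
    h = 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)
    return h, len(samples) - 1


# Made-up times to sleep for three groups, kept tiny for hand-checking.
control, high_dose, low_dose = [1, 2, 3], [4, 5, 6], [7, 8, 9]
h, df = kruskal_wallis_h(control, high_dose, low_dose)
print(f"H = {h:.2f} with {df} degrees of freedom")
```

H would then be compared against the chi-square distribution with df degrees of freedom, subject to the sample-size condition for that approximation.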