Statistics Theory L11 = Alternatives To T-Tools Flashcards
When you can’t use t-tools for your data even after log-transforming (i.e., when the t-distribution assumptions have been grossly violated), which tools should we use instead? (2)
- Rank-sum test for 2 independent samples.
- Signed-rank test for paired samples.
When to use these alternatives? (2)
- When outliers are present.
- When the sample sizes are too small to assess distributional assumptions.
Rank-sum test for 2 independent samples?
= a two-sample test where ratio/interval data are transformed to ordinal (rank) data.
Rank-sum test attributes? (4)
- Resistant alternative to the 2-sample t-test.
- Almost as good with normal distributions.
- Better with extreme outliers.
- Best for simple situations/comparisons.
Aspects we talk about under the Rank-sum test for 2 independent samples? (4)
- Rank transformation.
- Rank-sum statistic.
- Finding p-value with the normal approximation.
- CIs based on the Rank-sum statistic.
Rank transformation?
= where we replace each observation with its rank in the combined sample.
Why/How is this transformation type different from the others? (2)
- A single transformed value depends on all the data, so a transformed value of Y = 63.925 could have a different rank value depending on the other values in the dataset.
- No reverse transformation, like exp for log.
So why use ranks?
We use ranks to transform our data to a scale (ordinal) that eliminates the importance of the population distribution.
Purpose of using ranks? (3)
- Resistant to outliers; values of the original sample might change drastically, but the ranks do not.
- Other transformations change the distribution of the sample, but don’t change the ranks of the values.
- More easily accommodates censored observations.
Steps to calculate the Rank-sum statistic? (5)
(1) List all observations from both samples in increasing order (1st column).
(2) Make a 2nd column (Labels) where you record which group each observation came from.
(3) Make a 3rd column (“Order”) with integers from 1 to n1+n2.
(4) Make a 4th column (“Rank”) where you replace draws/ties with an average order.
(5) Calculate T = the sum of ranks for group 1.
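The five steps above can be sketched in pure Python (standard library only; the function names and any sample values are hypothetical, for illustration):

```python
# Sketch of the rank-sum statistic: rank the combined sample,
# average the orders of tied values, then sum group 1's ranks.

def ranks_with_ties(values):
    """Replace each value with its rank in the sample; tied values
    share the average of the orders they occupy."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1                      # find the run of tied values
        rank_of[ordered[i]] = (i + 1 + j) / 2   # mean of orders i+1 .. j
        i = j
    return [rank_of[v] for v in values]

def rank_sum_T(group1, group2):
    """T = sum of the ranks (in the combined sample) belonging to group 1."""
    combined = list(group1) + list(group2)
    r = ranks_with_ties(combined)
    return sum(r[: len(group1)])
```

For example, `rank_sum_T([1, 2, 3], [4, 5])` gives T = 6 (ranks 1 + 2 + 3), and ties are averaged as in step (4).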
NB under the Rank-sum statistic? (3)
- We can generate a histogram of the T values under this scenario, to judge how likely our sample T would be under a hypothesis of no difference between groups.
- Histogram = the “randomization distribution of T”.
- This distribution can be approximated by the normal distribution in most circumstances, unless n < 5 or there are many ties/draws.
Rank-sum test for 2 independent samples is AKA? (2)
- Wilcoxon test.
- Mann-Whitney test.
Eg of Rank-sum test for 2 independent samples?
Cognitive Load Theory in Teaching example.
Cognitive Load Theory in Teaching example? (5)
(1) List all observations from both samples in increasing order (Y = response variable, Time).
(2) Identify sample membership of each value (column “Group”) [R converted groups to 1 (Conventional) or 2 (Modified); not sure why].
(3) Write down an increasing sequence for list order (column “Order”).
(4) Modify the orders, if necessary, by giving the average order to tied values (column “Rank”).
(5) Add the ranks from all observations in one group (here the Modified group, coded 2).
Steps to finding the p-value with the normal approximation? (2)
- Part 1: Sampling distribution.
- Part 2: Calculate p-value.
Part 1: Sampling distribution steps? (6)
[a] Calculate the average & SD of ranks for combined sample (call those Ř, SR).
[b] Compute theoretical “null hypothesis” mean & SD for T
(i) Centre: mean (T) = n1 Ř
(ii) Spread: SD (T) = SR √[n1n2 / (n1 + n2)]
(iii) Calculate the z statistic with a continuity correction to accommodate integer data (add 0.5 when the observed rank sum lies below mean (T), subtract 0.5 when it lies above)
z = [observed rank sum ± 0.5 - mean (T)] / SD (T)
(iv) Shape: Approximately normal if n is large & not too many ties/draws.
Part 2: Calculate p-value steps? (4)
(i) Ř = 14.5 ; SR = 8.20.
(ii) Theoretical “null hypothesis”:
n1 = n2 = 14.
mean (T) = (14)(14.5) = 203.
SD (T) = (8.20) √[(14)(14) / (14 + 14)] = 21.70.
(iii) z = [137 + 0.5 - 203] / 21.70 ≈ -3.02.
(iv) Locate this z on the standard normal curve: it lies far in the lower tail, so the one-sided p-value is about 0.001 (a one-sided conclusion).
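Plugging the example's numbers into the formulas above, as a quick arithmetic check (the 0.5 is added here because the observed rank sum lies below the null mean):

```python
import math

n1, n2 = 14, 14
R_bar, S_R = 14.5, 8.20          # mean & SD of the 28 combined ranks
T = 137                          # observed rank sum for one group

mean_T = n1 * R_bar                              # 203.0
sd_T = S_R * math.sqrt(n1 * n2 / (n1 + n2))      # about 21.70
z = (T + 0.5 - mean_T) / sd_T                    # about -3.02
```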
CI’s based on the Rank-sum statistic?
Use the normal approximation and the appropriate R function (look up “wilcox.test”).
Other alternatives for 2 independent samples besides the Rank-sum test for 2 independent samples? (2)
- Permutation/Randomisation tests.
- Welch’s t-test of unequal variances.
When is the Rank-sum test for 2 independent samples ineffective to use? (2)
- When the sample sizes are < 5.
- When there are too many ties (judge at your own discretion).
Therefore, when can we use Permutation/Randomisation tests? (3)
- When sample sizes are small (<5).
- When there are too many ties.
- When the dataset is unbalanced (uneven number of observations in the 2 groups).
Permutation/randomisation tests?
= any test that finds a p-value as the proportion of regroupings - of the observed n1 + n2 numbers into two groups of sizes n1 and n2 - that lead to test statistics as extreme as the observed one. We can change the test statistic to whatever would be useful for answering the question.
Eg of Permutation/Randomisation test?
Shuttle O-ring failures example.
Shuttle O-ring failures example? (4)
A way to approximate this “permutation distribution”:
(i) Randomly rearrange “warm” and “cool” group labels.
(ii) Calculate the rank-sum statistics for the random groups.
(iii) Do this many, many times to generate a sampling distribution for the rank-sum statistic under the null hypothesis of no difference.
(iv) Compare observed rank-sum to the distribution of random differences.
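Steps (i)-(iv) can be sketched as follows (standard library only; the group values used here are made up for illustration, not the actual O-ring data, and the difference in group means is used as the test statistic rather than the rank sum):

```python
import random

def permutation_p_value(warm, cool, n_perm=10_000, seed=0):
    """Two-sided permutation p-value: the proportion of random regroupings
    whose |difference in means| is at least as extreme as the observed one."""
    rng = random.Random(seed)
    combined = list(warm) + list(cool)
    n1 = len(warm)
    observed = abs(sum(warm) / n1 - sum(cool) / len(cool))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(combined)                    # (i) rearrange group labels
        g1, g2 = combined[:n1], combined[n1:]
        stat = abs(sum(g1) / n1 - sum(g2) / len(g2))   # (ii) test statistic
        if stat >= observed:                     # (iv) compare to observed
            hits += 1
    return hits / n_perm                         # (iii) over many regroupings
```

With only 10 000 random regroupings this approximates the exact permutation distribution; listing every regrouping would give the exact p-value.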
Welch’s t-test attributes? (5)
- Does not assume equal variances (use it when the equal-variance assumption fails).
- Uses separate estimates of the standard deviation instead of a pooled standard deviation.
- Even if populations are normal, the exact sampling distribution is unknown.
- Can be approximated with a t-distribution & Satterthwaite’s approximation for the degrees of freedom.
- Works about the same as a standard t-test.
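The statistic and Satterthwaite's approximate degrees of freedom can be sketched with the standard formulas (pure Python; any data passed in is hypothetical):

```python
import math
from statistics import mean, stdev

def welch_t(x, y):
    """Welch's t statistic and Satterthwaite's approximate df,
    using separate (unpooled) variance estimates for the two samples."""
    n1, n2 = len(x), len(y)
    v1 = stdev(x) ** 2 / n1          # variance of the mean, sample 1
    v2 = stdev(y) ** 2 / n2          # variance of the mean, sample 2
    t = (mean(x) - mean(y)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df
```

The df formula typically yields a non-integer value between min(n1, n2) - 1 and n1 + n2 - 2; the p-value then comes from a t-distribution with that df.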
NB for Welch’s t-test?
Differences in the variance between populations might be just as important as differences in the mean (there may be interesting reasons for the difference).
- E.g. two distribution curves of body length with similar centres but different spreads: one wide & one narrow.
Signed-rank test for paired samples?
= the rank-based alternative to the paired t-test.
Signed-rank test for paired samples attributes? (2)
- Takes pairs of values, calculates the differences between them, ranks the absolute differences & keeps each difference's sign (plus or minus).
- Eliminates the effect of outliers.
Signed-rank test for paired samples steps? (4)
(i) Compute differences between pairs.
(ii) Drop zeros (reduce n).
(iii) Order the absolute differences from small to large.
(iv) Sum the ranks for the positive differences (S).
Calculations for the p-value using the normal approximation of the Signed-rank test for paired samples? (4)
- mean (S) = [n(n+1)] / 4
- SD (S) = √[n(n+1)(2n+1) / 24]
- z = [S - mean (S)] / SD (S)
- Use z distribution to get p-values.
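Steps (i)-(iv) plus the normal approximation can be sketched together (standard library only; no continuity correction here, matching the formulas above, and the pairs are hypothetical):

```python
import math

def signed_rank_z(pairs):
    """Signed-rank statistic S and its z-score under the null hypothesis."""
    diffs = [a - b for a, b in pairs]            # (i) pairwise differences
    diffs = [d for d in diffs if d != 0]         # (ii) drop zeros, reduce n
    n = len(diffs)
    # (iii) rank |differences| smallest to largest, averaging tied ranks
    abs_sorted = sorted(abs(d) for d in diffs)
    rank = {}
    i = 0
    while i < n:
        j = i
        while j < n and abs_sorted[j] == abs_sorted[i]:
            j += 1
        rank[abs_sorted[i]] = (i + 1 + j) / 2
        i = j
    S = sum(rank[abs(d)] for d in diffs if d > 0)    # (iv) positive ranks
    mean_S = n * (n + 1) / 4
    sd_S = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return S, (S - mean_S) / sd_S
```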
Eg of Signed-rank test for paired samples?
Schizophrenia example.
Schizophrenia example? (10)
(i) Compute the difference between each pair (column “Diff”).
(ii) Drop zeros from the list (no zeros in this example).
(iii) Order the absolute differences from smallest to largest and assign them their ranks 1 . . . n (or average rank for ties) (column “Rank.diff”).
(iv) The signed-rank statistic (S) is the sum of the ranks from the pairs for which the difference is positive.
S = 1.0 + … + 8.0 + 9.5 + 11.0 + … + 15.0 = 110.5.
(v) mean (S) = [15(15 + 1)] / 4 = 60.
(vi) SD (S) = √[15(15+1)(2(15)+1) / 24] = 17.61.
(vii) z = [110.5 - 60] / 17.61 = 2.87.
(viii) 1-sided p from z distribution: p = 0.002.
(ix) Conclusion: Convincing evidence …
(x) Scope of inference.
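Checking the example's arithmetic for steps (v)-(viii) (the one-sided p-value comes from the standard normal upper tail, computed here via the error function):

```python
import math

n, S = 15, 110.5
mean_S = n * (n + 1) / 4                             # 60.0
sd_S = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)     # about 17.61
z = (S - mean_S) / sd_S                              # about 2.87
p_one_sided = 0.5 * math.erfc(z / math.sqrt(2))      # about 0.002
```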
NB for Signed-rank test for paired samples? (2)
- The more ties we have, the less informative the analysis.
- Don’t take the negative (-) sign into account when ranking: rank the absolute values of the differences.
Practical/Biological significance VS Statistical significance? (5)
- As n increases, t-ratio increases & p-value decreases.
- No matter how small the estimated effect is, a very large sample will produce a p-value suggesting “strong evidence”.
- In observational studies ask: “What is an important effect/difference to detect?” & “Have we detected that effect?”.
- We can’t answer these questions with p-values & tests of statistical hypotheses.
- Instead we use estimates & 95% CIs.