Week 6 - Wilcoxon rank-sum test and Wilcoxon signed-rank test Flashcards
what are parametric tests
- t-tests and ANOVA
- Rely on the assumption that the data comes from a normally distributed population
- What if the assumption of normality is violated?
- You can conduct non-parametric tests
- Non-parametric tests are more flexible:
- Data does not need to come from a normally distributed population
how do non-parametric tests work
- do not analyse the raw values of the dependent variable (DV) directly
- the raw DV values are only used to rank the data
- the analysis is then carried out on the ranks
why don’t we just always perform non-parametric tests
- non-parametric tests have less statistical power when the assumptions of the parametric test are met
- more likely to result in a type II error
how do we assess normality in designs with two independent groups
- Q-Q plots
- Shapiro-Wilk test
- data in each group should follow a normal distribution
what are Q-Q plots
- Plots values you would expect to get if the distribution was normal against your observed values.
- Expected values form a straight diagonal line; observed values are dots
- If normally distributed, dots fall mostly on top of the line
- If not normally distributed, dots do not fall on the line
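The idea behind the Q-Q plot can be sketched in plain Python. This is only an illustration: the scores are hypothetical, the function name `qq_points` is made up for this card, and the plotting position (i - 0.5) / n is one of several conventions in use, so statistical software may give slightly different expected values.

```python
from statistics import NormalDist, mean, stdev


def qq_points(sample):
    """Return (expected, observed) pairs for a normal Q-Q plot."""
    observed = sorted(sample)
    n = len(observed)
    # Reference normal distribution with the sample's own mean and SD.
    reference = NormalDist(mean(observed), stdev(observed))
    pairs = []
    for i, value in enumerate(observed, start=1):
        # Assumed plotting position: (i - 0.5) / n.
        expected = reference.inv_cdf((i - 0.5) / n)
        pairs.append((expected, value))
    return pairs


scores = [4.1, 5.0, 5.2, 5.9, 6.8]  # hypothetical scores for one group
for expected, value in qq_points(scores):
    print(f"expected {expected:.2f}  observed {value:.2f}")
```

If the observed values sit close to the expected values across the whole range, the dots would fall near the diagonal line.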
what is the interpretation of the Shapiro-Wilk test
- p ≤ .05: data is not assumed to come from a normally distributed population
- parametric tests are not appropriate
- consider non-parametric tests
- p > .05: data is assumed to come from a normally distributed population
- parametric tests may be appropriate; proceed with the other assumption checks
which approach is better
- Shapiro-Wilk is less subjective, but with large sample sizes it frequently gives a significant p-value
- consider both outcomes together
- if the group sample size is <50, rely on the Shapiro-Wilk test
- if the group sample size is >50, rely on the Q-Q plot
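what is the theory behind the Wilcoxon rank-sum test
- order the DV values from smallest to largest, ignoring group
- rank the ordered values from smallest to largest (tied values share the average rank)
- add up the ranks per group
- correct for the number of people in each group by calculating the "mean rank", n(n+1)/2, where n is the group size
- subtract the mean rank from each group's sum of ranks to get W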
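The steps above can be sketched in plain Python. This is a minimal illustration, not the code used by any statistics package: the data are hypothetical, the helper names `average_ranks` and `rank_sum_w` are made up for this card, and the "mean rank" is taken to be n(n+1)/2 for a group of size n.

```python
def average_ranks(values):
    """Rank values from small to large; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over any run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_w(group_a, group_b):
    """Return (rank sum A, rank sum B, W for A, W for B)."""
    combined = list(group_a) + list(group_b)
    ranks = average_ranks(combined)
    n_a, n_b = len(group_a), len(group_b)
    sum_a = sum(ranks[:n_a])
    sum_b = sum(ranks[n_a:])
    # Subtract the "mean rank" n(n+1)/2 from each group's rank sum.
    w_a = sum_a - n_a * (n_a + 1) / 2
    w_b = sum_b - n_b * (n_b + 1) / 2
    return sum_a, sum_b, w_a, w_b


group_a = [12, 15, 18, 24, 30]  # hypothetical DV scores
group_b = [10, 11, 14, 16, 17]
print(rank_sum_w(group_a, group_b))  # (35.0, 20.0, 20.0, 5.0)
```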
what if at least one group doesn’t meet the normality assumption
- conduct the non-parametric alternative to the independent-samples t-test (Wilcoxon rank-sum test)
- used when there are two independent groups with no repeated measures
- equivalent to the Mann-Whitney test
how do we describe the direction of the results
- parametric - report the mean
- non-parametric - the median is typically preferred
what is the exact method
- the group labels are rearranged in every possible way, so in the rearranged data we know the null hypothesis is true
- the p-value is how often a difference at least as large as the difference in the true data appears
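As a rough sketch of the exact method in plain Python: every possible way of assigning the ranks to the first group is listed, and the p-value is the share of assignments whose rank sum is at least as far from its expected value as the observed one. The data and function names are hypothetical, and real software may compute the two-sided p-value slightly differently, especially with ties.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from small to large; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def exact_rank_sum_p(group_a, group_b):
    """Two-sided p-value from enumerating every possible group assignment."""
    combined = list(group_a) + list(group_b)
    ranks = average_ranks(combined)
    n_a = len(group_a)
    expected = n_a * (len(combined) + 1) / 2  # rank sum expected under the null
    observed_gap = abs(sum(ranks[:n_a]) - expected)
    total = 0
    as_extreme = 0
    for idx in combinations(range(len(combined)), n_a):
        total += 1
        gap = abs(sum(ranks[i] for i in idx) - expected)
        if gap >= observed_gap - 1e-9:  # small tolerance for float comparison
            as_extreme += 1
    return as_extreme / total


group_a = [12, 15, 18, 24, 30]  # hypothetical DV scores
group_b = [10, 11, 14, 16, 17]
print(round(exact_rank_sum_p(group_a, group_b), 4))
```

With 5 people per group there are only 252 possible assignments, which is why this approach is practical for small samples.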
what is the Monte Carlo method
- creates lots of data sets that contain the same scores as the sample
- but assigns the scores to groups randomly
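A minimal Python sketch of the Monte Carlo idea, assuming the same kind of two-group data: instead of listing every possible assignment, the ranks are shuffled many times and the p-value is the share of shuffles that give a rank-sum difference at least as large as the observed one. The data, function names, number of shuffles and seed are all arbitrary choices for illustration.

```python
import random


def average_ranks(values):
    """Rank values from small to large; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def monte_carlo_rank_sum_p(group_a, group_b, n_resamples=10000, seed=1):
    """Approximate two-sided p-value from random reassignment of groups."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    combined = list(group_a) + list(group_b)
    ranks = average_ranks(combined)
    n_a = len(group_a)
    expected = n_a * (len(combined) + 1) / 2
    observed_gap = abs(sum(ranks[:n_a]) - expected)
    shuffled = ranks[:]
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(shuffled)  # random assignment of ranks to groups
        gap = abs(sum(shuffled[:n_a]) - expected)
        if gap >= observed_gap - 1e-9:
            hits += 1
    return hits / n_resamples


group_a = [12, 15, 18, 24, 30]  # hypothetical DV scores
group_b = [10, 11, 14, 16, 17]
print(monte_carlo_rank_sum_p(group_a, group_b))
```

The result is an estimate, so it changes a little with the seed and gets closer to the exact p-value as the number of shuffles grows.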
what is the normal approximation with continuity correction
- assumes that the sampling distribution of the W statistic is normal
- produces a standard error
- can be used to calculate z and then a p-value
- applies a continuity correction
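The calculation can be sketched in Python as follows. This assumes no tied ranks (software usually adjusts the standard error when there are ties), treats W as the rank sum minus the mean rank for one group, and returns an unsigned z; the function name and the example numbers are hypothetical.

```python
import math
from statistics import NormalDist


def normal_approx_p(w, n_a, n_b):
    """Return (z, two-sided p) for W using a normal approximation."""
    mean_w = n_a * n_b / 2  # expected W under the null
    se_w = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)  # standard error, no ties
    gap = abs(w - mean_w)
    # Continuity correction: move the gap 0.5 closer to the mean.
    z = max(gap - 0.5, 0.0) / se_w
    p = 2 * (1 - NormalDist().cdf(z))
    return z, p


z, p = normal_approx_p(w=20, n_a=5, n_b=5)  # hypothetical W and group sizes
print(round(z, 3), round(p, 3))
```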
which is the default method
- Sample size < 50 in all groups and no tied ranks = exact method
- Sample size ≥ 50 in any group OR tied ranks = normal approximation with continuity correction
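The rule on this card can be written as a small Python check. It only restates the card, with a hypothetical function name, and is not taken from any package's source code.

```python
def default_method(group_a, group_b):
    """Apply the rule on the card: exact only for small groups without ties."""
    combined = list(group_a) + list(group_b)
    has_ties = len(set(combined)) < len(combined)  # any repeated score gives tied ranks
    all_small = len(group_a) < 50 and len(group_b) < 50
    if all_small and not has_ties:
        return "exact"
    return "normal approximation with continuity correction"


print(default_method([1, 2, 3], [4, 5, 6]))  # exact
print(default_method([1, 2, 3], [3, 5, 6]))  # normal approximation with continuity correction
```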
why is my W different when calculated manually
- If you calculate W manually, W is the smaller of the two groups' values of (sum of ranks - mean rank)
- In R, W is reported for the first factor level
- Doesn’t affect significance, so not something you need to worry about
- You can just report what R outputs
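The two possible W values are linked: they always add up to the product of the two group sizes. A short Python sketch with hypothetical numbers and made-up function names shows how the value reported for one group converts to the other.

```python
def other_group_w(w_reported, n_a, n_b):
    """W for the other group, given the W reported for one group."""
    return n_a * n_b - w_reported


def smallest_w(w_reported, n_a, n_b):
    """The W you would get from the manual 'smallest value' rule."""
    return min(w_reported, other_group_w(w_reported, n_a, n_b))


# Hypothetical example: software reports W = 20 for the first group, 5 people per group.
print(other_group_w(20, 5, 5))  # 5
print(smallest_w(20, 5, 5))     # 5
```

Both values lead to the same p-value, which is why the choice does not affect significance.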
what is the normality assumption in designs with two repeated measures
- the difference between timepoint 1 and timepoint 2 should be normally distributed
- calculate a difference score for each participant and assess normality of this difference score
what is the Wilcoxon signed-rank test
- Alternative to the related samples t-test
- Appropriate if you have a design with only two repeated measures (all participants contribute data at both timepoints)
what is the theory behind the Wilcoxon signed-rank test
- calculate the difference between the conditions
- note the sign of the difference
- rank the differences by size, ignoring the sign
- add up the positive and negative ranks separately
does the V differ depending on how I enter the variables in the code
- V is always the sum of the positive ranks, but whether a rank is positive or negative depends on the order in which you enter the variables into the function
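The signed-rank steps, and the effect of swapping the order of the variables, can be sketched in plain Python. The scores and function names are hypothetical, and dropping zero differences before ranking is an assumption of this sketch (it is a common convention).

```python
def average_ranks(values):
    """Rank values from small to large; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def signed_rank_sums(first, second):
    """Return (sum of positive ranks, sum of negative ranks) for first - second."""
    diffs = [a - b for a, b in zip(first, second)]
    diffs = [d for d in diffs if d != 0]  # assumption: zero differences are dropped
    ranks = average_ranks([abs(d) for d in diffs])  # rank sizes, ignoring sign
    positive = sum(r for d, r in zip(diffs, ranks) if d > 0)
    negative = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return positive, negative


time1 = [10, 12, 9, 15, 11, 14]  # hypothetical scores at timepoint 1
time2 = [13, 11, 14, 15, 15, 20]  # hypothetical scores at timepoint 2
print(signed_rank_sums(time1, time2))  # (1.0, 14.0)
print(signed_rank_sums(time2, time1))  # (14.0, 1.0)
```

Swapping the variables flips every sign, so the positive and negative rank sums trade places and V (the positive sum) changes from 1 to 14.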