W9 non-parametric approaches Flashcards
1
Q
Parametric v non-parametric approaches
A
- parametric tests make assumptions about the distribution of the population from which the data were randomly sampled
- the tests we’ve looked at so far have all assumed that the data are normally distributed
- > e.g., the t-test assumes that the sampling error is normally distributed around the population mean (μ)
- non-parametric tests do not make a priori assumptions about the specific shape of the distribution, which is why they’re also known as distribution-free tests
2
Q
Advantages of non-parametric tests
A
- they do not require assumptions of normality and homogeneity of variances (severely skewed data can be analysed with nonparametric statistics)
- ideal for analysing data from small samples (small samples are often skewed and can’t be rescued by the central limit theorem)
- generally easier to calculate (they require less computation)
- use of ranks reduces effect of extreme outliers
3
Q
Ranking in non-parametric approaches
A
- ranking merely involves the ordering of a set of scores from the smallest to the largest
- the smallest is given the rank of 1, the second smallest is 2, the 50th is 50
- provides a standard distribution of scores with standard characteristics (e.g., with no ties, n ranks are always the integers 1 to n, so their mean is always (n + 1)/2)
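The ranking step above can be sketched in Python. This is a minimal illustration, not a prescribed method: the function name `rank`, the example scores, and the choice to give tied scores the average of their positions (a common convention that the card does not cover) are all assumptions.

```python
def rank(scores):
    """Return 1-based ranks; tied scores share the average of their positions.

    Hypothetical helper: the tie rule is an assumption, not from the card.
    """
    # Indices of the scores, ordered from smallest to largest score
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with the score at position i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


scores = [12, 7, 300, 7, 15]   # made-up data with one extreme score and one tie
ranks = rank(scores)
print(ranks)                   # [3.0, 1.5, 5.0, 1.5, 4.0]
print(sum(ranks) / len(ranks)) # 3.0, i.e. (n + 1)/2 for n = 5
```

Note that the extreme score of 300 simply becomes rank 5, so it sits only one step above the next score.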
4
Q
Spearman’s Rho
A
- Spearman’s rho (rS) is calculated using Pearson’s r formula; the difference is that the data are ranked first
- it can be used when you have two continuous variables, but one (or both) is badly skewed due to extreme scores
- this is handy if:
- > the data naturally fall in ranks
- > there are extreme scores in your sample
- > there is a monotonic (but not necessarily linear) relationship between the variables
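As a rough sketch of the idea on this card, Spearman's rho can be computed by ranking each variable and then applying Pearson's r to the ranks. The helper names (`pearson`, `rank`, `spearman`) and the data are made up for illustration, and ties are given average ranks as an assumption.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's r from sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def rank(scores):
    """1-based ranks; tied scores share the average rank (an assumption)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the ranked data."""
    return pearson(rank(x), rank(y))


# Made-up data: y always increases with x (monotonic), with one extreme score
x = [1, 2, 3, 4, 5]
y = [2, 4, 8, 16, 1000]

r = pearson(x, y)     # pulled well below 1 by the extreme score
rs = spearman(x, y)   # 1.0, because the rank orders agree perfectly
print(r, rs)
```

Here the relationship is perfectly monotonic but far from linear, so rho reaches 1 while Pearson's r on the raw scores does not.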
5
Q
The null hypothesis
A
- the goal of any non-parametric test is to establish overall differences between two (or possibly more) distributions, not to identify the differences between any particular parameters
- as a result, H0 is more general
- > samples come from identical populations, not just populations with the same mean
- > rejecting H0 means that populations differ (perhaps not just on the basis of their central tendency - i.e., the mean)
6
Q
Point biserial correlation
A
- Pearson’s r is appropriate for describing linear relationships between two continuous variables
- if one variable is genuinely dichotomous, score one level of that variable as 0 and the other as 1 (or 1 and 2, or any two numbers)
- > compute correlation using Pearson’s r formula
- > referred to as the point-biserial correlation (rpb)
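A minimal sketch of the computation described above, using made-up scores for two groups of four. The `pearson` helper and all variable names are hypothetical. It codes the dichotomous variable two ways (0/1 and 1/2) to show that the choice of numbers does not change the result.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's r from sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up continuous scores: first four cases in one group, last four in the other
score = [3, 5, 4, 6, 7, 9, 8, 10]

coded_01 = [0, 0, 0, 0, 1, 1, 1, 1]   # levels scored as 0 and 1
coded_12 = [1, 1, 1, 1, 2, 2, 2, 2]   # same levels scored as 1 and 2

r_pb_01 = pearson(coded_01, score)
r_pb_12 = pearson(coded_12, score)
print(r_pb_01, r_pb_12)   # both codings give the same point-biserial correlation
```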
7
Q
Interpretation of point-biserial correlation
A
- absolute value of rpb reflects strength of the relationship
- but the sign of the correlation depends on the scoring (i.e., which level is given the higher number)
- rpb² is interpreted the same way as r² (the proportion of variance accounted for)
- test for significance in same way as for Pearson’s r
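To illustrate the point about sign, the sketch below swaps which level gets the higher code. The data and names are made up for illustration. Under that assumption, the correlation changes sign while its absolute value and its square stay the same.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's r from sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


score = [3, 5, 4, 6, 7, 9, 8, 10]        # made-up continuous scores
group = [0, 0, 0, 0, 1, 1, 1, 1]         # original scoring of the two levels
swapped = [1 - g for g in group]         # same levels, codes reversed

r_pb = pearson(group, score)
r_pb_swapped = pearson(swapped, score)

print(r_pb, r_pb_swapped)                # same size, opposite sign
print(r_pb ** 2, r_pb_swapped ** 2)      # squared values agree
```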
8
Q
rpb and the t-test
A
- alternatively we could examine the relationship between a dichotomous and a continuous variable using an independent groups t-test
- the result of the significance test for rpb and of the t-test will be identical (for the standard pooled-variance t-test, which assumes equal variances)
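The equivalence can be checked numerically. The sketch below uses made-up data and hypothetical helper names. It assumes the pooled-variance (equal variances) independent groups t-test and the usual conversion t = r·√(n − 2) / √(1 − r²) with n − 2 degrees of freedom.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's r from sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pooled_t(g0, g1):
    """Independent groups t statistic using the pooled variance estimate."""
    n0, n1 = len(g0), len(g1)
    m0, m1 = sum(g0) / n0, sum(g1) / n1
    ss0 = sum((v - m0) ** 2 for v in g0)
    ss1 = sum((v - m1) ** 2 for v in g1)
    pooled_var = (ss0 + ss1) / (n0 + n1 - 2)
    return (m1 - m0) / sqrt(pooled_var * (1 / n0 + 1 / n1))


# Made-up data: two groups of four scores
group0 = [3, 5, 4, 6]
group1 = [7, 9, 8, 10]

# Route 1: point-biserial correlation, converted to t with n - 2 df
codes = [0] * len(group0) + [1] * len(group1)
scores = group0 + group1
n = len(scores)
r_pb = pearson(codes, scores)
t_from_r = r_pb * sqrt(n - 2) / sqrt(1 - r_pb ** 2)

# Route 2: independent groups t-test on the same data
t_direct = pooled_t(group0, group1)

print(t_from_r, t_direct)   # the two t statistics agree
```

Because both routes give the same t on the same degrees of freedom, they lead to the same p-value.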