Normality, Bias & Transformations Flashcards

1
Q

Assumptions of parametric tests

A

Parametric tests based on the normal distribution assume: additivity & linearity, normality, homogeneity of variance, and independence

2
Q

Normal distribution

A

The assumption is relevant to estimating parameters, to confidence intervals around a parameter, and to null hypothesis significance testing

This assumption tends to get incorrectly translated as ‘your data need to be normally distributed’, but that is not the full story: strictly, it is the sampling distribution of the estimates (and the model’s errors) that needs to be normal

3
Q

When does the assumption of normality really matter?

A

In small samples; the central limit theorem allows us to forget about this assumption in larger samples

As long as the sample is fairly large, outliers are a greater concern than normality

Once you get about 50 participants, the sampling distribution looks normally distributed even if the data themselves are not (a minimal simulation of this follows)
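
A minimal simulation sketch of this point in Python (NumPy/SciPy; the exponential population, seed and sample sizes are illustrative assumptions, not from the source):

import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# The population is heavily right-skewed (exponential), yet the sampling
# distribution of the mean looks increasingly normal as n grows.
for n in (5, 50):
    means = rng.exponential(scale=1.0, size=(2000, n)).mean(axis=1)
    print(f"n = {n:2d}: skew of 2,000 sample means = {stats.skew(means):.2f}")

The skew of the sample means shrinks towards 0 by the time n reaches about 50: the central limit theorem at work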

4
Q

Parametric testing & outliers

A

Most parametric testing (estimating parameters) is based on statistics like means & standard deviations, which means the results are heavily biased by outliers

Exploratory data analysis (EDA) takes account of outliers by using robust methods; it also emphasises visualising & studying the data on their own terms to see what’s actually going on (a quick illustration follows)
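
A quick illustration in Python of how a single outlier biases the mean and SD while robust statistics are barely affected (the scores are a hypothetical toy dataset):

import numpy as np
from scipy import stats

scores = np.array([4, 5, 5, 6, 6, 7, 7, 8, 95.0])   # one extreme outlier

print(np.mean(scores))               # ~15.9: dragged upwards by the outlier
print(np.std(scores, ddof=1))        # ~29.7: hugely inflated
print(np.median(scores))             # 6.0: barely affected (robust)
print(stats.trim_mean(scores, 0.2))  # ~6.3: 20% trimmed mean, also robust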

5
Q

Homogeneity of variance/homoscedasticity

A

When testing several groups of participants, the samples should come from populations with the same variance

Violations can affect parameter estimates & null hypothesis significance testing

In correlational designs, the variance of the outcome variable should be stable at all levels of the predictor variable

6
Q

Assessing homoscedasticity / homogeneity of variance

A

1) Levene’s test = tests whether the variances in different groups are the same (a significant result means the variances are not equal)
2) Variance ratio (Hartley’s Fmax) = for 2 or more groups, VR = largest variance / smallest variance; if VR < 2, homogeneity can be assumed
3) Graphs (see the sketch of checks 1 and 2 below)
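
A sketch of checks 1) and 2) in Python with SciPy (the three groups are randomly generated assumptions for illustration):

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
g1 = rng.normal(10, 2, size=40)
g2 = rng.normal(10, 2, size=40)
g3 = rng.normal(10, 3, size=40)

# 1) Levene's test: a significant result (p < .05) means variances are NOT equal
W, p = stats.levene(g1, g2, g3)
print(f"Levene W = {W:.2f}, p = {p:.3f}")

# 2) Variance ratio = largest / smallest sample variance; < 2 => assume homogeneity
variances = [np.var(g, ddof=1) for g in (g1, g2, g3)]
print(f"VR = {max(variances) / min(variances):.2f}")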

7
Q

Ways of spotting normality

A

1) Kolmogorov-Smirnov / Shapiro-Wilk tests = test whether the data differ from a normal distribution (a significant result means non-normal data)
2) Graphical displays = P-P plot (normal if points fall close to the diagonal line), histogram, stem & leaf plot
3) Values of skew & kurtosis = both will be 0 in a normal distribution

You don’t need to worry about these checks in large samples (hundreds of cases): by the central limit theorem the analyses are more robust to violations of normality. A sketch of checks 1) and 3) follows
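
A sketch of checks 1) and 3) in Python with SciPy (the scores are simulated as an illustrative assumption):

import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
data = rng.normal(loc=50, scale=10, size=100)   # hypothetical scores

# 1) Shapiro-Wilk: a significant result (p < .05) suggests non-normal data
W, p = stats.shapiro(data)
print(f"Shapiro-Wilk W = {W:.3f}, p = {p:.3f}")

# Kolmogorov-Smirnov against a normal with the sample's own mean and SD
# (estimating those parameters from the data makes the p-value approximate)
D, p_ks = stats.kstest(data, 'norm', args=(data.mean(), data.std(ddof=1)))
print(f"K-S D = {D:.3f}, p = {p_ks:.3f}")

# 3) Skew and (excess) kurtosis: both roughly 0 for normal data
print(f"skew = {stats.skew(data):.2f}, kurtosis = {stats.kurtosis(data):.2f}")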

8
Q

Kurtosis

A

A measure of the heaviness of the tails of a distribution: positive (excess) kurtosis means heavy tails and a pointier peak, negative kurtosis means light tails and a flatter shape

You want your skewness/kurtosis scores to fall between -1 and 1
A score of 0 suggests the data are normally distributed (see the sketch below)
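
A small sketch (Python/SciPy; the simulated distributions are illustrative assumptions) of how excess kurtosis behaves for different tail shapes:

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# scipy's kurtosis() reports excess kurtosis, so a normal distribution scores ~0
print(stats.kurtosis(rng.normal(size=10_000)))            # ~ 0: normal
print(stats.kurtosis(rng.standard_t(df=5, size=10_000)))  # > 0: heavy tails
print(stats.kurtosis(rng.uniform(size=10_000)))           # ~ -1.2: light tails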

9
Q

Transforming data

A

1) Log transformation = reduces positive skew; typically the natural log (ln), so scores must be positive (add a constant first if you have zeros)
2) Square root transformation = reduces positive skew; useful for stabilising variance
3) Reciprocal transformation = dividing 1 by each score reduces the impact of large scores, but it reverses the ordering of the scores; you can avoid this by reversing the scores before transforming

It is worth trying a few different transformations and choosing the one that looks best
Make sure the transformation is worthwhile (a sketch of all three follows)
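
A sketch of all three transformations in Python (the skewed scores are hypothetical; the +1 constants are assumptions to avoid log(0) and division by zero):

import numpy as np
from scipy import stats

scores = np.array([1, 2, 2, 3, 3, 4, 5, 8, 15, 40.0])   # positively skewed

log_scores   = np.log(scores + 1)   # 1) log transform (natural log)
sqrt_scores  = np.sqrt(scores)      # 2) square root transform
recip_scores = 1 / (scores + 1)     # 3) reciprocal: note it reverses the ordering

# Compare skew before and after to check the transformation was worthwhile
for name, x in [("raw", scores), ("log", log_scores), ("sqrt", sqrt_scores)]:
    print(f"{name:4s} skew = {stats.skew(x):.2f}")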

10
Q

Cautions against transforming

A

It can hinder the accuracy of the F-statistic
Transforming the data changes the hypothesis being tested (e.g. after a log transform you compare geometric, not arithmetic, means)
In small samples it is difficult to determine normality one way or the other
The consequences for the statistical model of applying the ‘wrong’ transformation could be worse than the consequences of analysing the untransformed scores
