Normality, Bias & Transformations Flashcards
Assumptions of parametric tests
Parametric tests based on the normal distribution assume: additivity & linearity, normality, homogeneity of variance & independence
Normal distribution
Relevant to parameter estimates, confidence intervals around a parameter & null hypothesis significance testing
This assumption tends to get incorrectly translated as 'your data need to be normally distributed', but that isn't the full story
When does the assumption of normality really matter
Small samples (the central limit theorem lets us stop worrying about this assumption in larger samples)
As long as the sample is fairly large, outliers are a greater concern than normality
Once you have around 50 participants, the sampling distribution tends to look normal even if the raw data aren't
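A minimal Python sketch of the central limit theorem point above (the exponential population and sample size of 50 are illustrative assumptions, not from the notes): even when raw scores are heavily skewed, the means of repeated samples pile up in a roughly normal shape.

```python
import numpy as np

rng = np.random.default_rng(42)

# Draw many samples of n = 50 from a heavily skewed (exponential) population
sample_means = [rng.exponential(scale=2.0, size=50).mean() for _ in range(10_000)]

# The raw scores are skewed, but the distribution of the sample means is
# close to normal, which is why normality matters less in larger samples
print(f"mean of sample means: {np.mean(sample_means):.2f}")  # close to the population mean of 2.0
print(f"SD of sample means:   {np.std(sample_means):.2f}")   # close to 2 / sqrt(50), about 0.28
```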
Parametric testing & outliers
Most parametric tests estimate parameters using statistics like the mean & SD, which means they are heavily biased by outliers
EDA takes account of outliers by using robust methods (see the sketch below), and emphasises visualising & studying the data on its own terms to see what's actually going on
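A minimal sketch of the robust-methods idea (hypothetical scores, assuming NumPy and SciPy are available): the median and trimmed mean barely move when one outlier drags the ordinary mean upwards.

```python
import numpy as np
from scipy import stats

# Hypothetical reaction-time scores with one extreme outlier
scores = np.array([320, 335, 340, 355, 360, 365, 370, 2400])

print(f"mean:             {scores.mean():.1f}")                 # pulled up by the outlier
print(f"median:           {np.median(scores):.1f}")             # barely affected
print(f"20% trimmed mean: {stats.trim_mean(scores, 0.2):.1f}")  # drops the extreme 20% from each tail
```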
Homogeneity of variance/homoscedasticity
When testing several groups of participants, the samples should come from populations with the same variance
Can affect parameters & null hypothesis significance testing
In correlational designs, the variance of the outcome variable should be stable at all levels of the predictor variable
Assessing homoscedasticity/ homogeneity of variance
1) Levene's test = tests whether the variances in the different groups are equal (a significant result means the variances are not equal)
2) Variance ratio = for 2 or more groups, VR = largest variance / smallest variance; if VR < 2, homogeneity can be assumed
3) Graphs (a sketch of checks 1 & 2 follows below)
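A minimal sketch of checks 1 & 2 (hypothetical group scores; assumes NumPy and SciPy):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical scores for three groups of participants
group_a = rng.normal(50, 10, size=30)
group_b = rng.normal(52, 12, size=30)
group_c = rng.normal(55, 11, size=30)

# 1) Levene's test: a significant p (< .05) suggests the variances differ
stat, p = stats.levene(group_a, group_b, group_c, center="median")
print(f"Levene's test: W = {stat:.2f}, p = {p:.3f}")

# 2) Variance ratio: largest variance / smallest variance; < 2 -> assume homogeneity
variances = [np.var(g, ddof=1) for g in (group_a, group_b, group_c)]
print(f"Variance ratio = {max(variances) / min(variances):.2f}")
```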
Ways of spotting normality
1) Kolmogorov-Smirnov / Shapiro-Wilk tests = test whether the data differ from a normal distribution (a significant result means non-normal data); see the sketch after this list
2) Graphical displays = P-P plot (normal if the points fall close to the diagonal line), histogram, stem & leaf plot
3) Values of skew/kurtosis = both will be 0 in a normal distribution
Don't need to worry about these checks in large samples (hundreds) because, thanks to the central limit theorem, analyses are more robust to violations of normality
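A minimal sketch of check 1 (hypothetical test scores, assuming SciPy; the P-P plot pointer to statsmodels is included only as a comment):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
data = rng.normal(loc=100, scale=15, size=80)   # hypothetical test scores

# 1) Shapiro-Wilk: a significant p (< .05) suggests the data deviate from normality
w, p = stats.shapiro(data)
print(f"Shapiro-Wilk: W = {w:.3f}, p = {p:.3f}")

# Kolmogorov-Smirnov against a normal distribution fitted to the data
# (estimating the parameters from the data makes this p-value only approximate)
d, p_ks = stats.kstest(data, "norm", args=(data.mean(), data.std(ddof=1)))
print(f"K-S: D = {d:.3f}, p = {p_ks:.3f}")

# 2) A P-P plot can be drawn with statsmodels, e.g.
#    statsmodels.graphics.gofplots.ProbPlot(data).ppplot(line="45")
```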
Kurtosis
A measure of how heavy the tails of the distribution are (how much scores cluster in the tails relative to a normal distribution)
You want your skewness/kurtosis values to lie between -1 and 1
A value of 0 suggests the data are normally distributed
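A minimal sketch (hypothetical skewed scores, assuming SciPy) of computing skew and kurtosis and applying the rule of thumb from this card:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
data = rng.exponential(scale=2.0, size=200)   # hypothetical positively skewed scores

skew = stats.skew(data)
kurt = stats.kurtosis(data)   # Fisher's definition: 0 for a normal distribution

print(f"skew = {skew:.2f}, kurtosis = {kurt:.2f}")
print("roughly normal" if abs(skew) < 1 and abs(kurt) < 1 else "deviates from normal")
```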
Transforming data
1) Log transformation = reduces positive skew; usually the natural log (ln), so scores must be greater than 0
2) Square-root transformation = reduces positive skew; useful for stabilising variance
3) Reciprocal transformation = dividing 1 by each score (1/x) reduces the impact of large scores, but it also reverses the order of the scores; this can be avoided by reverse-scoring before transforming
Worth trying a few different transformations and choosing the one that looks best (see the sketch below)
Make sure the transformation is worthwhile
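A minimal sketch (hypothetical positively skewed scores, assuming NumPy and SciPy) comparing how the three transformations change skew, so the best-looking one can be chosen:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
scores = rng.exponential(scale=3.0, size=100) + 1   # hypothetical positively skewed scores (> 0)

transformed = {
    "raw": scores,
    "log (ln)": np.log(scores),      # 1) log transformation (scores must be > 0)
    "square root": np.sqrt(scores),  # 2) square-root transformation (scores must be >= 0)
    "reciprocal": 1 / scores,        # 3) reciprocal transformation (reverses score order)
}

for name, x in transformed.items():
    print(f"{name:12s} skew = {stats.skew(x):.2f}")
```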
Cautions against transforming
Transformation can hinder the accuracy of the F statistic
Transforming the data changes the hypothesis being tested
In small samples it is difficult to determine normality one way or the other
The consequences for the statistical model of applying the 'wrong' transformation could be worse than the consequences of analysing the untransformed scores