Week 3 & 4 Flashcards
Parametric
assess group means
assume a normal distribution
can deal with unequal variances across groups
generally more powerful
non-parametric
assess group medians
don’t require a normal distribution
can handle small sample size
parametric test assumptions
additivity and linearity
normality
homogeneity of variance
independence of observations
additivity and linearity
outcome is a linear function of the predictors X1 and X2, and the predictors are added together
outcome y is an additive combination of the effects of X1 and X2
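Written as an equation, the additive and linear model on this card might look like the line below. The coefficient names b_0, b_1, b_2 and the error term e are notation assumed here for illustration; they do not appear in the notes.

```latex
y = b_0 + b_1 X_1 + b_2 X_2 + e
```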
Assessing linearity
observed vs predicted values (symmetrically distributed around diagonal line)
residuals vs predicted values (symmetrically distributed around horizontal line)
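A minimal sketch of the residuals vs predicted check, using made-up curved data (y = x squared) and a hand-computed one-predictor least squares fit. The variable names are illustrative only. When linearity is violated, the residuals show a systematic pattern instead of scattering evenly around zero.

```python
# Fit a straight line to curved data and inspect the residuals.
# The data (y = x squared) are made up for illustration.
x = list(range(11))
y = [v ** 2 for v in x]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Ordinary least squares slope and intercept for one predictor.
slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
    (a - mean_x) ** 2 for a in x
)
intercept = mean_y - slope * mean_x

predicted = [intercept + slope * a for a in x]
residuals = [b - p for b, p in zip(y, predicted)]

# Under linearity, residuals scatter evenly around zero.
# Here they are positive at both ends and negative in the middle.
for p, r in zip(predicted, residuals):
    print(f"predicted={p:7.2f}  residual={r:7.2f}")
```

The U-shaped run of residuals (positive, then negative, then positive) is the kind of pattern that signals non-linearity on the plot.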
fixing non-linearity
apply a nonlinear transformation to the variables
add another regressor that is a nonlinear function of a predictor (e.g. a polynomial term)
examine moderators
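A small sketch of the transformation option, assuming a made-up exponential relationship between x and y. The helper pearson_r is hypothetical and is used only as a rough index of how linear the relationship is before and after a log transform.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation, used here as a rough index of linearity."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up exponential relationship between x and y.
x = list(range(1, 21))
y = [2.0 * math.exp(0.3 * v) for v in x]

r_raw = pearson_r(x, y)                         # curved relationship
r_log = pearson_r(x, [math.log(v) for v in y])  # linear after log transform

print(f"r with raw y:  {r_raw:.3f}")
print(f"r with log(y): {r_log:.3f}")
```

Which transformation suits a given variable depends on the shape of the relationship; the log is only one example.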
central limit theorem
as the sample size increases towards infinity, the sampling distribution of the mean (NOT the data themselves) approaches a normal distribution
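A rough simulation of this idea, assuming a skewed (exponential) population chosen purely for illustration. The helpers skewness and sample_means are hypothetical names. The raw data stay skewed, but the distribution of sample means gets more symmetric as the sample size grows.

```python
import random
import statistics

def skewness(values):
    """Population skewness: the mean cubed z-score."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

def sample_means(n, reps, rng):
    """Means of `reps` samples of size `n` drawn from an exponential population."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

rng = random.Random(42)  # fixed seed so the run is repeatable
results = {}
for n in (2, 50):
    results[n] = skewness(sample_means(n, 2000, rng))
    print(f"n={n:2d}  skewness of the sample means = {results[n]:.2f}")
```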
skewness
how asymmetrical the distribution of the data is
positive: scores bunched at low values, tail pointing to high values
negative: scores bunched at high values, tail pointing to low values
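A quick sketch of how the sign of skewness lines up with these descriptions. The formula used (mean cubed z-score) is one common definition, and the three score lists are made up for illustration.

```python
import statistics

def skewness(values):
    """Population skewness: the mean cubed z-score."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

positive_skew = [1, 1, 2, 2, 2, 3, 3, 4, 9]  # bunched low, tail towards high values
negative_skew = [1, 6, 7, 7, 8, 8, 8, 9, 9]  # bunched high, tail towards low values
symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]

for name, data in [
    ("positive", positive_skew),
    ("negative", negative_skew),
    ("symmetric", symmetric),
]:
    print(f"{name:9s} skewness = {skewness(data):+.2f}")
```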
kurtosis
how much the data clusters either at the tails/ends or peak of the distribution
leptokurtic
heavy tails
platykurtic
light tails
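A minimal sketch of excess kurtosis (mean fourth-power z-score minus 3, so a normal distribution scores 0). Positive values correspond to leptokurtic data and negative values to platykurtic data. The function name and both score lists are made up for illustration.

```python
import statistics

def excess_kurtosis(values):
    """Mean fourth-power z-score minus 3 (a normal distribution scores 0)."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 4 for v in values) / len(values) - 3

heavy_tails = [-10, -1, -1, 0, 0, 0, 0, 1, 1, 10]  # a few extreme scores
light_tails = [-2, -2, -1, -1, 0, 0, 1, 1, 2, 2]   # flat, no extreme scores

print(f"heavy tails: {excess_kurtosis(heavy_tails):+.2f}")  # leptokurtic
print(f"light tails: {excess_kurtosis(light_tails):+.2f}")  # platykurtic
```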
normality checks
Q-Q plot: compares sample quantiles to the quantiles of a normal distribution; normal = points form a straight line
Shapiro-Wilk test: tests whether the data differ from a normal distribution; normal = p > .05 (the data do not differ significantly from a normal distribution)
histogram
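A sketch of the numbers behind a Q-Q plot, using only the standard library. The helper qq_correlation is hypothetical, and the plotting positions (i - 0.5) / n are one common convention assumed here. The closer the points are to a straight line, the closer the correlation is to 1.

```python
import math
import random
from statistics import NormalDist, fmean

def qq_correlation(sample):
    """Correlation between sorted sample values and matching normal quantiles."""
    n = len(sample)
    observed = sorted(sample)
    # Plotting positions (i - 0.5) / n are an assumed convention.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    mo, mt = fmean(observed), fmean(theoretical)
    sxy = sum((o - mo) * (t - mt) for o, t in zip(observed, theoretical))
    sxx = sum((o - mo) ** 2 for o in observed)
    syy = sum((t - mt) ** 2 for t in theoretical)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(1)  # fixed seed so the run is repeatable
normal_sample = [rng.gauss(50, 10) for _ in range(200)]
skewed_sample = [rng.expovariate(1.0) for _ in range(200)]

r_normal = qq_correlation(normal_sample)
r_skewed = qq_correlation(skewed_sample)
print(f"normal sample: r = {r_normal:.3f}")
print(f"skewed sample: r = {r_skewed:.3f}")
```

For the Shapiro-Wilk test itself, SciPy provides scipy.stats.shapiro, which returns the test statistic and the p value.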
homogeneity of variance
all groups or data points have same or similar variance
equal spread above and below the horizontal line across all predicted values on a residuals vs predicted plot = homoscedasticity
heteroscedasticity shows up as a cone (funnel) shape
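A simple numeric companion to the plot: compare the variance of each group directly. The group names and scores are made up, and the max/min variance ratio is only an informal summary, not a formal test.

```python
import statistics

# Made-up scores for three groups.
groups = {
    "control": [12, 14, 15, 15, 16, 18],
    "treatment_a": [11, 13, 15, 16, 17, 19],
    "treatment_b": [2, 9, 14, 17, 22, 30],
}

# Sample variance per group.
variances = {name: statistics.variance(scores) for name, scores in groups.items()}
for name, var in variances.items():
    print(f"{name:12s} variance = {var:6.2f}")

# A ratio far above 1 suggests the groups do not share a similar variance.
ratio = max(variances.values()) / min(variances.values())
print(f"largest / smallest variance = {ratio:.1f}")
```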
Independence
residuals are unrelated to each other
if non-independent: downwardly biased SE (too small) and incorrect statistical inference (p values < .05 when they should be > .05, i.e. an inflated Type I error rate)
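One common way to check whether neighbouring residuals are related is the Durbin-Watson statistic. It is not part of these notes and is brought in here only as an illustration. Values near 2 suggest no first-order autocorrelation, and values well below 2 suggest positive autocorrelation. The residual lists are made up.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    num = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    den = sum(r ** 2 for r in residuals)
    return num / den

clustered = [1, 1, 1, -1, -1, -1]    # neighbours resemble each other
alternating = [1, -1, 1, -1, 1, -1]  # neighbours oppose each other

print(f"clustered residuals:   DW = {durbin_watson(clustered):.2f}")
print(f"alternating residuals: DW = {durbin_watson(alternating):.2f}")
```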
Univariate outlier
outlier when considering only the distribution of the variable it belongs to
biases the mean and inflates the SD
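A tiny illustration with made-up scores: adding a single extreme value pulls the mean towards it and inflates the standard deviation.

```python
import statistics

scores = [10, 11, 12, 13, 14]
with_outlier = scores + [50]  # one extreme score added

for label, data in [("without outlier", scores), ("with outlier", with_outlier)]:
    print(
        f"{label:16s} mean = {statistics.mean(data):6.2f}  "
        f"SD = {statistics.stdev(data):6.2f}"
    )
```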