Test Assumptions Flashcards
ANOVA Assumptions
• Each group is approximately normal
o (look at histograms/normal quantile plots; ANOVA can handle some non-normality, but not severe outliers)
• Standard deviations of each group approximately equal
o (rule of thumb: ratio of largest to smallest sample standard deviation should be less than 2:1)
• Populations have the same variance
• Samples are independent
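The standard-deviation rule of thumb above can be checked in a few lines. This is a minimal sketch using only Python's standard library; the function name `sd_ratio` and the data are made up for illustration.

```python
from statistics import stdev

def sd_ratio(groups):
    """Return the largest sample SD divided by the smallest sample SD."""
    sds = [stdev(g) for g in groups]  # sample SD (n - 1 denominator)
    return max(sds) / min(sds)

# Hypothetical data: three treatment groups
groups = [
    [12.1, 13.4, 11.8, 12.9, 13.0],
    [14.2, 15.1, 13.7, 14.9, 15.5],
    [10.3, 11.9, 10.8, 12.4, 11.1],
]
ratio = sd_ratio(groups)
print(f"largest/smallest SD ratio = {ratio:.2f}")
print("rule of thumb satisfied" if ratio < 2 else "SDs differ too much")
```

A ratio below 2 only supports the equal-SD assumption; normality and independence still need their own checks.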
Simple/Multiple Linear Regression Assumptions
• Mean of distribution of error is 0
o Errors fall equally above and below the regression line, so they average out to 0
• Distribution of error has constant variance
o The variance of the response variable is the same regardless of the value of X. The spread of Y shouldn’t change depending on the value of X; it should stay roughly constant.
• Distribution of error is normal
o Noise follows a normal distribution
• Errors are independent
o For every observation (subject/sample in the study), the deviation from the regression line is independent from one subject to the next (the error in overestimating one person’s weight is independent of the error in estimating another person’s weight).
o Assumption not met: when measuring the same stock price from one day to the next, there will be temporal correlation. Likewise, BP measurements taken on the same subject on consecutive days are not independent. The assumption is most often suspect where time is involved.
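The error assumptions above are usually checked on the residuals of a fitted line. The sketch below fits a simple regression by least squares and computes the Durbin-Watson statistic as one possible check for time-ordered dependence; the helper names and the data are hypothetical, and only the standard library is used.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return ybar - b * xbar, b

def residuals(x, y):
    """Observed y minus fitted y for each observation."""
    a, b = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def durbin_watson(res):
    """Durbin-Watson statistic; values near 2 suggest no lag-1 autocorrelation."""
    num = sum((res[i] - res[i - 1]) ** 2 for i in range(1, len(res)))
    return num / sum(e ** 2 for e in res)

# Hypothetical data, listed in time order
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
res = residuals(x, y)
print(f"mean residual = {mean(res):.2e}")
print(f"Durbin-Watson = {durbin_watson(res):.2f}")
```

Least squares with an intercept makes the mean residual zero by construction, so constant variance and normality are better judged from a plot of residuals against X and a normal quantile plot.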
Unpaired t-test assumptions (independent samples)
o Equal variance (variance of two populations is equal)
• (Assume standard deviations of 2 samples are equal)
o Normality (underlying population is normally distributed)
o Samples are independent (drawn from 2 independent populations; no overlap between group members)
Pooled t-test assumptions
o Equal variance (Assume that the variance of the measure is the same in both groups)
• (Assume 2 different samples have the same SD)
o Normality (underlying population is normally distributed)
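The equal-variance assumption is what justifies combining the two sample variances into one pooled estimate. A minimal sketch of the pooled t statistic, using only the standard library; the function name `pooled_t` and the data are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Return (t statistic, degrees of freedom) for a pooled two-sample t-test."""
    n1, n2 = len(a), len(b)
    # Pooled variance: sample variances weighted by their degrees of freedom
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    se = sqrt(sp2 * (1 / n1 + 1 / n2))  # standard error of the mean difference
    return (mean(a) - mean(b)) / se, n1 + n2 - 2

# Hypothetical data: two independent groups
group_a = [5.1, 4.8, 5.6, 5.0, 5.3]
group_b = [4.2, 4.9, 4.4, 4.6, 4.1]
t, df = pooled_t(group_a, group_b)
print(f"t = {t:.3f} on {df} degrees of freedom")
```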
Paired samples t-test assumptions
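o Normality (the paired differences come from a normally distributed population)
o Equal variance of the two sets of measurements is not required, since the test is run on the within-pair differences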
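A paired test reduces to a one-sample t-test on the within-pair differences, which is why the differences are the quantity to inspect. A minimal sketch with a hypothetical function name and made-up data:

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for a paired t-test."""
    d = [a - b for a, b in zip(after, before)]  # within-pair differences
    n = len(d)
    return mean(d) / (stdev(d) / sqrt(n)), n - 1

# Hypothetical data: the same five subjects measured twice
before = [120, 132, 128, 141, 135]
after = [118, 127, 129, 136, 130]
t, df = paired_t(before, after)
print(f"t = {t:.3f} on {df} degrees of freedom")
```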
One-sample t-test assumptions
o Normality (underlying population is normally distributed)
Chi-square test assumptions
Both variables are categorical.
Observations are randomly sampled from the population(s).
Observations are independent.
Each observation is in exactly one category per variable (no overlapping categories).
Sample sizes are large, with an expected frequency (Fe) of at least 5 in every cell.
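The expected-frequency condition can be checked directly from a contingency table, since each expected count is (row total × column total) / grand total. A minimal sketch; the function name `expected_counts` and the table are made up for illustration.

```python
def expected_counts(table):
    """Expected cell counts under independence for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

# Hypothetical 2x2 table of observed counts
observed = [[20, 15], [30, 35]]
expected = expected_counts(observed)
all_large = all(e >= 5 for row in expected for e in row)
print(expected)
print("every expected count is at least 5" if all_large else "some expected counts are below 5")
```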
Logistic regression assumptions
binary dependent variable
little to no multicollinearity
independent observations
independent errors
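One common way to screen for multicollinearity is the variance inflation factor (VIF). With exactly two predictors it simplifies to 1 / (1 - r²), where r is their correlation. The sketch below covers only that two-predictor case; the function names and data are hypothetical, and the cutoff used to flag a problem (often quoted as 5 or 10) is a rule of thumb rather than part of the assumption itself.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF for either predictor when the model has exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1 / (1 - r ** 2)

# Hypothetical predictors: age and years of work experience
age = [25, 32, 41, 47, 53, 60]
experience = [2, 9, 15, 25, 27, 38]
print(f"r = {pearson_r(age, experience):.3f}")
print(f"VIF = {vif_two_predictors(age, experience):.1f}")
```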