5 - Bias Flashcards
What do outliers influence? (Bias Statistics)
The mean, and through it the parameter estimates of the overall linear model
What can occur if outliers are too influential? (Bias Statistics)
Large residuals: the model is pulled toward the outlier and fits the remaining cases worse
How can we detect outliers? (Bias Statistics)
- Box plots
- Histograms
How do you standardise the residuals? (Bias Statistics)
- Convert the residuals into z-scores
- About 95% should fall within ±2, and about 99% within ±2.5
- Anything beyond about ±3 is an extreme score
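The standardising step can be sketched in a few lines of Python. This is a minimal illustration, not a full regression diagnostic: the residuals are made-up numbers, the helper names `standardise` and `flag_extreme` are hypothetical, and the cutoff of 3 is an assumed rule of thumb.

```python
from statistics import mean, stdev

def standardise(values):
    """Convert raw values (here, residuals) into z-scores."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation
    return [(v - m) / s for v in values]

def flag_extreme(z_scores, cutoff=3.0):
    """Return the indices of z-scores whose absolute value exceeds the cutoff."""
    return [i for i, z in enumerate(z_scores) if abs(z) > cutoff]

# Hypothetical residuals: nineteen small ones and one very large one.
residuals = [0.5, -0.3, 0.1, -0.6, 0.4, -0.2, 0.3, -0.5, 0.2, -0.1,
             0.6, -0.4, 0.0, 0.1, -0.2, 0.3, -0.3, 0.2, -0.1, 12.0]

z = standardise(residuals)
extreme = flag_extreme(z)  # indices of cases worth inspecting
```

Only the last case is flagged here; every other residual sits well inside ±2.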
What does Cook’s distance measure? (Bias Statistics)
The influence a single case has on the model as a whole
What value is considered too influential for Cook’s distance? (Bias Statistics)
A value greater than 1 (Cook’s distance is never negative)
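As a rough sketch of how Cook's distance can be computed for a simple one-predictor regression, the code below uses the standard formula D = (e² / (p × MSE)) × (h / (1 − h)²), where e is the residual, h the leverage, and p the number of parameters. The data and the function name `cooks_distances` are hypothetical, chosen so that one case is clearly influential.

```python
from statistics import mean

def cooks_distances(x, y):
    """Cook's distance for every case in a simple linear regression of y on x."""
    n = len(x)
    p = 2  # parameters estimated: intercept and slope
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    distances = []
    for xi, e in zip(x, residuals):
        h = 1 / n + (xi - x_bar) ** 2 / sxx  # leverage of this case
        distances.append((e ** 2 / (p * mse)) * (h / (1 - h) ** 2))
    return distances

# Hypothetical data: nine cases on a near-perfect line, one far-off case.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.2, 7.9, 9.1, 5.0]

d = cooks_distances(x, y)
influential = [i for i, di in enumerate(d) if di > 1]  # cases above the cutoff
```

Every distance is zero or positive, and only the final case exceeds 1.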
What two factors make a linear model appropriate? (Bias Statistics)
- Linearity
- Additivity
What are the two types of spherical errors? (Bias Statistics)
- Homoscedastic
- Independent
What is the rule of a homoscedastic error? (Bias Statistics)
The spread (variance) of errors should be consistent at every level of the predictor
What is the rule of independence of errors? (Bias Statistics)
The error in prediction (residual) for one case should not be related to the error for a different case
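One common way to check this is the Durbin-Watson statistic, which these cards do not cover; it is brought in here only as an illustration. It compares each residual with the one before it: values near 2 suggest unrelated errors, values toward 0 suggest neighbouring errors move together, and values toward 4 suggest they alternate. The residual lists are made up.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals that drift steadily (neighbours are alike).
trending = [-3, -2, -1, 1, 2, 3]
# Hypothetical residuals that flip sign every case (neighbours are opposite).
alternating = [1, -1, 1, -1, 1, -1]

dw_trending = durbin_watson(trending)
dw_alternating = durbin_watson(alternating)
```

The trending residuals give a value well below 2 and the alternating ones a value well above 2, so both would fail the independence rule.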
What does homogeneous variance look like? (Bias Statistics)
Standard deviations (or variance) for each point are similar and a line fits with ease
What does heterogeneous variance look like? (Bias Statistics)
Standard deviations (or variances) for each point are drastically different
What does a normal residual plot (assumptions met) look like? (Bias Statistics)
- Points scatter around a straight horizontal line at zero
- Points are randomly plotted, with no pattern
What does non-linearity look like? (Bias Statistics)
Rainbow shape
What does heteroscedasticity look like? (Bias Statistics)
Triangle shape
What do heteroscedasticity and non-linearity together look like? (Bias Statistics)
U shape
What is normality important for and why? (Bias Statistics)
- Sampling distributions
- If the sampling distribution is not normal, confidence intervals (and significance tests) will be inaccurate
What does central limit theorem tell us? (Bias Statistics)
Sampling distributions tend to be normal if the sample size is big enough
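A small simulation can make this concrete. The sketch below draws samples from a strongly skewed population (an exponential distribution, chosen purely for illustration) and compares the skewness of the sample means for a tiny and a larger sample size. The seed, the sample sizes, and the helper names are all assumptions.

```python
import random
from statistics import mean, pstdev

def skewness(values):
    """Average cubed z-score: about 0 for a symmetric distribution."""
    m, s = mean(values), pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])

def sample_means(n, reps, rng):
    """Means of `reps` samples of size `n` drawn from a skewed population."""
    return [mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

rng = random.Random(42)  # fixed seed so the run is repeatable
means_small = sample_means(n=2, reps=2000, rng=rng)
means_large = sample_means(n=50, reps=2000, rng=rng)

skew_small = skewness(means_small)  # still clearly skewed
skew_large = skewness(means_large)  # much closer to symmetric
```

With the larger sample size the distribution of means is far less skewed, even though the population itself is not normal.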
What is one way data problems can be corrected and how is it done? (Bias Statistics)
- Bootstrap
- Random sampling with replacement from the observed data
- Work out the limits that contain the middle 95% of the bootstrap estimates
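The bootstrap steps can be sketched as follows. This is a minimal percentile bootstrap for the mean, under assumptions: the scores are invented, the function name `bootstrap_ci` is hypothetical, and 2000 resamples with a 95% level are arbitrary defaults.

```python
import random
from statistics import mean

def bootstrap_ci(data, stat=mean, reps=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for `stat` computed on `data`."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n = len(data)
    # Resample with replacement, compute the statistic each time, then sort.
    estimates = sorted(stat(rng.choices(data, k=n)) for _ in range(reps))
    alpha = (1 - level) / 2
    lower = estimates[round(alpha * reps)]
    upper = estimates[round((1 - alpha) * reps) - 1]
    return lower, upper

# Hypothetical scores containing one outlier.
scores = [2, 3, 3, 4, 4, 4, 5, 5, 6, 30]
lower, upper = bootstrap_ci(scores)
```

The interval is read straight from the sorted bootstrap estimates, so it does not rely on the sampling distribution being normal.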