W8 multiple regression assumptions Flashcards

1
Q

Regression assumptions

A

Normality

  • residuals (errors) are normally distributed with mean = 0
  • often described as being centred around zero

Homoscedasticity
- constant variance of residuals (across predicted scores)

Independence of errors
- The residuals are uncorrelated with one another and with the predicted scores (Y′)

Linearity of the relationship

2
Q

Predicted and actual score

A

We predict the actual score (Y) from the Xk predictors, i.e. what Y “should” be
The result is a predicted score (Y′)
Unless the correlation is perfect, Y ≠ Y′
Y - Y′ = e
e = error/residual
= whatever is left over in Y that Y′ doesn’t account for
Y = Y′ + e
Y = Predicted + Residual
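
As a rough illustration of Y = Y′ + e, here is a minimal sketch with a single predictor. The data values and variable names are made up for this example, and the least-squares fit is written out by hand rather than taken from any particular statistics package.

```python
# Hypothetical data: one predictor x and an outcome y (made up for illustration).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept for a single predictor.
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

predicted = [intercept + slope * xi for xi in x]        # Y'
residuals = [yi - pi for yi, pi in zip(y, predicted)]   # e = Y - Y'

for yi, pi, ei in zip(y, predicted, residuals):
    print(f"Y={yi:.2f}  Y'={pi:.2f}  e={ei:+.2f}")
```

Each row satisfies Y = Y′ + e, and with an intercept in the model the residuals sum to (approximately) zero.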

3
Q

Positive or negative residual

A

If the regression equation has underestimated the actual score (Y > Y′), the residual will be positive

If instead the actual score was overestimated (Y < Y′), the residual will be negative
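
A tiny worked example of the sign rule, using hypothetical scores (the function name `residual` is just for illustration):

```python
def residual(actual, predicted):
    """Residual e = Y - Y'."""
    return actual - predicted

# Hypothetical scores, made up for illustration.
under = residual(actual=80, predicted=72)  # equation predicted too low
over = residual(actual=60, predicted=65)   # equation predicted too high

print(under, over)  # 8 -5
```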

4
Q

Why residuals matter

A

The residuals can take on distinctive patterns if something systematic is amiss.

If there were only true score and random error in our data, and we hadn’t omitted any important predictors of our outcome, we would expect neat, normally distributed residuals centred around zero…
-> Anything else means that there might be a problem

5
Q

Homoscedasticity assumption

A

-> When we plot our residuals against the predicted scores
-> And we have random error (i.e., the error has no systematic association with the criterion)
-> And we haven’t left out important predictors
-> The scatter should have a roughly rectangular shape
If so, we have met the assumption of homoscedasticity
*look up image

6
Q

Analysing homoscedasticity

A
  • The residuals should be evenly scattered above and below zero
  • The range of residuals around zero should be narrow: the larger the range, the worse the prediction
  • If, however, our residuals look like a funnel or a fan, this suggests that we have not met the assumption of homoscedasticity
    -> The spread of residuals is not even across the range of predicted values of Y
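
The funnel/fan idea can be sketched numerically. This is a rough heuristic, not a formal test: it compares the residual variance in the upper half of the predicted scores with the variance in the lower half. The simulated data, the helper names `fit_line` and `spread_ratio`, and the half-split are all assumptions made for this example.

```python
import random
import statistics

def fit_line(x, y):
    """Least-squares intercept and slope for one predictor."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def spread_ratio(x, y):
    """Residual variance in the upper half of predicted scores divided by
    the variance in the lower half. Values near 1 suggest an even spread."""
    intercept, slope = fit_line(x, y)
    # Pair each predicted score with its residual, then sort by predicted score.
    pairs = sorted((intercept + slope * a, b - (intercept + slope * a))
                   for a, b in zip(x, y))
    half = len(pairs) // 2
    low = [e for _, e in pairs[:half]]
    high = [e for _, e in pairs[half:]]
    return statistics.pvariance(high) / statistics.pvariance(low)

random.seed(0)  # fixed seed so the made-up data are reproducible
x = [random.uniform(1, 10) for _ in range(400)]
even = [2 + 3 * a + random.gauss(0, 2) for a in x]       # constant error spread
fan = [2 + 3 * a + random.gauss(0, 0.5 * a) for a in x]  # error spread grows with x

ratio_even = spread_ratio(x, even)
ratio_fan = spread_ratio(x, fan)
print(f"even spread ratio: {ratio_even:.2f}")
print(f"fan-shaped ratio:  {ratio_fan:.2f}")
```

A ratio well above (or below) 1 corresponds to the fan shape in a residual plot; in practice the plot itself is still the first thing to look at.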
7
Q

What not meeting homoscedasticity suggests

A
  • Such a pattern suggests that prediction is worse in one part of the range of predicted Y values than in another
  • Can indicate skew in one or more predictors, but not always
8
Q

How to address violations of homoscedasticity

A

We may find one or more outlying data points on one or more variables that are unduly influencing the regression analysis and leading to potentially erroneous conclusions

OR

One or more predictors may deviate greatly from normality (skewness and kurtosis). Again, this may lead to erroneous conclusions.

9
Q

Homoscedasticity and deviation from normality

A

One or more predictors may deviate greatly from normality (skewness and kurtosis). Again, this may lead to erroneous conclusions.

a) In this case, we generally try one or more transformations of the data (one at a time, not concurrently).
b) We replace the original variable with the transformed one in the regression.
c) Certain transformations may help alleviate skewness (caution: hit and miss!)
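
As a sketch of steps a) to c), the example below tries two common transformations in turn on a positively skewed predictor and compares the skewness of each version. The variable `income`, the simulated values, and the choice of square-root and log transformations are assumptions for illustration; which transformation helps (if any) depends on the data.

```python
import math
import random
import statistics

def skewness(values):
    """Population skewness: the mean cubed z-score (0 for a symmetric distribution)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])

random.seed(1)  # fixed seed so the made-up data are reproducible
# Hypothetical positively skewed predictor (all values are positive).
income = [random.lognormvariate(10, 0.8) for _ in range(1000)]

# Try each transformation in turn, not concurrently.
candidates = {
    "raw": income,
    "sqrt": [math.sqrt(v) for v in income],
    "log": [math.log(v) for v in income],
}
for name, vals in candidates.items():
    print(f"{name:>4}: skewness = {skewness(vals):+.2f}")
```

Whichever version has skewness closest to zero would then replace the original variable in the regression, after which the residual plot should be checked again.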
