Further Techniques in Multiple Linear Regression Flashcards
What does linear regression assume?
That each independent (predictor) variable is linearly related to the dependent (outcome) variable.
How do we deal with a non-linear relationship in a regression model?
- By transforming the variables in the model
- By fitting a polynomial relationship instead of a straight line
How does polynomial fitting work?
- The same principle as fitting a straight line applies; we simply turn one predictor variable into two or more
- Instead of regressing y on x alone, we regress y on x and x^2 (and possibly x^3, x^4, etc.)
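A minimal sketch of the idea above, assuming NumPy and synthetic data invented purely for illustration (the coefficients 1.0, 2.0 and -0.5 are not from the course material). It shows that a quadratic fit is just an ordinary least-squares fit with one extra column in the design matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): a curved relationship plus noise.
x = np.linspace(-3, 3, 200)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(scale=0.1, size=x.size)

# Design matrix: intercept, x and x^2. One predictor has become two columns.
X = np.column_stack([np.ones_like(x), x, x**2])

# Ordinary least squares, exactly as for any other multiple regression.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# beta holds the estimated intercept, x coefficient and x^2 coefficient.
print(beta)
```

Adding x^3 or higher powers would mean appending further columns to X in the same way.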
List an advantage of polynomial fitting.
It allows us to deal with obvious non-linear relationships without having to specify in advance what the appropriate transformation would be.
How do we fit a quadratic model?
- We add x^2 to the model as an additional predictor alongside x
- y = β0 + β1x + β2x^2 + e
How do we interpret quadratic terms?
- Where the coefficient on x is positive and the coefficient on x^2 is negative, y is increasing in x at first, but will eventually turn around and be decreasing
- Where the coefficient on x is negative and the coefficient on x^2 is positive, y is decreasing in x at first, but will eventually turn around and be increasing
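The point where the curve turns around can be worked out from the coefficients: the slope of the fitted quadratic at x is (coefficient on x) + 2 * (coefficient on x^2) * x, which is zero at x = -(coefficient on x) / (2 * coefficient on x^2). A small sketch, with hypothetical function names and made-up coefficient values:

```python
def slope_at(x: float, beta1: float, beta2: float) -> float:
    """Slope of the fitted curve b0 + b1*x + b2*x^2 at a given x."""
    return beta1 + 2 * beta2 * x


def turning_point(beta1: float, beta2: float) -> float:
    """Value of x at which the fitted quadratic changes direction."""
    if beta2 == 0:
        raise ValueError("no turning point: the model is a straight line")
    return -beta1 / (2 * beta2)


# Hypothetical coefficients: positive on x, negative on x^2.
b1, b2 = 2.0, -0.5
tp = turning_point(b1, b2)

print(tp)                      # x value where the curve peaks
print(slope_at(0.0, b1, b2))   # positive: y is increasing before the peak
print(slope_at(4.0, b1, b2))   # negative: y is decreasing after the peak
```

It is worth checking whether the turning point falls inside the observed range of x; if it does not, the data only show one side of the curve.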
What are the advantages of transformed data?
- Often less skewed
- Outliers are less extreme
List three examples of data transformations.
- Logarithms
- Inverse
- Square root
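As a rough illustration of why transformed data are often less skewed, the sketch below draws a right-skewed sample (a log-normal distribution, chosen only as an assumption for the example) and compares its skewness before and after square-root and log transformations. The `skewness` helper is hypothetical and written out by hand rather than taken from a library.

```python
import numpy as np


def skewness(values: np.ndarray) -> float:
    """Sample skewness: the mean of the cubed z-scores."""
    z = (values - values.mean()) / values.std()
    return float(np.mean(z**3))


rng = np.random.default_rng(1)

# Synthetic right-skewed, strictly positive data (assumed for illustration).
raw = rng.lognormal(mean=0.0, sigma=1.0, size=5000)

skew_raw = skewness(raw)
skew_sqrt = skewness(np.sqrt(raw))
skew_log = skewness(np.log(raw))

print(skew_raw, skew_sqrt, skew_log)
```

Note that the log requires strictly positive values and the square root requires non-negative values, so the data may need shifting before either is applied.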
What is an interaction effect?
When the effect of an explanatory variable depends on the level of another explanatory variable.
Give an example of an interaction effect.
- Assume male happiness increases with years of marriage, whereas female happiness decreases with years of marriage
- The relationship between happiness and years of marriage may be linear, but it would not be independent of sex.
Write the regression equation of a model with interaction terms.
y = β0 + β1x1 + β2x2 + β3(x1x2) + e
where β3(x1x2) is the interaction term: the product of the two predictors, entered alongside their main effects
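A minimal sketch of fitting such a model, assuming NumPy and synthetic data loosely modelled on the happiness example (all numbers are invented for illustration). Here x1 is years of marriage, x2 is a 0/1 indicator for male, and the interaction column is their product.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400

# Synthetic predictors (assumed for illustration).
years = rng.uniform(0, 30, size=n)             # x1: years of marriage
male = rng.integers(0, 2, size=n).astype(float)  # x2: 0 = female, 1 = male

# Assumed "true" model: female slope -0.1, male slope -0.1 + 0.2 = +0.1.
happy = 5.0 - 0.1 * years + 0.2 * years * male + rng.normal(scale=0.3, size=n)

# Design matrix: intercept, x1, x2 and the product x1*x2.
X = np.column_stack([np.ones(n), years, male, years * male])
b0, b1, b2, b3 = np.linalg.lstsq(X, happy, rcond=None)[0]

# The slope for years of marriage depends on the level of the other variable.
slope_female = b1        # when male = 0
slope_male = b1 + b3     # when male = 1

print(slope_female, slope_male)
```

The coefficient on the product term is the difference between the two slopes, which is what it means for the effect of one variable to depend on the level of another.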
What is the most common centring method?
Mean centring
How can we ensure β0 is interpretable?
By centring the predictor (e.g. the age variable), so that x = 0 corresponds to a meaningful value.
What changes when you carry out mean centring?
The intercept, which now corresponds to the predicted outcome at the average age (x(centred) = 0) rather than at x = 0.
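A short sketch of mean centring, assuming NumPy and a synthetic age variable (values invented for illustration; `fit_line` is a hypothetical helper). In this simple one-predictor model the slope is unchanged and only the intercept moves.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic data (assumed for illustration): outcome rises with age.
age = rng.uniform(20, 60, size=300)
y = 10.0 + 0.5 * age + rng.normal(scale=1.0, size=age.size)


def fit_line(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Least-squares intercept and slope for y regressed on x."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0]


b0_raw, b1_raw = fit_line(age, y)

# Mean centring: subtract the sample mean so that 0 means "average age".
age_centred = age - age.mean()
b0_centred, b1_centred = fit_line(age_centred, y)

print(b0_raw, b1_raw)          # intercept is the prediction at age 0
print(b0_centred, b1_centred)  # intercept is the prediction at the average age
```

After centring, the intercept equals the mean of y, which is usually far easier to interpret than a prediction for someone aged 0.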