Multiple Linear Regression Flashcards
Understand multiple linear regression
What is the difference between simple and multiple linear regression?
- Simple linear regression takes into account only one covariate
- Multiple linear regression takes into account more than one covariate
What is the best way to display the results of multiple linear regression?
Using a scatter plot, for example of observed against fitted values
What is the general equation for predicting y given any x in multiple linear regression?
Y = β₀ + β₁X₁ + β₂X₂ + … + βₙXₙ
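The prediction equation can be sketched in a few lines of Python. The function name `predict` and the coefficient values are hypothetical, chosen only to show the arithmetic; they do not come from a fitted model.

```python
def predict(intercept, coefficients, x):
    """Return intercept + b1*x1 + ... + bn*xn for one observation."""
    if len(coefficients) != len(x):
        raise ValueError("need exactly one coefficient per covariate")
    return intercept + sum(b * xi for b, xi in zip(coefficients, x))


# Hypothetical coefficients for a model with two covariates.
b0 = 2.0
b = [0.5, -1.0]

# Prediction for x1 = 4.0, x2 = 3.0: 2.0 + 0.5*4.0 - 1.0*3.0
y_hat = predict(b0, b, [4.0, 3.0])
print(y_hat)
```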
What are Factor variables?
Categorical covariates fitted as factors, which allows the response to vary with the level of the covariate
What are Treatment contrast variables?
One level forms the baseline; each remaining level has its own coefficient, measured relative to the baseline
What are Dummy Variables?
Indicator variables that switch on (x=1) when an observation belongs to a given level and switch off (x=0) otherwise
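As a rough illustration of how treatment contrasts and dummy variables fit together, the sketch below codes a categorical covariate into 0/1 columns, leaving out the baseline level. The helper `treatment_dummies` and the level names are hypothetical.

```python
def treatment_dummies(values, baseline):
    """Code a categorical covariate as 0/1 dummy columns.

    The baseline level gets no column of its own, so an observation at the
    baseline has every dummy switched off (all zeros).
    """
    levels = sorted(set(values) - {baseline})
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows


# Hypothetical treatment covariate with "control" as the baseline level.
treatment = ["control", "drugA", "drugB", "control"]
levels, rows = treatment_dummies(treatment, baseline="control")

print(levels)  # one dummy column per non-baseline level
for value, row in zip(treatment, rows):
    print(value, row)
```

Each dummy column then receives its own coefficient in the regression, interpreted as the difference from the baseline level.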
What types of values can covariates take, and what type does the response take, in multiple linear regression?
- Continuous
- Categorical (use dummy variables)
- The response takes continuous values
What does the ordinary R^2 value mean?
Describes the absolute fit of the model to the data (the proportion of variance in the response that is explained) without taking into account the number of covariates
What does the adjusted R^2 value mean?
- Describes the absolute fit of the model to the data, penalised for the number of covariates
- Less readily interpretable, as it can fall below 0 and so is not guaranteed to lie between 0 and 1
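Both quantities can be computed from their standard formulas, sketched below. The data values are made up for illustration, and `n` and `p` denote the number of observations and the number of covariates (excluding the intercept).

```python
def r_squared(y, y_hat):
    """Ordinary R^2 = 1 - SS_residual / SS_total."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than fitted parameters")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Made-up observed and fitted values.
y = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(y, y_hat)

# A weak fit with several covariates gives a negative adjusted R^2.
adj = adjusted_r_squared(0.10, n=10, p=3)
print(r2, adj)
```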
What are some of the problems with model selection in multiple linear regression?
- Including too few variables throws away information
- Including too many variables raises the standard errors and p-values substantially
- Models that are too simple or too complex have poor predictive ability
What is used to test whether variables are collinear?
Variance Inflation Factors (VIF)
How is VIF calculated?
One divided by one minus the R^2 value, where R^2 comes from regressing that covariate on all the other covariates: VIF = 1 / (1 - R^2)
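A minimal sketch of the VIF calculation, restricted to the special case of two covariates. In that case the R^2 from regressing one covariate on the other equals their squared Pearson correlation. With more covariates, R^2 would have to come from a full regression of the covariate on all the others. The function names and data are hypothetical.

```python
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    ss_a = sum((ai - mean_a) ** 2 for ai in a)
    ss_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / sqrt(ss_a * ss_b)


def vif_two_covariates(x1, x2):
    """VIF = 1 / (1 - R^2), with R^2 = r^2 when there are only two covariates."""
    r2 = pearson_r(x1, x2) ** 2
    return 1 / (1 - r2)


# Made-up covariate values with correlation 0.8.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
vif = vif_two_covariates(x1, x2)
print(vif)
```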
What is the VIF score at which model selection should be altered?
When the VIF score is greater than 5
What is the best way of dealing with collinearity?
- Check the VIF score of the suspected collinear variable
- Remove the covariate from the model if its VIF score is greater than 5
What are the hypotheses of multiple linear regression with regard to the factor coefficients?
H0: all coefficients are equal to 0
H1: at least one coefficient is not equal to 0
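These hypotheses are commonly assessed with the overall F-test. The sketch below computes only the F statistic from R^2, assuming a model with an intercept, `n` observations and `p` covariates. The input values are made up, and turning F into a p-value would additionally require the F distribution with (p, n - p - 1) degrees of freedom, which is not done here.

```python
def overall_f_statistic(r2, n, p):
    """F = (R^2 / p) / ((1 - R^2) / (n - p - 1)) for a model with an intercept."""
    df_model = p
    df_resid = n - p - 1
    if df_model <= 0 or df_resid <= 0:
        raise ValueError("need p >= 1 and n > p + 1")
    return (r2 / df_model) / ((1 - r2) / df_resid)


# Made-up values: R^2 = 0.5 from 23 observations and 2 covariates.
f_stat = overall_f_statistic(0.5, n=23, p=2)
print(f_stat)
```

A large F statistic relative to the F distribution is evidence against H0.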