Toolkit 2 Multiple Regression Flashcards
What does each X variable do?
Each X is an independent variable that we hypothesise contributes to explaining the variation in Y
What is y hat the best estimate of?
It is the best estimate of the true values of the dependent variable Y
Ideally, should the contributions of the x variables be independent from each other or not?
Yes, they should be independent so that their effects are simply additive
Is there an upper limit on the number of x variables?
No. However, it is recommended to keep the number within reasonable bounds
What does the regression equation actually describe?
A multidimensional hyperplane
B0 is the intercept but what does it intercept?
It intercepts the Y axis: B0 is the value of Y predicted when every X variable is 0
What do the other coefficients describe?
They describe the partial slopes of the plane in each of the respective x dimensions
What does the e term at the end of the equation do?
It is the error term, used to capture random errors in the observed Y values
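The cards above on the intercept, the partial slopes, y hat and the error term can be tied together with a small worked example. This is only a sketch using NumPy's least-squares solver; the function name fit_multiple_regression and the data are made up for illustration and are not from the toolkit.

```python
import numpy as np

def fit_multiple_regression(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bk*xk + e.

    Returns (coefficients, y_hat, residuals); coefficients[0] is the intercept b0,
    the rest are the partial slopes for each x variable.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # A leading column of ones lets the solver estimate the intercept b0.
    design = np.column_stack([np.ones(len(y)), X])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    y_hat = design @ coefs      # fitted values: best estimate of Y on the plane
    residuals = y - y_hat       # observed Y minus fitted Y: the estimated e term
    return coefs, y_hat, residuals

# Made-up, noise-free data generated from y = 1 + 2*x1 + 3*x2
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]])
y = 1 + 2 * X[:, 0] + 3 * X[:, 1]
coefs, y_hat, residuals = fit_multiple_regression(X, y)
print(coefs)  # intercept first, then the partial slopes for x1 and x2
```

Because the example data contain no noise, the fitted plane passes through every point and the residuals are zero; with real data the residuals would hold whatever the plane cannot explain.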
Does correlation imply causality?
No, on its own correlation can’t establish causal relationships. There may be many reasons for correlations:
1) accidents from the sampling of random variables
2) measured variables may be causally related to unknown variables that weren’t sampled
3) variables may share unrecognised trends / patterns over time and space
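Reason 3 can be illustrated with a small simulation. The sketch below assumes two made-up series that each drift upwards over time but are otherwise unrelated; the names, slopes and noise levels are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(100)

# Two series that share nothing except an upward trend over time.
a = 2.0 * t + rng.normal(scale=10.0, size=t.size)
b = 3.0 * t + rng.normal(scale=10.0, size=t.size)

def detrend(series, time):
    """Remove a straight-line time trend and return what is left."""
    slope_intercept = np.polyfit(time, series, 1)
    return series - np.polyval(slope_intercept, time)

raw_r = np.corrcoef(a, b)[0, 1]
detrended_r = np.corrcoef(detrend(a, t), detrend(b, t))[0, 1]
print(raw_r, detrended_r)
```

The raw correlation is very high even though neither series influences the other, and it largely disappears once the shared trend is removed.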
What does positive spatial autocorrelation mean?
It means that proximate values tend to be similar
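One common way to put a number on this is Moran's I, which is not named in these cards and is brought in here only as an illustration. The sketch assumes sites arranged along a line, with adjacent sites treated as neighbours; chain_weights is a hypothetical helper and the values are made up.

```python
import numpy as np

def morans_i(values, weights):
    """Moran's I: (n / sum of weights) * (z' W z) / (z' z), z = deviations from the mean."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()
    return (len(x) / w.sum()) * (z @ w @ z) / (z @ z)

def chain_weights(n):
    """Hypothetical neighbour rule: sites on a line, adjacent sites are neighbours."""
    w = np.zeros((n, n))
    idx = np.arange(n - 1)
    w[idx, idx + 1] = 1.0
    w[idx + 1, idx] = 1.0
    return w

w = chain_weights(6)
smooth = [1, 2, 3, 4, 5, 6]        # neighbours have similar values
alternating = [1, 6, 1, 6, 1, 6]   # neighbours have dissimilar values

i_smooth = morans_i(smooth, w)
i_alternating = morans_i(alternating, w)
print(i_smooth, i_alternating)
```

The smoothly varying series gives a positive value (positive spatial autocorrelation), while the alternating series gives a negative one.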
What does multicollinearity show?
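It shows that two or more X variables are strongly correlated with each other. This makes it difficult to produce a good model, because their individual effects on Y can’t be separated reliably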
What is the tolerance?
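The proportion (often given as a %) of the variance in a given predictor that can’t be explained by the other predictors, i.e. 1 minus the R squared from regressing that predictor on the others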
What does the variance inflation factor show?
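It is the reciprocal of the tolerance (VIF = 1 / tolerance) and shows how much multicollinearity inflates the variance of a regression coefficient. When the tolerances are close to 0 there is high multicollinearity and the standard errors of the regression coefficients will be inflated. A VIF greater than 2 is usually considered problematic, and a VIF of 10 or more indicates serious multicollinearity problems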
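Tolerance and VIF can be computed by regressing each predictor on all the others. The sketch below uses the standard definitions (tolerance = 1 - R squared, VIF = 1 / tolerance); the function name tolerance_and_vif and the simulated predictors are made up for illustration.

```python
import numpy as np

def tolerance_and_vif(X):
    """Return (tolerances, vifs), one value per column of X."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    tolerances = np.empty(k)
    for j in range(k):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress predictor j on the remaining predictors (with an intercept).
        design = np.column_stack([np.ones(n), others])
        coefs, *_ = np.linalg.lstsq(design, target, rcond=None)
        residuals = target - design @ coefs
        ss_res = np.sum(residuals ** 2)
        ss_tot = np.sum((target - target.mean()) ** 2)
        tolerances[j] = ss_res / ss_tot   # unexplained share = 1 - R squared
    return tolerances, 1.0 / tolerances

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)   # almost a copy of x1
x3 = rng.normal(size=200)              # unrelated to x1 and x2
tolerances, vifs = tolerance_and_vif(np.column_stack([x1, x2, x3]))
print(tolerances, vifs)
```

Here x1 and x2 get tolerances near 0 and very large VIFs, while x3 has a tolerance near 1 and a VIF near 1. Note that the function would divide by zero if one predictor were an exact linear combination of the others.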
What does partial correlation show?
The correlation between the dependent variable and an independent variable when the linear effects of the other independent variables in the model have been removed from both
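The definition above translates directly into code: remove the linear effect of the other variables from both Y and the X of interest, then correlate what is left. This is a minimal sketch with made-up data in which a third variable z drives both x and y; the function names are hypothetical.

```python
import numpy as np

def residualise(v, controls):
    """Residuals of v after a linear regression on the control variable(s)."""
    v = np.asarray(v, dtype=float)
    design = np.column_stack([np.ones(len(v)), controls])
    coefs, *_ = np.linalg.lstsq(design, v, rcond=None)
    return v - design @ coefs

def partial_correlation(y, x, controls):
    """Correlation between y and x with the linear effects of controls removed from both."""
    return np.corrcoef(residualise(y, controls), residualise(x, controls))[0, 1]

rng = np.random.default_rng(1)
z = rng.normal(size=2000)                 # variable that drives both x and y
x = z + 0.5 * rng.normal(size=2000)
y = z + 0.5 * rng.normal(size=2000)

simple_r = np.corrcoef(x, y)[0, 1]
partial_r = partial_correlation(y, x, z)
print(simple_r, partial_r)
```

The simple correlation between x and y is strong, but the partial correlation controlling for z is close to 0, because z accounts for everything the two variables have in common.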