7 - Multiple Regression Flashcards
What is multiple regression?
A way of predicting a dependent variable from a set of independent variables
How many regression coefficients (b’s) does a multiple linear regression have?
One per independent variable
What are the assumptions of multiple regression?
- Independence
- Normality
- Homoscedasticity
- Linearity
- No multicollinearity
What is the assumption of no multicollinearity?
Predictors should not be too highly correlated with one another (correlations around 0.8 or 0.9 are a warning sign)
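A minimal sketch of checking this in R, assuming a hypothetical data frame df with predictors pred1 and pred2 and an outcome variable outcome (vif() comes from the car package; none of these names are from the cards):

```r
# Hypothetical data frame `df` with predictors pred1, pred2 and outcome `outcome`.
library(car)

model <- lm(outcome ~ pred1 + pred2, data = df)

cor(df$pred1, df$pred2)  # pairwise correlation; ~0.8-0.9 or above is a warning sign
vif(model)               # variance inflation factors; values near 10 suggest a problem
```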
When will a multiple regression model have a high R^2?
When its independent variables together account for a large proportion of the variance in the dependent variable
How can an F-Test be used to test R^2?
An F-test can be used to compare two competing (nested) models by testing whether the difference between their R^2 values is significant
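One standard form of this F-test on the change in R^2, for nested models with k1 and k2 predictors (k2 > k1) fitted to the same n cases, is:

F = [(R^2_2 - R^2_1) / (k2 - k1)] / [(1 - R^2_2) / (n - k2 - 1)]

with (k2 - k1) and (n - k2 - 1) degrees of freedom.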
What test can be used to test each partial regression coefficient, and how?
A t-test: each coefficient b is divided by its standard error (t = b / SE(b)), and the result is compared against a t-distribution to test whether the coefficient is significantly different from zero
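A minimal sketch of seeing these t-tests in R (same hypothetical model and variable names as above):

```r
# summary() reports each coefficient's estimate, standard error, t value, and p-value.
model <- lm(outcome ~ pred1 + pred2, data = df)
summary(model)  # the "t value" and "Pr(>|t|)" columns test each b against zero
```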
What are the 3 ways to build regression models?
- Hierarchical
- Simultaneous Forced Entry
- Stepwise
Describe the hierarchical approach to building regression models.
Independent variables are entered in stages (based on theory)
Describe the simultaneous forced entry approach to building regression models.
All independent variables are entered together in a single step.
Describe the stepwise approach to building regression models.
Independent variables are entered (or removed) according to a statistical criterion, such as the size of their correlation with the dependent variable or their significance.
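As a hedged illustration: base R's step() function runs an automated stepwise search, although it selects predictors on AIC rather than significance (pred3 below is another hypothetical predictor):

```r
# Automated stepwise selection using AIC (hypothetical variables).
full_model <- lm(outcome ~ pred1 + pred2 + pred3, data = df)
step(full_model, direction = "backward")  # drops predictors while the AIC improves
```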
What R function can you use to determine the confidence intervals on the estimates of a regression model?
confint()
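A minimal sketch, using the hypothetical model from above:

```r
confint(model)               # 95% confidence intervals for each coefficient by default
confint(model, level = 0.99) # a different confidence level can be requested
```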
What R function can you use to determine the standardized Beta coefficients of a regression model?
lm.beta() from the QuantPsyc package
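A minimal sketch, using the hypothetical model from above:

```r
library(QuantPsyc)
lm.beta(model)  # standardized (beta) coefficients for each predictor
```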
What R function can you use to compare the fit of 2 regression models?
anova()
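A minimal sketch comparing two nested models (hypothetical variables from above):

```r
model1 <- lm(outcome ~ pred1, data = df)
model2 <- lm(outcome ~ pred1 + pred2, data = df)
anova(model1, model2)  # F-test of whether the larger model fits significantly better
```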
What is the Akaike Information Criterion?
A measure of how well a model fits the data it was generated from, penalized for the number of parameters; it is used to compare models fitted to the same data.
What R function can be used to generate AIC?
extractAIC()
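A minimal sketch, using the hypothetical models from above:

```r
extractAIC(model1)  # returns the equivalent degrees of freedom and the AIC
extractAIC(model2)  # the model with the smaller AIC is preferred
```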
What AIC values indicate a better model fit?
Smaller values
What is the root mean square error?
It is the square root of the mean of the squared residuals. In other words, it measures how far the predicted values are, on average, from the observed values in a regression analysis (how concentrated the data are around the line of best fit).
In what cases is the root mean square error most useful?
In cases where the model’s aim is to predict values.
What root mean square errors are preferred?
Smaller values (0 indicates perfect fit).
What R function can be used to get the root mean square error?
performance() from performance package.
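A minimal sketch, using the hypothetical model from above:

```r
library(performance)
performance(model)  # table of fit indices, including RMSE (alias of model_performance())
```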