BA 5 - Multiple Regression Flashcards
Use of multiple regression
To identify linear relationships between a dependent variable and two or more independent variables
Equation
y^ = a + b1x1 + b2x2 + … + bkxk, where y^ is the predicted value; the observed value is y = y^ + e, where e is the error term
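As a quick illustration of how the equation is applied, here is a minimal Python sketch. The function name `predict` and all of the numbers are hypothetical and chosen only to show the arithmetic for a single observation.

```python
def predict(intercept, coefficients, x_values):
    """Return y^ = a + b1*x1 + ... + bk*xk for one observation."""
    if len(coefficients) != len(x_values):
        raise ValueError("need one coefficient per independent variable")
    return intercept + sum(b * x for b, x in zip(coefficients, x_values))


# Hypothetical model with a = 10, b1 = 2, b2 = -1 (illustrative values only)
y_hat = predict(10.0, [2.0, -1.0], [3.0, 4.0])  # 10 + 2*3 - 1*4

# The error term is the gap between an observed y and the prediction
residual = 13.5 - y_hat
```

The error term is computed separately because it belongs to the observed value, not to the prediction.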
Tools used for analyzing multiple regression
Because graphing is complicated or impossible, we rely on numerical values and residual plots (for one variable at a time).
Adjusted R^2
Adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - k - 1), where n is the number of observations and k is the number of independent variables
R^2 never decreases when independent variables are added to a regression model, so the adjustment factor compensates for the increase that was due solely to the addition of a new variable.
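The penalty can be seen with a small Python sketch that uses the standard textbook adjustment, with n observations and k independent variables. The function name `adjusted_r2` and the R^2 values are assumptions made up for illustration.

```python
def adjusted_r2(r2, n, k):
    """Standard adjustment: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than coefficients")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical case: a fifth variable lifts R^2 only from 0.800 to 0.805
before = adjusted_r2(0.800, n=25, k=4)
after = adjusted_r2(0.805, n=25, k=5)
```

In this made-up case R^2 rises slightly, yet `after` is lower than `before`, which suggests the new variable added little beyond the mechanical increase.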
Residual plot analysis
To check whether the relationship between each independent variable and the dependent variable is linear and significant:
i. Plot the residuals against each independent variable in a separate scatter plot;
ii. Look for patterns of heteroskedasticity and nonlinearity; and
iii. Also examine the p-values of the independent variables.
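The data behind each residual plot can be prepared as in the rough Python sketch below. The helper names `residuals` and `residual_points` and the sample values are hypothetical, and the plotting step itself is left to whatever charting tool is in use.

```python
def residuals(y, y_hat):
    """Residual for each observation: observed minus predicted."""
    return [obs - pred for obs, pred in zip(y, y_hat)]


def residual_points(x, resid):
    """(x, residual) pairs for one independent variable's scatter plot."""
    return list(zip(x, resid))


# Hypothetical observed values, predictions, and one independent variable
y = [10, 12, 15]
y_hat = [9, 13, 15]
x1 = [1, 2, 3]

resid = residuals(y, y_hat)
points_x1 = residual_points(x1, resid)
```

Repeating `residual_points` for each independent variable gives one set of points per plot. A fan shape in a plot hints at heteroskedasticity and a curve hints at nonlinearity.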
Multicollinearity
Multicollinearity occurs when there is a strong linear relationship among two or more of the independent variables.
If a variable is significant in a single variable model and becomes insignificant in a multiple regression model, it’s likely that multicollinearity exists between it and one or more of the other independent variables.
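One simple screen, offered here as an assumption rather than as the course's prescribed method, is to compute the correlation between pairs of independent variables. The Python sketch below uses made-up data in which x2 is roughly twice x1.

```python
from math import sqrt


def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / sqrt(s_uu * s_vv)


# Hypothetical independent variables that move almost in lockstep
x1 = [1, 2, 3, 4, 5]
x2 = [2.1, 3.9, 6.2, 8.0, 9.9]

r = pearson_r(x1, x2)  # a value near +1 or -1 flags a strong linear relationship
```

A pairwise check like this can miss collinearity that involves three or more variables at once, so it complements the significance comparison above and does not replace it.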
When multicollinearity is detected
i. Check if dropping one of the collinear variables increases the Adjusted R^2; or
ii. Increase sample size.
If using the model for forecasting, multicollinearity is not an issue; if using the model to understand the net effects of independent variables, it’s an issue.
Dummy variables
For categorical (rather than quantitative) data. The number of dummy variables should be one fewer than the number of options in the category. The option that is not included is known as the 'base case' and is represented by a value of '0' on every dummy variable.
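A minimal Python sketch of this encoding follows. The function name `make_dummies` and the region data are hypothetical, and "North" is arbitrarily chosen as the base case.

```python
def make_dummies(values, base_case):
    """Build one 0/1 column per category, leaving out the base case."""
    categories = sorted(set(values) - {base_case})
    return {c: [1 if v == c else 0 for v in values] for c in categories}


# Hypothetical categorical variable with three options
regions = ["North", "South", "West", "South"]
dummies = make_dummies(regions, base_case="North")
# Three options produce two dummy columns; a "North" row is 0 in both
```

Each dummy coefficient in the fitted model is then read as the difference from the base case.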
Lagged variables
Used to capture the ongoing effects of a given variable.
The lag period is based on managerial judgement.
Drawbacks:
i. each period of lag reduces the usable sample size by one observation; and
ii. if it doesn’t increase the model’s explanatory power, it decreases the Adjusted R^2.
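The loss of observations is easy to see in a small Python sketch. The function name `build_lagged_rows` and the advertising and sales figures are made up for illustration.

```python
def build_lagged_rows(x, y, lag=1):
    """Pair each y[t] with x[t] and x[t - lag]; the first `lag` rows are lost."""
    if lag < 1 or lag >= len(x):
        raise ValueError("lag must be at least 1 and shorter than the series")
    return [(x[t], x[t - lag], y[t]) for t in range(lag, len(x))]


# Hypothetical data for four periods
advertising = [10, 20, 30, 40]
sales = [100, 120, 150, 170]

rows = build_lagged_rows(advertising, sales, lag=1)
# Each row is (advertising this period, advertising last period, sales this period)
```

With four periods and a one-period lag only three complete rows remain, because the first period has no earlier value to use.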
Gross and net relationships between variables
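Gross - the coefficient from a single variable regression; it is affected by any omitted variable related to the independent variable;
Net - the coefficient from a multiple regression; it controls for the other variables in the model.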
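The difference can be shown with a small Python sketch on made-up data where y = 1 + 2*x1 + 3*x2 exactly and x1 and x2 are correlated. The helper names are hypothetical, and the two-variable closed form is used only to keep the example short.

```python
def centered_sum(u, v):
    """Sum of products of deviations from the means."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def gross_slope(x, y):
    """Slope from a single variable regression of y on x."""
    return centered_sum(x, y) / centered_sum(x, x)


def net_slopes(x1, x2, y):
    """Slopes (b1, b2) from a regression of y on both x1 and x2."""
    s11, s22, s12 = centered_sum(x1, x1), centered_sum(x2, x2), centered_sum(x1, x2)
    s1y, s2y = centered_sum(x1, y), centered_sum(x2, y)
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("x1 and x2 are perfectly collinear")
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Hypothetical data generated from y = 1 + 2*x1 + 3*x2
x1 = [1, 2, 3, 4, 5]
x2 = [1, 1, 2, 2, 4]
y = [6, 8, 13, 15, 23]

gross_b1 = gross_slope(x1, y)
net_b1, net_b2 = net_slopes(x1, x2, y)
```

Here the gross slope on x1 is larger than its net slope of 2 because x1 also picks up part of the effect of the correlated x2.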
EXCEL lag tip
Don’t check the Labels box when the regression includes a lagged variable!