Multiple Regression & Issues In Regression Analysis Flashcards
Heteroskedasticity
When the variance of the error terms is not constant across observations. In conditional heteroskedasticity (the problematic form), the error variance is related to the values of the independent variables.
Results in consistent parameter estimates, but biased (up or down) standard errors, and therefore unreliable t-statistics and F-statistics.
How to test for heteroskedasticity
Regress the squared residuals from the estimated regression equation on the independent variables in the regression; Breusch-Pagan chi-square test statistic = n * R^2 (from that auxiliary regression), with degrees of freedom equal to the number of independent variables.
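A minimal sketch of this test in Python with statsmodels and simulated data (the data-generating process and variable names are illustrative assumptions, not part of the card):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(42)
n = 200
X = rng.normal(size=(n, 2))
# Simulated errors whose spread grows with the first regressor
# (conditional heteroskedasticity)
e = rng.normal(size=n) * (1 + np.abs(X[:, 0]))
y = 1.0 + X @ np.array([2.0, -1.0]) + e

X_const = sm.add_constant(X)
fit = sm.OLS(y, X_const).fit()

# Hand-rolled version of the card: regress squared residuals on the
# independent variables; BP = n * R^2 of that auxiliary regression
aux = sm.OLS(fit.resid ** 2, X_const).fit()
print("BP = n * R^2 =", n * aux.rsquared)

# The library helper returns the same LM statistic plus a p-value
lm_stat, lm_pvalue, _, _ = het_breuschpagan(fit.resid, X_const)
print("statsmodels BP:", lm_stat, "p-value:", lm_pvalue)
```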
How to correct for heteroskedasticity
Use robust (White-corrected) standard errors
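A short sketch of the correction, again with statsmodels and illustrative simulated data; "HC0" is the basic White estimator:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
# Heteroskedastic errors: spread widens with |x|
y = 1.0 + 2.0 * x + rng.normal(size=n) * (1 + np.abs(x))

X = sm.add_constant(x)
plain = sm.OLS(y, X).fit()                  # conventional standard errors
robust = sm.OLS(y, X).fit(cov_type="HC0")   # White-corrected standard errors

# Coefficient estimates are identical; only the standard errors
# (and hence t-statistics and p-values) change
print(plain.params, robust.params)
print(plain.bse, robust.bse)
```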
Multicollinearity
When two or more independent variables (or linear combinations of them) are highly correlated with each other. The standard errors of the individual coefficient estimates become inflated, even though the regression equation may fit rather well.
The classic symptom of multicollinearity is a high R^2 (and a significant F-statistic) even though the t-statistics on the estimated slope coefficients are insignificant.
Other symptoms of multicollinearity: the sign on one of the regression coefficients is the opposite of what theory suggests; the absolute correlation between two of the independent variables exceeds 0.7 (a rule of thumb most useful when there are only two independent variables).
To correct, drop one of the correlated variables.
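A sketch of the symptom and the fix on simulated collinear data (names and numbers are illustrative):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)   # nearly a copy of x1
y = 3.0 + 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)

fit = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()

# Classic symptom: high R^2 / significant F, insignificant slope t-stats
print("R^2:", fit.rsquared, "F p-value:", fit.f_pvalue)
print("slope t-stats:", fit.tvalues[1:])
print("corr(x1, x2):", np.corrcoef(x1, x2)[0, 1])  # well above 0.7

# Correction from the card: drop one of the correlated variables
fit_dropped = sm.OLS(y, sm.add_constant(x1)).fit()
print("t-stat after dropping x2:", fit_dropped.tvalues[1])
```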
Durbin-Watson test for serial correlation
For no serial correlation, DW is approximately equal to 2. DW significantly below 2 indicates positive serial correlation; DW significantly above 2 indicates negative serial correlation. (The formal test compares DW with lower and upper critical values, d_l and d_u, with an inconclusive region between them.)
DW ~= 2(1 - r), where r is the correlation between consecutive residuals
Use the Hansen method to adjust the standard errors; it corrects for serial correlation (and for heteroskedasticity as well)
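A sketch with simulated AR(1) errors; statsmodels' HAC (Newey-West) option is used here as a stand-in for the Hansen-style adjustment, and maxlags is an assumed, arbitrary choice:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(2)
n = 200
x = rng.normal(size=n)

# AR(1) errors -> positive serial correlation
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.7 * e[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + e

X = sm.add_constant(x)
fit = sm.OLS(y, X).fit()

# DW ~= 2(1 - r); with r around 0.7 here, expect DW well below 2
print("Durbin-Watson:", durbin_watson(fit.resid))

# Serial-correlation-consistent (HAC) standard errors
hac = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 4})
print("HAC standard errors:", hac.bse)
```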
Assumptions of multiple regression model
A linear relationship exists between the dependent and independent variables.
The independent variables are not random, and there is no exact linear relation between any 2 or more independent variables.
The expected value of the error term, conditional on the independent variables, is zero.
The variance of the error terms is constant for all observations.
The error term for one observation is not correlated with that of another.
The error term is normally distributed.
Coefficient of determination (R^2)
Percentage of variation in Y that is explained by the set of independent variables. R^2 never decreases as independent variables are added, even when they contribute little real explanatory power, so it can overstate fit. The adjusted R^2 penalizes R^2 for the number of independent variables.
Ra^2 = 1 - [ ( (n-1) / (n-k-1) ) * (1 - R^2) ], where n = number of observations and k = number of independent variables
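A one-function sketch of the formula with toy numbers:

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Ra^2 = 1 - [ (n - 1) / (n - k - 1) ] * (1 - R^2)."""
    return 1 - ((n - 1) / (n - k - 1)) * (1 - r2)

# Toy numbers: with n = 60 observations and k = 5 regressors,
# an R^2 of 0.80 adjusts down to about 0.7815
print(adjusted_r2(r2=0.80, n=60, k=5))
```

For a fitted statsmodels regression, the same quantity is available as results.rsquared_adj.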
Common misspecifications of the regression model
Omitting a relevant variable
Failing to transform a variable (e.g., using a raw value when the true relationship involves its log)
Incorrectly pooling data from different samples or regimes
Using a lagged dependent variable as an independent variable (a problem when the errors are serially correlated)
Forecasting the past (using information that would not have been available at the time of the forecast)
Measuring independent variables with error (e.g., using a proxy)