Regression Flashcards
RSS + SSE = ?
What do they stand for?
= Total variation (SST)
RSS: Regression sum of squares
SSE: Sum of squared errors
What does a low p-value indicate about the significance?
Significance is high
What does mean regression sum of squares (MSR) and mean squared error (MSE) equal?
MSR = RSS/k MSE = SSE/(n-k-1)
What is the equation for F-Stat?
F = MSR/MSE
What is the equation for R squared?
R squared = RSS/SST = SST-SSE/SST
= Explained regression / Total variation
Is Ra squared less than or equal to R squared?
Adjusted R squared is less then R squared
What phenomenon occurs When the variance of the residuals are not the same across all observations?
Heteroskedasticity. Unconditional occurs when variance is not related the value of the independent variable. Conditional is when it is related to the independent variable.
How do you test for heteroskedasticity?
Breusch-Pagan test.
What is the formula for the Breusch- Pagan chi-square test?
BP chi-square test = n x Rsquared of residuals.
This is the Rsquared from a second regression of the squared residuals from the first regression on the independent variables.
How do you correct for heteroskedasticity?
Either use White-corrected standard errors or use generalised least squares.
How do you test for serial correlation?
Durbin Watson statistic. If DW < d1 error terms are positively serially correlated. If d1< DW < du the test in inconclusive. If DW > du there is no evidence that error terms are positively correlated.
How do you correct for serial correlation?
Use Hansen method to adjust the coefficient standard errors.
Or improve specification of the model.
What is multicollinearity?
When two or more independent variables in a multiple regression are highly correlated with each other
When the t-test indicates that none of the individual coefficients are significantly different from zero, F-test is statistically significant and the Rsquared is high what regression problem does this potentially indicate?
Multicollinearity
What distribution is a probit model based on?
And what distribution is a logit model based on?
What models are these used for?
Probit - normal distribution
Logit - logistic distribution
Used for qualitative dependent variable models. Results in estimates of the probability that the event occurs.