Reading 12: Multiple Regression Flashcards
Multiple Regression Equation
The equation relates the dependent variable to an intercept and several slope coefficients (betas), one for each independent variable that "predicts" the dependent variable:
Y = b0 + b1 X1 + b2 X2 + ... + bk Xk + e

What type of test needs to be performed in order to test the significance of one of the Betas? And what is the formula?
A t-test is performed to check significance.
H0: beta = 0
If |t| > t-critical: reject H0
t-statistic = (estimated beta - hypothesized beta) / standard error of beta, with df = n - (k+1)

How can the significance of individual components in the multiple regression be tested?
Perform a t-test on each coefficient.
If |t| > t-critical: reject H0.
In the formula, the estimated beta is tested against the null hypothesis that the true beta = 0.
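A minimal sketch of the coefficient t-test; the estimated slope, its standard error, n, and k are all made-up values for illustration:

```python
from scipy import stats

# Hypothetical values: estimated slope, its standard error,
# sample size n, and k independent variables.
b_hat, s_b = 0.48, 0.18
n, k = 60, 3

t_stat = (b_hat - 0.0) / s_b      # test H0: beta = 0
df = n - (k + 1)                  # degrees of freedom = n - (k+1)
t_crit = stats.t.ppf(0.975, df)   # two-tailed 5% critical value

print(t_stat, t_crit)             # reject H0 if |t_stat| > t_crit
```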
How can the significance of the regression analysis as a whole be tested?
An F-test is performed, as follows:
F= MSR / MSE
MSR= RSS / k
MSE = SSE / [n-(k+1)]
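A short worked sketch of the F-test; the sums of squares, n, and k are assumed numbers for illustration:

```python
from scipy import stats

RSS, SSE = 120.0, 80.0            # hypothetical explained / unexplained variation
n, k = 60, 3

MSR = RSS / k                     # mean square regression
MSE = SSE / (n - (k + 1))         # mean square error
F = MSR / MSE

F_crit = stats.f.ppf(0.95, k, n - (k + 1))   # one-tailed 5% critical value
print(F, F_crit)                  # reject H0 (all slopes = 0) if F > F_crit
```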

How is the Sum of Squared Errors (Residuals) calculated?
SSE, with df = n - (k+1)
= sum of squared (actual observation - predicted observation)

How is the Regression Sum of Squares calculated?
RSS is the variation explained by the independent variables, with df = k.
= sum of squared (predicted Y - average Y)
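A toy one-variable sketch (data made up for illustration) showing how SSE, RSS, and total variation fit together; the same decomposition applies with several independent variables:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.0, 5.2, 6.1, 7.3])

b1, b0 = np.polyfit(x, y, 1)            # OLS slope and intercept
y_hat = b0 + b1 * x                     # predicted values

SSE = np.sum((y - y_hat) ** 2)          # unexplained: actual - predicted
RSS = np.sum((y_hat - y.mean()) ** 2)   # explained: predicted - mean of Y
SST = np.sum((y - y.mean()) ** 2)       # total variation = RSS + SSE for OLS

print(SSE, RSS, SST)
```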

F-Test is done how?
F = MSR / MSE, with df = k and n - (k+1)
H0: all slope coefficients equal 0
Reject H0 if F > F-critical (always a one-tailed test)

What is R2?
R2 is a measure of the goodness of fit of the estimated regression to the data.
R2 = Explained Variation / Total Variation
R2 = RSS / SST, where SST = RSS + SSE
What is the R2 - Adjusted?
The adjusted R2 does not automatically increase when another variable is added, because it is adjusted for degrees of freedom:
R2-adjusted = 1 - [(n - 1) / (n - k - 1)] x (1 - R2)
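A quick numeric sketch of both measures, reusing the hypothetical RSS/SSE numbers from the F-test card:

```python
RSS, SSE = 120.0, 80.0                             # assumed for illustration
n, k = 60, 3

SST = RSS + SSE                                    # total variation
R2 = RSS / SST                                     # 0.60
R2_adj = 1 - ((n - 1) / (n - k - 1)) * (1 - R2)    # penalizes extra variables

print(R2, R2_adj)                                  # R2_adj < R2
```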

A. What is Heteroskedasticity?
B. What are the consequences?
C. How can we test for it?
D. How can it be corrected?
A. Heteroskedasticity is a non-constant variance of the errors across the data set.
B. The regression coefficients are not affected; ONLY the standard errors are. In general the standard errors are underestimated -> inflated t-values -> p-values that suggest significance where it is not appropriate.
C. Tested by the Breusch-Pagan test.
D. Corrected by:
- White-corrected standard errors -> adjust the standard errors of the linear regression model.
- Generalized least squares
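A hedged sketch of White-corrected standard errors using statsmodels; the data are simulated so that the error variance grows with x (everything here is an assumption for illustration):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 1.0 + 0.5 * x + rng.normal(0, 0.2 * x)   # error variance grows with x

X = sm.add_constant(x)
ols = sm.OLS(y, X).fit()                     # plain OLS standard errors
white = sm.OLS(y, X).fit(cov_type='HC0')     # White-corrected standard errors

print(ols.bse, white.bse)                    # coefficients match; SEs differ
```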
How and why is the Breusch-Pagan test performed?
Done to test for conditional heteroskedasticity in the regression.
It is a chi-square statistic:
X2 = n x R2, where R2 comes from a regression of the squared residuals on the independent variables
with k degrees of freedom (k = number of independent variables); a one-tailed test
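A minimal sketch of the test, reusing the simulated heteroskedastic data (y, X) from the previous card:

```python
import statsmodels.api as sm
from scipy import stats

resid_sq = sm.OLS(y, X).fit().resid ** 2   # squared residuals of the original fit

# Auxiliary regression of squared residuals on the independent variables;
# the Breusch-Pagan statistic is n * R2 of this regression.
aux = sm.OLS(resid_sq, X).fit()
n, k = len(y), X.shape[1] - 1              # k excludes the constant
bp_stat = n * aux.rsquared

p_value = 1 - stats.chi2.cdf(bp_stat, df=k)   # one-tailed chi-square test
print(bp_stat, p_value)                    # small p-value -> conditional heterosk.
```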
What is Serial Correlation (Autocorrelation)?
This means that the error terms are correlated with one another. It mostly happens in time-series regressions.
Serial Correlation (Autocorrelation) A:
- Consequences?
With positive serial correlation:
- A positive error for one observation increases the chance of a positive error for another observation.
- The sign of the error term tends to persist from one period to the next.
The main consequence of serial correlation is an incorrect standard error of the regression:
- The F-test may be inflated because the MSE is underestimated.
- T-statistics will be inflated, and p-values suggest significance where it is not appropriate.
Serial Correlation (Autocorrelation) B:
How to test for this?

We test with the Durbin-Watson test.
DW can be estimated by
DW = 2(1 - r), where r is the correlation between consecutive residuals
H0: NO serial correlation
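A small sketch computing the Durbin-Watson statistic from a made-up series of residuals (the values are assumptions for illustration):

```python
import numpy as np

e = np.array([0.5, 0.4, 0.6, 0.3, -0.2, -0.4, -0.3, -0.5, 0.1, 0.2])

DW = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)   # exact Durbin-Watson statistic

r = np.corrcoef(e[:-1], e[1:])[0, 1]            # lag-1 residual correlation
print(DW, 2 * (1 - r))                          # DW is roughly 2(1 - r)
```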

Serial Correlation (Autocorrelation) C:
When do we reject the H0: No serial correlation?
DW > du -> fail to reject the null hypothesis
DW < dl -> reject the null hypothesis
dl <= DW <= du -> inconclusive test
IF WE REJECT: UNDERESTIMATED STANDARD ERRORS

How to correct for Serial Correlation (Autocorrelation)?
Two Ways:
- Adjust the coefficient standard error (Hansen Method)
- Modify the regression equation itself.
Hansen's method corrects for both serial correlation and conditional heteroskedasticity.
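In practice, Hansen-style (Newey-West) adjusted standard errors are available in statsmodels through the 'HAC' covariance option; a hedged sketch reusing the simulated y, X from the heteroskedasticity example (the maxlags choice is an assumption):

```python
import statsmodels.api as sm

# HAC standard errors adjust for both serial correlation and conditional
# heteroskedasticity; the slope estimates themselves are unchanged.
hac = sm.OLS(y, X).fit(cov_type='HAC', cov_kwds={'maxlags': 4})
print(hac.bse)
```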
Multicollinearity (A)
What is it?
Multicollinearity occurs when two or more independent variables are highly correlated.
If the independent variables are perfectly correlated, regression is impossible; this is called perfect collinearity.
Multicollinearity (B)
What is the problem generated by multicollinearity in a regression analysis?
- Its presence does not affect the consistency of the OLS estimates / regression coefficients.
- Due to multicollinearity it is impossible to distinguish the individual impacts of the independent variables on the dependent variable.
- Inflated OLS standard errors for the regression coefficients; hence t-statistics are very small -> no ability to reject H0.
Multicollinearity (C)
How to detect Multicollinearity?
How to correct?
There is no single statistic to measure it; however:
if the F-test is significant and R2 is high while the individual t-tests are insignificant, multicollinearity is likely.
Correct by omitting one of the correlated variables.
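A simulated sketch of the classic symptom, with x2 built to be nearly a copy of x1 (all data are made up for illustration):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.01, size=100)   # nearly identical to x1
y = 1.0 + 0.5 * x1 + 0.5 * x2 + rng.normal(size=100)

X = sm.add_constant(np.column_stack([x1, x2]))
fit = sm.OLS(y, X).fit()

# Symptom: overall F-test significant and R2 high, yet the individual
# t-tests on x1 and x2 are insignificant.
print(fit.f_pvalue, fit.rsquared, fit.pvalues[1:])
```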
Summary of Heteroskedasticity, Serial Correlation, Multicollinearity:
- Heteroskedasticity: non-constant error variance; coefficients unaffected but standard errors wrong; detect with the Breusch-Pagan test; correct with White-corrected standard errors or generalized least squares.
- Serial Correlation: correlated error terms; standard errors underestimated; detect with the Durbin-Watson test; correct with the Hansen method or by modifying the regression.
- Multicollinearity: highly correlated independent variables; standard errors inflated and t-statistics too small; detect via a significant F-test with high R2 but insignificant t-tests; correct by omitting a variable.

What is Model Misspecification?
- One or more important variables could be omitted from the regression
- One or more of the regression variables may need to be transformed before estimating the regression
- The regression model pools data from different samples that should not be pooled
- Independent variables are correlated with the error terms (a violation of the regression assumptions)