Multiple Linear Regression - Testing Flashcards
What type of hypothesis test is used with multiple linear regression and why is this necessary?
Joint hypothesis test imposing two restrictions - necessary because the estimators of Beta1 and Beta2 have a covariance, which separate individual t-tests ignore.
What is the null and alternate hypothesis of a joint test?
H0: Beta1 = 0 and Beta2 = 0
H1: Beta1 ≠ 0 and/or Beta2 ≠ 0
What statistic is used for a joint test and why is it used?
F-statistic - roughly the average of the two squared t-statistics, adjusted for the correlation between the t-statistics
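For q = 2 restrictions, this can be written directly from the two t-statistics and their estimated correlation (a minimal sketch of the standard textbook expression; the function name is illustrative):

```python
# Sketch of the q = 2 F-statistic built from two t-statistics and the
# estimated correlation rho between them (function name is illustrative).

def f_stat_two_restrictions(t1, t2, rho):
    """F = (1/2) * (t1**2 + t2**2 - 2*rho*t1*t2) / (1 - rho**2)."""
    return 0.5 * (t1**2 + t2**2 - 2 * rho * t1 * t2) / (1 - rho**2)

# With uncorrelated t-statistics (rho = 0), F reduces to the plain
# average of the two squared t-stats:
print(f_stat_two_restrictions(2.0, 1.0, 0.0))  # → 2.5
```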
What happens to the F-distribution as q (the number of restrictions) increases?
The distribution shifts to the right
Where is the p-value of an F-test, graphically?
The area under the curve to the right of the F-statistic
When do you reject the joint null hypothesis?
When the p-value < alpha
Interpretation of rejecting the null
Rejecting the null says the regression is statistically useful: at least one of the restrictions fails to hold.
The null that Beta1 = Beta2 = 0 says that none of the regressors (beyond the constant) explains any of the variation in Y; if that null could not be rejected, the entire regression model would be rejected as useless.
Process for homoskedasticity-only F-statistic
- Run the restricted regression (with the restrictions imposed) and compute its sum of squared residuals (SSRrestricted)
- Run the unrestricted regression and compute SSRunrestricted
- SSRunrestricted is always ≤ SSRrestricted; the null is rejected when SSRrestricted exceeds SSRunrestricted by enough
- Plug the two SSRs into F = ((SSRrestricted - SSRunrestricted)/q) / (SSRunrestricted/(n - kunrestricted - 1)), or equivalently use the R squared of each regression in the R-squared version of the formula
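The steps above can be sketched on simulated data (a minimal sketch assuming homoskedastic errors; the data-generating process and the 3.00 critical value for F(2, ∞) at the 5% level are illustrative):

```python
import numpy as np

# Illustrative simulated data: Y depends on X1 but not X2.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.5 * x1 + rng.normal(size=n)

def ssr(y, X):
    """Sum of squared residuals from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

ones = np.ones(n)
ssr_r = ssr(y, ones[:, None])                    # restricted: intercept only
ssr_u = ssr(y, np.column_stack([ones, x1, x2]))  # unrestricted

q, k = 2, 2   # q restrictions (Beta1 = Beta2 = 0), k regressors unrestricted
F = ((ssr_r - ssr_u) / q) / (ssr_u / (n - k - 1))
print(F > 3.00)  # 3.00 ~ 5% critical value of F(2, inf); the null is rejected here
```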
Testing single restrictions involving multiple coefficients
i.e. H0: Beta1 = Beta2 vs. H1: Beta1 ≠ Beta2
Transform the regression by adding and subtracting Beta2 X1i, then factorising:
Beta1 X1i + Beta2 X2i = (Beta1 - Beta2) X1i + Beta2 (X1i + X2i)
This gives a transformed coefficient and a transformed regressor:
1. gamma1 = Beta1 - Beta2
2. Wi = X1i + X2i
Regress Y on X1i and Wi; the single restriction H0: gamma1 = 0 (i.e. Beta1 = Beta2) is then tested with an ordinary t-test on the coefficient of X1i.
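A minimal sketch of the transformed regression on simulated data (the data-generating process and the homoskedasticity-only standard errors are illustrative assumptions): regressing Y on X1 and W = X1 + X2 makes the coefficient on X1 equal Beta1 - Beta2, so H0: Beta1 = Beta2 becomes a single t-test:

```python
import numpy as np

# Illustrative simulated data with Beta1 = Beta2 = 1, so gamma1 = 0.
rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 + 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)

# Transformed regression: Y on X1 and W = X1 + X2.  The coefficient on X1
# is gamma1 = Beta1 - Beta2, so H0: Beta1 = Beta2 is just H0: gamma1 = 0.
w = x1 + x2
X = np.column_stack([np.ones(n), x1, w])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
s2 = resid @ resid / (n - X.shape[1])        # homoskedastic error variance
cov = s2 * np.linalg.inv(X.T @ X)            # homoskedasticity-only covariance
t_gamma1 = beta[1] / np.sqrt(cov[1, 1])      # ordinary t-stat on gamma1
print(t_gamma1)  # small in absolute value, since Beta1 = Beta2 holds here
```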
Confidence sets
Related to confidence intervals - the confidence set is the set of population coefficient values for which the joint null cannot be rejected.
Graphically it is an ellipse centred on the point estimates of the coefficients.
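A minimal sketch of checking whether a candidate pair of coefficient values lies inside the 95% confidence set, using the Wald statistic divided by q against the F(2, ∞) critical value (simulated data and the homoskedasticity-only covariance are illustrative assumptions):

```python
import numpy as np

# Illustrative simulated data for a regression with two slope coefficients.
rng = np.random.default_rng(2)
n = 300
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.8 * x1 + 0.8 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
s2 = resid @ resid / (n - 3)
V = s2 * np.linalg.inv(X.T @ X)          # homoskedasticity-only covariance

def in_confidence_set(b1, b2, crit=3.00):
    """Is (b1, b2) inside the 95% confidence ellipse for (Beta1, Beta2)?"""
    d = beta[1:3] - np.array([b1, b2])
    wald = d @ np.linalg.inv(V[1:3, 1:3]) @ d
    return wald / 2 <= crit              # 3.00 ~ 5% critical value of F(2, inf)

print(in_confidence_set(beta[1], beta[2]))  # the point estimate is always inside
print(in_confidence_set(0.0, 0.0))          # the joint null (0, 0) is far outside here
```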
What is the aim of control variables?
They are there to hold other factors fixed so that we obtain an unbiased estimate of the coefficient of interest - if we include a sufficient set of control variables we remove the omitted variable bias.
Methods on deciding what the variable of interest is (3)
- Policy - may try to alter an X to result in an outcome Y
- Testing economic theory - theory may predict a relationship between an outcome Y and a variable of interest X
- Exploring new phenomena - investigating a plausible link between an outcome variable and a variable of interest X
Two specifications to consider when choosing what control variables to include
- Base specification - regressors which form the key set of control variables - determined by expert knowledge, economic theory or policy discussion of the variables that determine Y
- Alternative specification - regressors that are less obvious as control variables but still need to be checked for whether they change the coefficient of interest.
4 pitfalls when using Rsquared and adjusted Rsquared
- An increase in Rsquared or adjusted Rsquared does not necessarily mean that an added variable is statistically significant.
- A high Rsquared or adjusted Rsquared does not mean that the regressors are a true cause of the dependent variable.
- A high Rsquared or adjusted Rsquared does not mean that there is no omitted variable bias in the coefficient of interest.
- A high Rsquared or adjusted Rsquared does not necessarily mean you have the most appropriate set of regressors.
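The first pitfall can be demonstrated numerically: adding even a pure-noise regressor never lowers Rsquared, while adjusted Rsquared can fall (a minimal sketch on simulated data):

```python
import numpy as np

# Illustrative simulated data: junk is pure noise, unrelated to Y.
rng = np.random.default_rng(3)
n = 100
x1 = rng.normal(size=n)
junk = rng.normal(size=n)
y = 1.0 + 0.5 * x1 + rng.normal(size=n)

def r2_and_adjusted(y, X):
    """R-squared and adjusted R-squared from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ssr = resid @ resid
    tss = ((y - y.mean()) ** 2).sum()
    k = X.shape[1] - 1                   # slope regressors, excluding constant
    r2 = 1 - ssr / tss
    adj = 1 - (ssr / (n - k - 1)) / (tss / (n - 1))
    return r2, adj

ones = np.ones(n)
r2_small, adj_small = r2_and_adjusted(y, np.column_stack([ones, x1]))
r2_big, adj_big = r2_and_adjusted(y, np.column_stack([ones, x1, junk]))
print(r2_big >= r2_small)        # always True: R-squared never falls
print(adj_big, adj_small)        # adjusted R-squared may fall instead
```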