3. MLR - Estimation Flashcards
What are the advantages of a multiple regression over a simple regression?
- Incorporate more explanatory factors into the model
- Explicitly hold fixed other factors that otherwise would be in the error term u
- Allow for more flexible functional forms
What are slope parameters?
Parameters other than the intercept
How do the methods used change for a regression that has a quadratic function?
Mechanically, there is no difference in using the method of OLS to estimate the parameters, but there is a difference in the interpretation of the coefficients
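For example, in a quadratic model the effect of x on y depends on the level of x, so β1 alone is no longer the effect of a one-unit change in x:

```latex
y = \beta_0 + \beta_1 x + \beta_2 x^2 + u,
\qquad
\frac{\Delta y}{\Delta x} \approx \beta_1 + 2\beta_2 x
```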
How can a model be more flexible?
A model can contain logs and quadratic terms, allowing for different relationships between the explanatory variables and the dependent variable
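For instance, a constant-elasticity model takes logs of both sides, so that β1 measures the percentage change in y for a 1% change in x:

```latex
\log(y) = \beta_0 + \beta_1 \log(x) + u
```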
What is the partial effect?
The effect on y of changing one explanatory variable while holding the other explanatory variables fixed
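In a model with two regressors, holding x2 fixed (Δx2 = 0) gives the partial effect of x1:

```latex
\Delta \hat{y} = \hat{\beta}_1 \Delta x_1 \quad \text{(holding } x_2 \text{ fixed)}
```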
What do residuals measure?
The difference between the observed value and the fitted value
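Formally, for observation i:

```latex
\hat{u}_i = y_i - \hat{y}_i
```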
What is the Frisch-Waugh theorem?
1) Regress the explanatory variable x1 on all other explanatory variables and obtain the residuals r̂
2) Regress the dependent variable y on the residuals r̂ to obtain β̂1
Why does this procedure work?
- The residuals from the first regression are the part of the explanatory variable that is uncorrelated with the other explanatory variables
- The slope coefficient of the second regression therefore represents the isolated effect of the explanatory variable (see the sketch below)
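A minimal numerical sketch of this partialling-out procedure on simulated data (all variable names and parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000

# Simulated data: x1 is correlated with x2, and both affect y
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

# Full multiple regression of y on a constant, x1 and x2
X = np.column_stack([np.ones(n), x1, x2])
beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)

# Step 1: regress x1 on the other regressors (constant and x2);
# the residuals r_hat are the part of x1 uncorrelated with x2
Z = np.column_stack([np.ones(n), x2])
gamma, *_ = np.linalg.lstsq(Z, x1, rcond=None)
r_hat = x1 - Z @ gamma

# Step 2: regress y on r_hat alone; because r_hat has mean zero,
# no intercept is needed and the slope is (r_hat'y) / (r_hat'r_hat)
beta1_partialled = (r_hat @ y) / (r_hat @ r_hat)

# The two estimates of the coefficient on x1 coincide (up to rounding)
print(beta_full[1], beta1_partialled)
```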
What are the standard assumptions for the multiple regression model?
MLR.1 - Linear in parameters
MLR.2 - Random sampling
MLR.3 - No Perfect collinearity
MLR.4 - Zero conditional mean
What is perfect collinearity?
If an independent variable is an exact linear combination of the other independent variables, the model suffers from perfect collinearity and cannot be estimated by OLS; there is some exact linear relationship among the regressors. For example, including x1, x2 and x3 = x1 + x2, or the same variable measured in two different units, violates this assumption.
What is key to remember about the MLR.3 assumption - no perfect collinearity?
Assumption MLR.3 does allow the independent variables to be correlated; they just cannot be perfectly correlated
Why is the zero conditional mean assumption (MLR.4) more likely to hold in a multiple regression model?
Because more factors are explicitly modelled, so fewer things end up in the error term u
What does over-specifying the model mean?
One (or more) independent variable is included in the model even though it has no partial effect on y in the population
Why do we use OLS estimates?
Because, on average across random samples, they equal the true population values; that is, they are unbiased
How do OLS estimates relate to population parameters?
The OLS estimators are unbiased estimators of the population parameters
How do you interpret unbiasedness?
The estimated coefficients may be smaller or larger than the true values, depending on the particular (random) sample; on average, however, they equal the values that characterise the true relationship in the population
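In symbols, under assumptions MLR.1 through MLR.4:

```latex
E(\hat{\beta}_j) = \beta_j, \qquad j = 0, 1, \ldots, k
```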
What is the issue regarding overspecifying the model?
Including irrelevant variables may increase the sampling variance, reducing the precision of the estimates, but it does not cause bias
Alongside poor specification of our model, when else can MLR.3 (no perfect collinearity) fail?
Assumption MLR.3 also fails if the sample size, n, is too small in relation to the number of parameters being estimated.
What is underspecifying the model and why is it problematic?
Underspecifying the model means excluding a relevant variable; doing so generally causes the OLS estimators to be biased (omitted variable bias)
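In the two-regressor case, if the true model contains x1 and x2 but x2 is omitted, the simple regression slope β̃1 satisfies:

```latex
E(\tilde{\beta}_1) = \beta_1 + \beta_2 \tilde{\delta}_1
```

where δ̃1 is the slope from regressing x2 on x1; the bias β2δ̃1 is zero only if x2 is irrelevant (β2 = 0) or uncorrelated with x1 (δ̃1 = 0).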
Between simple and multiple regression analysis, which one is more likely to have omitted variable bias?
Simple
What are EXOGENOUS explanatory variables?
When assumption MLR.4 (Zero conditional mean) holds
What are ENDOGENOUS explanatory variables?
If xj is correlated with u for any reason, then xj is said to be an endogenous explanatory variable. The term “endogenous explanatory variable” has evolved to cover any case in which an explanatory variable may be correlated with the error term.
What is the difference between MLR.3 and MLR.4?
Assumption MLR.3 rules out certain relationships among the independent or explanatory variables and has nothing to do with the error, u. You will know immediately when carrying out OLS estimation whether or not Assumption MLR.3 holds. On the other hand, Assumption MLR.4—the much more important of the two—restricts the relationship between the unobserved factors in u and the explanatory variables. Unfortunately, we will never know for sure whether the average value of the unobserved factors is unrelated to the explanatory variables.
What notation change do we observe when we know we are underspecifying our model?
We use the symbol "~" rather than "^" to emphasise that β̃1 comes from an underspecified model.