Section 4 Flashcards
In a multiple regression model, how many explanatory variables are there, and how many parameters are to be estimated?
k-1 explanatory variables
k parameters to estimate
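Written out in full (a sketch of the general form these counts refer to, using the same notation as the rest of these cards):
Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \cdots + \beta_k X_{ki} + \varepsilon_i
So there is one intercept (β1) plus k-1 slope coefficients, giving k parameters in total.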
What does βj where j=2…k represent?
Each β (other than β1) represents a PARTIAL slope coefficient
Therefore β2 measures the change in mean Y per unit change in X2 (ceteris paribus)
What is the technique for minimising sum of squares in multiple regression analysis?
Take the partial derivatives dS/dβj for each parameter, set them all equal to 0 and solve the resulting equations simultaneously
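A minimal numerical sketch of the same idea, assuming NumPy is available (the data and coefficient values are made up for illustration):
```python
import numpy as np

# Hypothetical data: n = 50 observations, k = 3 parameters (constant + 2 regressors)
rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.3, size=n)

# Setting every dS/dbeta_j equal to zero gives the normal equations (X'X) b = X'y;
# solving them simultaneously yields the OLS estimates
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # should be close to (1.0, 2.0, -0.5)
```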
Which assumption must be modified to ensure that the multiple regression OLS estimators are still BLUE?
Assumption 4 must be modified: it now must be extended across all explanatory variables in the model, so EACH regressor is uncorrelated with the error
Which assumption must be added to ensure that the multiple regression OLS estimators are still BLUE?
No exact collinearity can exist between any of the variables (therefore no exact linear relationship between any of the regressors; e.g. X3 = 2X2 would be exact collinearity)
Another name for exact collinearity?
Pure collinearity
Note: exact collinearity is very rare
How are the OLS estimators distributed (i.e. what do their sampling distributions look like)?
Normally:
β(hat)j ~ N(βj, σ^2β(hat)j)
How do we know the sampling distributions of the OLS estimators are unbiased?
The means of the sampling distributions equal the true (but unknown) parameter values
What is the use of the R(bar)^2 statistic?
The R^2 statistic is often used to compare models with the same dependent variable to see which model is better at explaining it. However, as more explanatory variables are added to a model, R^2 will naturally increase, which can lead to incorrect conclusions about which model is better. To penalise the use of the extra explanatory variables we use the R(bar)^2 (adjusted R^2) statistic
The R(bar)^2 statistic will only increase if new variables ADD to the analysis
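The usual formula (assuming n observations and k parameters including the intercept, consistent with the notation on these cards):
\bar{R}^2 = 1 - (1 - R^2)\,\frac{n-1}{n-k}
The (n-1)/(n-k) factor is what penalises adding regressors that contribute little.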
What test statistic, with how many degrees of freedom, would be used for a test involving one parameter? (Eg. βj=βj*) how would you do a test of significance?
T statistic, n-k DofF
Sub in βj*=0
See 4.4.1 to learn how to do it
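As a sketch, the standard single-parameter test statistic (presumably the form 4.4.1 sets out) is:
t = \frac{\hat{\beta}_j - \beta_j^*}{se(\hat{\beta}_j)} \sim t_{n-k} \quad \text{under } H_0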
Give two examples of testing a linear restriction?
Testing if Σparameters=1
Or
Testing if β2=β3
How would you go about testing if β2=β3 (LINEAR RESTRICTION) via a hypothesis test? (H0? Distribution of it? Test statistic?)
H0: β2-β3=0
Distribution: β(hat)2 - β(hat)3 ~ N(β2 - β3, σ^2β(hat)2 + σ^2β(hat)3 - 2cov(β(hat)2, β(hat)3))
SEE 4.4.2
(T statistic)
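A sketch of the resulting test statistic (presumably the form given in 4.4.2), using estimated variances and covariance:
t = \frac{(\hat{\beta}_2 - \hat{\beta}_3) - 0}{\sqrt{\widehat{var}(\hat{\beta}_2) + \widehat{var}(\hat{\beta}_3) - 2\,\widehat{cov}(\hat{\beta}_2, \hat{\beta}_3)}} \sim t_{n-k} \quad \text{under } H_0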
What test statistic, with how many degrees of freedom, would be used for the testing of joint restrictions? (Eg. β2=β4=0)
F statistic with q DofF for the numerator and n-k for the denominator, where q = number of restrictions (number of β involved), n = sample size and k = total number of β in the model
See 4.4.3 and learn it!
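A sketch of the usual form of this F statistic, written in terms of the restricted (R) and unrestricted (U) residual sums of squares (presumably what 4.4.3 shows):
F = \frac{(RSS_R - RSS_U)/q}{RSS_U/(n-k)} \sim F_{q,\,n-k} \quad \text{under } H_0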
See the more complex example of tests of joint restrictions in the notes now
Thing to remember when doing an F-test on joint restrictions?
Don’t worry about choosing a one- or two-tailed test: the F-test is one-tailed, so if it says 5% level of significance then use the 5% graph/table
What is the test of overall significance (test of significance of the regression) and what are the H0 and H1 for it?
The test that all k-1 slope parameters are jointly equal to 0
H0: β2=β3=…=βk=0
H1: at least one is not equal to 0
What form has the restricted model for the test of overall significance?
Yi=β1+εi
How do you then do the test of overall significance?
Same as the joint restrictions test with q = k-1 (the significance of the intercept β1 is not tested)
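A minimal sketch of running this test in software, assuming statsmodels is available (the variable names y, x2, x3 and the data are made up for illustration):
```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data set with one dependent variable and two regressors
rng = np.random.default_rng(0)
df = pd.DataFrame({"x2": rng.normal(size=100), "x3": rng.normal(size=100)})
df["y"] = 1.0 + 0.8 * df["x2"] - 0.4 * df["x3"] + rng.normal(size=100)

res = smf.ols("y ~ x2 + x3", data=df).fit()
# Test of overall significance: H0: beta2 = beta3 = 0 (the intercept is not tested)
print(res.fvalue, res.f_pvalue)  # F statistic with q = k-1 = 2 and n-k degrees of freedom
```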
What is the alternative test to the test of overall significance?
Test:
H0: R^2=0
H1: R^2 not equal to 0
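In this version the statistic can be written directly in terms of R^2 (a standard equivalent form, with n observations and k parameters):
F = \frac{R^2/(k-1)}{(1-R^2)/(n-k)} \sim F_{k-1,\,n-k} \quad \text{under } H_0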
How can you test a single restriction hypothesis using F-statistic? (And note)
Use the same method and equation as the normal F statistic test, with q=1
Note: you will find that F-value = (t-value)^2
Define multicollinearity?
The situation where explanatory variables are highly correlated BUT not exactly (perfectly) correlated
What is a consequence of multicollinearity?
The assumption for BLUE is that there should be no exact MC.
Here we are considering ‘close’ but not exact MC, which therefore doesn’t strictly violate the assumptions, BUT there are still consequences for the OLS estimators: although they may still be the ‘best’ (BLUE), that doesn’t necessarily mean they are good (see page 14)
5 signs of multicollinearity?
1) parameter estimates have large variances (large σ^2 for the estimators)
2) may find very small t statistics
3) despite individually insignificant variables in the model, R^2 may still be high (contradiction between the t statistics and R^2)
4) coefficient estimates may have signs contrary to what theory predicts
5) estimators may be highly sensitive (small Δ in data values -> large Δ in OLS estimates)
Best 3 ways to detect multicollinearity?
1) compare R^2 with t stats
2) compare correlation coefficients between explanatory variables
3) run additional (auxiliary) regressions between explanatory variables then look at R^2 (see page 15 and the sketch below)
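A minimal sketch of detection methods 2 and 3, assuming statsmodels is available (the regressor names x2, x3 and the data are made up for illustration):
```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical regressors, deliberately constructed to be close to collinear
rng = np.random.default_rng(1)
x2 = rng.normal(size=100)
x3 = 0.95 * x2 + 0.05 * rng.normal(size=100)
regressors = pd.DataFrame({"x2": x2, "x3": x3})

# Method 2: correlation coefficients between the explanatory variables
print(regressors.corr())

# Method 3: auxiliary regression of one regressor on the others;
# a high R^2 here points to multicollinearity
aux = sm.OLS(regressors["x2"], sm.add_constant(regressors["x3"])).fit()
print(aux.rsquared)
```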
4 ways to deal with multicollinearity?
1) remove problem variable(s)
2) changing/extending sample data
3) changing functional form of model
4) use previous studies to obtain some β values, then use the model to calculate the other β values
Problem with removing variable(s) to deal with MC?
Could lead to a misunderstanding of the model
Problem with using other study’s β values to deal with MC?
Their data may be inaccurate or different from yours
SEE the alternative functional forms in the notes now
What is a log linear and log semi-linear model? What is a special feature of a log linear model?
Log-linear model - both sides are in log form (ln(Y) on the left, βln(X) terms on the right)
Parameters can be interpreted as elasticities!
Semi-log-linear model - only the left side (ln(Y)) is in log form
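Sketches of the two forms, written with a single regressor purely for illustration:
Log-linear: \ln Y_i = \beta_1 + \beta_2 \ln X_{2i} + \varepsilon_i, where \beta_2 = \partial \ln Y / \partial \ln X_2 is the elasticity of Y with respect to X_2
Semi-log-linear: \ln Y_i = \beta_1 + \beta_2 X_{2i} + \varepsilon_i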
Explain what the difference is between doing a joint test of restrictions and doing several individual ones?
Joint test: tests whether β2 and β4 are both equal to 0
Individual tests: test whether β2 is 0 while β4 is left free to be whatever it is estimated to be, and vice versa
Explain how doing a joint test might lead to a different conclusion to doing 2 individual ones?
If two regressors X2 and X4 are highly correlated with each other (multicollinearity), then in a t test on X2 when X4 is already in the model, X2 is likely to appear insignificant since its ADDITIONAL explanatory power is negligible, and vice versa. Therefore both may come out as insignificant in the individual t tests and hence be excluded from the model, even though a joint test would show that together they are significant
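A minimal sketch of this situation, assuming statsmodels is available (the variable names x2, x4 and the data-generating values are made up for illustration):
```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data where x2 and x4 are almost perfectly correlated
# and both genuinely affect y
rng = np.random.default_rng(2)
n = 200
x2 = rng.normal(size=n)
x4 = x2 + 0.05 * rng.normal(size=n)
y = 1.0 + 0.5 * x2 + 0.5 * x4 + rng.normal(size=n)
df = pd.DataFrame({"y": y, "x2": x2, "x4": x4})

res = smf.ols("y ~ x2 + x4", data=df).fit()
print(res.tvalues[["x2", "x4"]])        # individual t tests: each regressor can look insignificant
print(res.f_test("x2 = 0, x4 = 0"))     # joint F test: typically rejects H0 decisively
```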
For a complex example of joint restriction testing, what do you have to remember?
When rearranging the model, the final form of the rearrangement should only contain terms with a β or an error attached; any variables with neither in front must be moved to the left-hand side to create a new variable Z (see the notes example)
Also, some X variables may end up attached to the same β; these must be replaced with a single new variable W