Exam Notes Flashcards
What does ANOVA stand for?
Analysis of Variance
What distribution is used in a t-test, and what are its degrees of freedom?
The t distribution
with n-k-1 degrees of freedom
k = number of regressors
n = number of observations
What does H0:Bj=0 imply?
That the regressor xj has no effect on y once the other regressors are controlled for.
Show the difference for H1 of a one and two-sided t-test.
H1: Bj > 0 or H1: Bj < 0 (one-sided)
H1: Bj ≠ 0 (two-sided)
At what point do you reject the null hypothesis?
When |t| > c for a two-sided test. For a one-sided test, when t > c (H1: Bj > 0) or t < -c (H1: Bj < 0).
What is the critical value used for a two-sided test?
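The (1 - alpha/2) percentile of the t distribution with n-k-1 degrees of freedom, e.g. the 97.5th percentile when alpha = 0.05.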
What do we assume when rejecting nulls?
That the alternative is two-sided, unless a one-sided alternative is stated.
What is the formula for the t-statistic? Suppose you are testing H0: Bj = Aj.
t = (Bj - Aj)/se(Bj), where Bj here is the estimated coefficient.
What is the formula for confidence intervals?
Bj ± c*se(Bj)
Where c is the (1 - alpha/2) percentile of the t distribution with n-k-1 degrees of freedom.
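The t-statistic and confidence interval formulas above can be sketched in a few lines of Python. The numbers are hypothetical and the function names are made up for illustration. The critical value is taken from the standard normal as a large-sample stand-in for the t(n-k-1) percentile, which is an assumption, not what the notes prescribe.

```python
from statistics import NormalDist

def t_stat(b_hat, a_j, se):
    """t statistic for H0: Bj = a_j."""
    return (b_hat - a_j) / se

def conf_int(b_hat, se, c):
    """Interval b_hat +/- c * se for a given critical value c."""
    return (b_hat - c * se, b_hat + c * se)

# Hypothetical estimate and standard error, not taken from the notes.
b_hat, se, alpha = 0.5, 0.2, 0.05

# Assumption: large sample, so the normal percentile approximates
# the (1 - alpha/2) percentile of t(n-k-1).
c = NormalDist().inv_cdf(1 - alpha / 2)

t = t_stat(b_hat, 0.0, se)            # test H0: Bj = 0
low, high = conf_int(b_hat, se, c)    # 95% interval
reject_two_sided = abs(t) > c         # two-sided rejection rule
```

With a small sample the interval would be wider, because the true t percentile is larger than the normal one.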
How is the p-value calculated?
Find the t-stat, then find where it falls in the t distribution. The p-value is the probability of observing a t-stat at least as extreme as this one if the null were true. For a two-sided test, p = P(|T| > |t|).
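A minimal sketch of the p-value calculation, assuming a sample large enough that the standard normal can stand in for the t distribution (an assumption made here so that only the standard library is needed). The function names are hypothetical.

```python
from statistics import NormalDist

def two_sided_p_value(t):
    """P(|T| > |t|) under H0, using the normal approximation."""
    return 2 * (1 - NormalDist().cdf(abs(t)))

def one_sided_p_value(t):
    """P(T > t) under H0, for the alternative Bj > 0."""
    return 1 - NormalDist().cdf(t)

# Hypothetical t statistic.
p_two = two_sided_p_value(2.5)
p_one = one_sided_p_value(2.5)
```

For a positive t-stat the two-sided p-value is twice the one-sided one, so a result can be significant one-sided but not two-sided at the same alpha.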
Show the derivation of SE(B1-B2)
See notes
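The derivation itself is in the notes and is not reproduced here. As a rough sketch of where it ends up, the standard result is Var(B1 - B2) = Var(B1) + Var(B2) - 2*Cov(B1, B2), with the standard error being its square root. The inputs below are hypothetical.

```python
import math

def se_difference(se_b1, se_b2, cov_b1_b2):
    """Standard error of (b1 - b2) from the two standard errors
    and the covariance between the two estimates."""
    var_diff = se_b1 ** 2 + se_b2 ** 2 - 2 * cov_b1_b2
    return math.sqrt(var_diff)

# Hypothetical values: uncorrelated estimates, then positively correlated ones.
se_uncorrelated = se_difference(0.3, 0.4, 0.0)
se_correlated = se_difference(0.3, 0.4, 0.045)
```

The covariance term is why se(B1 - B2) cannot be read off standard regression output, which usually reports only the individual standard errors.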
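How do you test B1 = B2?
Write the null as H0: B1 - B2 = 0
and define theta1 = B1 - B2, so that H0: theta1 = 0.
Then substitute B1 = theta1 + B2 into the regression, which gives y = B0 + theta1*x1 + B2*(x1 + x2) + ... + u, and test theta1's significance with a t-test.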
What is the point of a joint significance test?
It tests whether a group of regressors has any joint effect, i.e. whether there is any warrant for including them in the model. Basically: does SSR increase too much when they are dropped?
What is the formula for the F-test?
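F = ((SSRr - SSRur)/q) / (SSRur/(n-k-1))
Where q is the number of restrictions (regressors excluded under the null).
k is the number of regressors in the unrestricted model, not counting the intercept; the -1 in n-k-1 accounts for the intercept.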
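The SSR form of the F statistic can be sketched as below. The function name and the numbers are hypothetical; k is taken as the number of regressors in the unrestricted model, excluding the intercept.

```python
def f_stat_ssr(ssr_r, ssr_ur, q, n, k):
    """F statistic from restricted and unrestricted SSR.

    q: number of restrictions
    n: number of observations
    k: regressors in the unrestricted model (intercept not counted)
    """
    numerator = (ssr_r - ssr_ur) / q
    denominator = ssr_ur / (n - k - 1)
    return numerator / denominator

# Hypothetical values: dropping 2 regressors raises SSR from 100 to 120.
f = f_stat_ssr(ssr_r=120.0, ssr_ur=100.0, q=2, n=104, k=3)
# (20 / 2) / (100 / 100) = 10.0
```

The result is compared with the critical value of the F distribution with (q, n-k-1) degrees of freedom.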
What if you don't have the SSR, how can you run an F-test?
Use the R2 formula.
F = ((R2ur - R2r)/q) / ((1 - R2ur)/(n-k-1))
What is the formula for overall significance?
F = (R2/k) / ((1 - R2)/(n-k-1))
If only one exclusion restriction is being tested (q = 1), then F = t^2.
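A short sketch of the R2 form of the F statistic, with hypothetical function names and numbers. It also shows that the overall significance test is the special case where every regressor is excluded, so the restricted R2 is 0 and q = k.

```python
def f_stat_r2(r2_ur, r2_r, q, n, k):
    """F statistic from unrestricted and restricted R-squared."""
    numerator = (r2_ur - r2_r) / q
    denominator = (1 - r2_ur) / (n - k - 1)
    return numerator / denominator

def f_overall(r2, n, k):
    """Overall significance: all k regressors excluded under H0,
    so the restricted model has R-squared 0 and q = k."""
    return f_stat_r2(r2, 0.0, k, n, k)

# Hypothetical values.
f_excl = f_stat_r2(r2_ur=0.5, r2_r=0.4, q=2, n=104, k=3)
f_all = f_overall(r2=0.5, n=104, k=3)
```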
What is a:
- Type 1 error
- Type 2 error
Type 1: rejecting the null when it is true. Its probability is the size of the test, i.e. the significance level alpha.
Type 2: failing to reject a false null. The power of the test is 1 - prob(Type 2).
What are the marginal effects and elasticities?
See notes and seminar for interpretation
What does a qualitative variable do?
It describes features of a data set that are not quantifiable.
What can a dummy independent variable do?
Allow the intercept or the slope to differ across groups or periods in the data.
For example:
- oil crisis, financial crisis, drought.
Show a model with a single dummy variable.
y = B0 + sigma0*d + B1*x + u, where d is the dummy and sigma0 is the shift in the intercept when d = 1.
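What formula should you use to interpret a coefficient on a variable when the dependent variable is in logs?
When the coefficient is over 0.2 in absolute value, use 100*(e^coeff - 1) for the exact percentage change; for smaller coefficients 100*coeff is a close approximation.
Make sure to keep the minus sign in the exponent if the coefficient is negative.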
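To see why the exact formula matters for larger coefficients, here is a small sketch comparing it with the 100*coeff approximation. The coefficient values and function names are hypothetical.

```python
import math

def exact_pct_change(coeff):
    """Exact percentage change in y when the dependent variable is log(y)."""
    return 100 * (math.exp(coeff) - 1)

def approx_pct_change(coeff):
    """Approximation that is only close for small coefficients."""
    return 100 * coeff

small = exact_pct_change(0.05)      # close to the approximation of 5
large = exact_pct_change(0.3)       # noticeably above the approximation of 30
negative = exact_pct_change(-0.3)   # sign kept inside the exponent
```

Note the asymmetry: a coefficient of 0.3 is roughly a 35% increase, while -0.3 is roughly a 26% decrease.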
What must you remember if you are making seasonal dummies for a quarterly data set?
You must remember to include only 3, otherwise the dummies are perfectly collinear with the intercept (the dummy variable trap). Let Q1 be represented by the intercept. It is known as the base category; the other three coefficients are compared against it.
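A minimal sketch of building the quarterly dummies with Q1 as the base category. The function and column names are made up for illustration.

```python
def quarter_dummies(quarter, base=1):
    """Return dummies for every quarter except the base category."""
    return {f"Q{q}": int(quarter == q) for q in (1, 2, 3, 4) if q != base}

# One row of dummies per quarter.
rows = [quarter_dummies(q) for q in (1, 2, 3, 4)]
# Q1 observations have all three dummies equal to 0, so they are
# captured by the intercept.
```

The same pattern works for any set of mutually exclusive categories: include one dummy fewer than the number of categories.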
How do you put a dummy for multiple categories?
Imagine everyone is either:
- HS dropout
- HS grad
- College Grad
And you want to compare HS and College grads to HS dropouts.
You would include two dummy variables;
hsgrad = 1 if HS grad only, 0 otherwise
colgrad = 1 if college grad, 0 otherwise.
HS dropouts are the base group (both dummies equal 0), so each coefficient measures the difference relative to HS dropouts.
How can dummy variables be used for an interaction term? Give an example.
Imagine you want to see the impact on some dependent variable of being both married and female.
You would create three dummies:
- Female dummy
- Married Dummy
- Female*Married dummy.
These have coefficients a1, a2 and a3 respectively.
The formal model is as follows:
y = B0 + a1*fem + a2*married + a3*fem*married + B1*x + u
The predicted value of y for each group is then:
Single male:
B0 + B1*x
Single female:
B0 + a1 + B1*x
Married female:
B0 + a1 + a2 + a3 + B1*x
Married male:
B0 + a2 + B1*x
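The four group predictions above can be checked with a small sketch. The coefficient values are hypothetical and were picked only to make the arithmetic easy to follow.

```python
def expected_y(b0, a1, a2, a3, b1, female, married, x):
    """Prediction from y = B0 + a1*fem + a2*married + a3*fem*married + B1*x."""
    return b0 + a1 * female + a2 * married + a3 * female * married + b1 * x

# Hypothetical coefficients and a fixed value of x.
coefs = dict(b0=1.0, a1=-0.5, a2=0.25, a3=-0.125, b1=2.0)
x = 1.0

single_male = expected_y(**coefs, female=0, married=0, x=x)     # 3.0
single_female = expected_y(**coefs, female=1, married=0, x=x)   # 2.5
married_male = expected_y(**coefs, female=0, married=1, x=x)    # 3.25
married_female = expected_y(**coefs, female=1, married=1, x=x)  # 2.625
```

The gap between married females and single males is a1 + a2 + a3, so a3 alone is not the full effect of being both married and female.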
What happens if dummy variables are interacted with continuous variables?
It allows the slope to differ between groups; with the dummy itself also included, the model can differ in both intercept and slope.
What does a coefficient imply in the LPM (linear probability model)?
It is the change in the probability of success (y = 1) for a one-unit increase in that regressor, holding the others fixed.
What are the pros and cons of the LPM?
Pros:
- Easy to estimate and read
- More robust than a probit or logit model.
Cons:
- Predictions may give probabilities below 0 or above 1.
- Assuming the effect is linear may be restrictive.
- Violates the assumption of homoscedasticity,
as Var(y|x) = P(x)*(1 - P(x)),
and P(x) changes with x.
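Two of the cons above can be illustrated with a short sketch: predictions leaving the [0, 1] range, and the variance P(x)*(1 - P(x)) changing with x. The coefficients and function names are hypothetical.

```python
def lpm_prediction(b0, b1, x):
    """Predicted probability of success from a one-regressor LPM."""
    return b0 + b1 * x

def lpm_variance(p):
    """Var(y|x) = P(x) * (1 - P(x)) for a binary outcome."""
    return p * (1 - p)

# Hypothetical intercept and slope.
b0, b1 = 0.25, 0.125

preds = [lpm_prediction(b0, b1, x) for x in (0, 2, 4, 8)]
# 0.25, 0.5, 0.75, 1.25 -> the last "probability" is above 1

# The variance is only meaningful for predictions inside [0, 1].
variances = [lpm_variance(p) for p in preds if 0 <= p <= 1]
# 0.1875, 0.25, 0.1875 -> not constant, so the errors are heteroskedastic
```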