Econometrics Flashcards
Explain what it means for an estimator to be consistent
An estimator of a parameter is consistent if its distribution gets more and more concentrated around the true value of the parameter as the sample size increases. This answer would get almost full points. For maximum points also give the mathematical definition of a consistent estimator: θ̂n is consistent for θ if plim θ̂n = θ, i.e. for every ε > 0, P(|θ̂n − θ| > ε) → 0 as n → ∞.
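A minimal simulation sketch of this idea (assuming NumPy is available; the sample mean is just one convenient example of a consistent estimator):

```python
import numpy as np

# The sample mean is a consistent estimator of the population mean.
# As n grows, its sampling distribution concentrates around the true
# value (here mu = 2.0).
rng = np.random.default_rng(0)
mu = 2.0

for n in (10, 100, 10_000):
    # 1,000 replications of the sample mean at each sample size
    means = rng.normal(loc=mu, scale=3.0, size=(1_000, n)).mean(axis=1)
    print(f"n={n:>6}: mean of estimates={means.mean():.3f}, "
          f"std of estimates={means.std():.3f}")

# The spread of the estimates shrinks toward 0 as n grows, so the
# distribution piles up around mu -- exactly the concentration the
# definition describes.
```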
When is an unbiased estimator consistent?
An unbiased estimator is consistent when its variance goes to 0 as the sample size increases, because in that case the distribution of the estimator gets more and more concentrated around the true value of the parameter as the sample size increases.
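One way to make this argument precise (a standard textbook step, added here for completeness) is Chebyshev's inequality:

```latex
% For an unbiased estimator \hat{\theta}_n of \theta and any \varepsilon > 0:
P\big(|\hat{\theta}_n - \theta| > \varepsilon\big)
  \le \frac{\operatorname{Var}(\hat{\theta}_n)}{\varepsilon^2}
  \longrightarrow 0 \quad \text{as } n \to \infty
  \text{ whenever } \operatorname{Var}(\hat{\theta}_n) \to 0,
% so \hat{\theta}_n converges in probability to \theta, i.e. it is consistent.
```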
In the linear regression model, what are the assumptions required for the OLS estimators of the regression coefficients to be consistent?
See lecture notes, Topic 5. MLR.1–MLR.4 are sufficient for consistency of OLS. A complete answer would also say that MLR.4 can be replaced by MLR.4′, which is weaker than MLR.4: it only requires zero mean and zero correlation between the error and each explanatory variable, rather than a zero conditional mean. Always explain in an answer what these assumptions mean; remember that your answers must be understandable by someone who is not necessarily familiar with our notation.
Carefully explain the meaning of the two entries "F(2,30)=33.12" and "Prob>F=0.0000" in the Stata output above.
F(2,30)=33.12 is the sample realization of the F test statistic for the overall significance of the regression. That is, under the null hypothesis the test statistic is a random variable with an F(2,30) distribution, and its realization in the given sample is 33.12. The null hypothesis is β1=β2=0, against the alternative that β1 and β2 are not both zero. Prob > F gives the p-value, that is, the probability that a random variable with an F(2,30) distribution is larger than 33.12. In this case the p-value is 0.0000 (i.e. 0 up to 4 decimal places), meaning there is a lot of evidence against the null hypothesis.
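A quick way to reproduce that p-value (a sketch assuming SciPy is installed; the numbers are the ones quoted from the Stata output):

```python
from scipy.stats import f

# P-value of the overall F test: the probability that an F(2, 30)
# random variable exceeds the observed statistic 33.12.
f_stat, df1, df2 = 33.12, 2, 30
p_value = f.sf(f_stat, df1, df2)  # survival function = 1 - cdf
print(f"p-value = {p_value:.6f}")  # prints 0.000000, i.e. 0.0000 to 4 d.p.
```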
What is the exact formula used for the log-lin model?
(exp(β̂1) − 1) × 100. This is the exact percentage change in y from a one-unit increase in x; the usual interpretation 100 × β̂1 is only an approximation.
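A small numeric illustration of exact versus approximate effects (the coefficient value 0.20 is made up for the example):

```python
import math

# Log-lin model: effect on y of a one-unit increase in x,
# for a hypothetical estimate beta_hat.
beta_hat = 0.20
approx = 100 * beta_hat                  # approximation: 20.0%
exact = (math.exp(beta_hat) - 1) * 100   # exact formula: ~22.1%
print(f"approximate: {approx:.1f}%, exact: {exact:.1f}%")

# The approximation 100*beta_hat is accurate for small beta_hat but
# drifts away from the exact value as beta_hat grows in magnitude.
```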
Log-lin?
e.g. ln(y) = β0 + β1x. This is a constant semi-elasticity model. Estimate: increasing x by one unit increases y by approximately (100 × β1) percent, ceteris paribus.
Log-log model?
ln(y) = β0 + β1 ln(x). Regression coefficients are interpreted as constant elasticities. Estimate: increasing x by 1% increases y by approximately β1%, ceteris paribus.
Lin-lin model?
The standard linear regression model, e.g. y = β0 + β1x. Increasing x by 1 unit increases y by β1 units, ceteris paribus.
When should the R squared comparison not be used?
When the two models have different dependent variables (e.g. y versus ln(y)), because the total variation being explained is then not the same.
Causal effect of x on y?
How does variable y change if variable x is changed but all other relevant factors are held constant?
What does the zero conditional mean assumption imply, and what does this mean?
The assumption is E(u|x) = 0. It implies
E(y|x) = E(β0 + β1x + u | x) = β0 + β1x + E(u|x) = β0 + β1x.
This means that the conditional mean of the dependent variable can be expressed as a linear function of the explanatory variable.
The assumption is stronger than saying u and x are uncorrelated, but less strong than saying they are independent.
Difference between dependent and independent variable?
Dependent (y axis) is being measured; independent (x axis) is being changed.
SST?
Total sum of squares, represents total variation in the dependent variable (y)
SSE?
Explained sum of squares, represents the variation explained by regression.
SSR?
Residual sum of squares, represents the variation not explained by regression.
Relationship between SSR, SST, SSE? What are they known as?
SST = SSR + SSE, Total variation = explained part + unexplained part. Known as the measures of variation.
R squared measure and its meaning?
R squared = SSE/SST = 1 − SSR/SST. Measures the fraction of the total variation explained by the regression; the higher the R squared, the closer the Yi are to the regression line, so the better the fit. Says nothing about causality though.
What can R squared be shown to equal?
In the simple regression model, the square of the sample correlation between x and y, i.e. [corr(x,y)]².
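A sketch verifying both identities on simulated data (NumPy assumed; simple regression with a single x):

```python
import numpy as np

# Simulate a simple linear regression and check that SST = SSE + SSR
# and that R^2 equals the squared sample correlation of x and y.
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = 1.0 + 2.0 * x + rng.normal(size=200)

# OLS slope and intercept for the simple regression of y on x
b1 = np.cov(x, y, bias=True)[0, 1] / np.var(x)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

sst = np.sum((y - y.mean()) ** 2)      # total variation
sse = np.sum((y_hat - y.mean()) ** 2)  # explained variation
ssr = np.sum((y - y_hat) ** 2)         # unexplained variation

print(np.isclose(sst, sse + ssr))                           # True
print(np.isclose(sse / sst, np.corrcoef(x, y)[0, 1] ** 2))  # True
```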
Assumptions in the SLR model? Explain
SLR.1 (Linear in parameters) SLR.2 (Random sampling) SLR.3 (Sample variation in explanatory variable) SLR.4 (Zero conditional mean) SLR.5 (Homoskedasticity) Look at sheet for equation.
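Since the card points to a separate sheet, here is a compact summary of the standard textbook statements of these assumptions (the notation is assumed, not taken from the sheet):

```latex
\text{SLR.1: } y = \beta_0 + \beta_1 x + u
\qquad
\text{SLR.2: } \{(x_i, y_i) : i = 1, \dots, n\} \text{ is a random sample}
\\
\text{SLR.3: } \sum_{i=1}^{n} (x_i - \bar{x})^2 > 0
\qquad
\text{SLR.4: } E(u \mid x) = 0
\qquad
\text{SLR.5: } \operatorname{Var}(u \mid x) = \sigma^2
```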
Theorem 2.1?
Unbiasedness of OLS: under SLR.1–SLR.4, E(β̂0) = β0 and E(β̂1) = β1.
Theorem 2.2? Explain
Variances of the OLS estimators, under SLR.1–SLR.5. The sampling variability of the estimated regression coefficients is higher the larger the variance of the unobserved factors, and lower the higher the variation of the explanatory variable.
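The formula behind this statement, for the slope in the simple regression model (a standard result under SLR.1–SLR.5):

```latex
\operatorname{Var}(\hat{\beta}_1)
  = \frac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
% The numerator grows with the variance of the unobserved factors;
% the denominator grows with the sample variation in x.
```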
Explain MLR.1 (Linear in parameters).
In the population, the relationship between y and the explanatory variables is linear
Explain MLR.2 (Random sampling)
We have a random sample of size n drawn from the population. MLR.2 implies the data are a representative sample from the population.
Explain MLR.3 (No perfect collinearity)
In the sample (and therefore in the population), none of the explanatory variables is constant and there are no exact linear relationships among the explanatory variables.
Explain MLR.4 (Zero conditional mean)
The value of the explanatory variables must contain no information about the mean of the unobserved factors
Explain MLR.5 (Homoskedasticity)
The value of the explanatory variables must contain no information about the variance of the unobserved factors
Root MSE?
The standard error of the regression
What is a negative of focusing on the R squared value when adding regressors?
R squared can never fall when a regressor is added, so focusing on it might lead to the inclusion of silly regressors.
The differences between SLR and MLR assumptions?
All the same, except that SLR.3 (sample variation in the explanatory variable) is replaced by MLR.3 (no perfect collinearity).
Why does MLR3 differ from SLR3?
With several explanatory variables, it is not enough that each one varies in the sample; exact linear relationships among them must also be ruled out. Hence MLR.3: in the sample (and therefore in the population), none of the explanatory variables is constant and there are no exact linear relationships among the explanatory variables.
Remarks on MLR.3
The assumption only rules out perfect collinearity/correlation between explanatory variables; imperfect correlation is allowed.
If an explanatory variable is a perfect linear combination of other explanatory variables it is superfluous and may be eliminated.
In practice violations of MLR.3 are rare unless a mistake has been made in specifying the model. Stata and other regression packages will indicate a problem.
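A sketch of how perfect collinearity shows up numerically (NumPy assumed; x3 is deliberately built as an exact linear combination of x1 and x2):

```python
import numpy as np

# Perfect collinearity: x3 is an exact linear combination of x1 and
# x2, so the design matrix X loses full column rank and MLR.3 fails.
rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = 2 * x1 - x2  # exact linear relationship

X = np.column_stack([np.ones(n), x1, x2, x3])
print(np.linalg.matrix_rank(X))  # 3, not 4: one column is redundant

# X'X is (numerically) singular, so the OLS formula cannot be applied;
# packages like Stata detect this and drop x3 automatically.
print(np.linalg.cond(X.T @ X))  # astronomically large condition number
```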
Theorem 2.3?
Unbiasedness of the error variance: under SLR.1–SLR.5, E(σ̂²) = σ².
Gauss-Markov assumptions?
MLR.1–MLR.5. Under these, OLS is the best linear unbiased estimator (BLUE).
Why is MLR4 more likely to hold than SLR4?
More independent variables mean it is less likely that something relevant will end up in the error term.
How does the error variance affect the sampling variance of an OLS coefficient estimate?
A high error variance increases the sampling variance because there is more "noise" in the equation.
A large error variance necessarily makes estimates imprecise.
The error variance does not decrease with sample size.
Relevant term: σ² (sigma squared).
How does the sample variation in the explanatory variable affect the sampling variance of an OLS coefficient estimate?
More sample variation leads to more precise estimates.
Total sample variation automatically increases with the sample size.
Increasing the sample size is thus a way to get more precise estimates.
Relevant term: SSTj.
How do linear relationships among the independent variables affect the sampling variance of an OLS coefficient estimate?
The higher R²j (i.e. the better explanatory variable xj can be linearly explained by the other independent variables), the higher the sampling variance.
Relevant term: 1 − R²j.
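The three cards above are exactly the three ingredients of the sampling-variance formula for an OLS coefficient (standard result under MLR.1–MLR.5):

```latex
\operatorname{Var}(\hat{\beta}_j)
  = \frac{\sigma^2}{SST_j \,(1 - R_j^2)},
\qquad
SST_j = \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2,
% where R_j^2 is the R-squared from regressing x_j on all the other
% explanatory variables.
```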
MLR.6?
Normality of error terms. It is assumed that the unobserved factors are normally distributed around the population regression function. The form and the variance of the distribution do not depend on any of the explanatory variables.
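In symbols, the standard statement of MLR.6:

```latex
u \sim \operatorname{Normal}(0, \sigma^2)
\quad \text{independently of } x_1, \dots, x_k
```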