Econometrics - OLS Flashcards
What is the general equation for an ordinary least squares (OLS) regression?
Yi = b0 + b1Xi + ui
What is the total sum of squares equal to?
Explained sum of squares + residual sum of squares (the sum of the squared residuals ui)
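The decomposition can be checked numerically. The sketch below fits Yi = b0 + b1Xi + ui by hand on a small made-up dataset (the numbers are hypothetical, chosen only so the arithmetic is easy to follow) and confirms that the total sum of squares equals the explained plus residual sums of squares.

```python
# Toy data: hypothetical values for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

def fit_simple_ols(xs, ys):
    """Return (b0, b1) that minimise the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1

b0, b1 = fit_simple_ols(xs, ys)
y_bar = sum(ys) / len(ys)
fitted = [b0 + b1 * x for x in xs]

tss = sum((y - y_bar) ** 2 for y in ys)              # total
ess = sum((f - y_bar) ** 2 for f in fitted)          # explained
rss = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # residual

print(b0, b1, tss, ess, rss)
```

On this data the slope is 0.6 and the intercept is 2.2, and tss matches ess + rss up to floating-point rounding.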
What is the interpretation of the slope coefficient in OLS?
The change in Y associated with a one unit increase in X. (The value of Y when all X variables are 0 is the intercept b0, not the slope.)
What is the interpretation of OLS when Y and X are in levels?
A one unit increase in X will lead to a b unit increase in Y
What are the 5 classical assumptions?
- Error has a zero conditional mean (implies no relationship between error and explanatory variables)
- Linear in the parameters (e.g. b1 can’t appear as a power or exponent; X itself may be transformed)
- Error term has constant variance (homoskedasticity)
- Error terms cannot be correlated (serial correlation if correlation of errors is not zero)
- Independence of X and u for all periods
What are 3 practical assumptions?
- Number of observations should be larger than the number of regressors (so that the degrees of freedom are positive)
- X must take different values
- Normality of random error
What is the additional OLS assumption needed for time series data?
Strict exogeneity - the error term is uncorrelated with each explanatory variable in every time period, so that unbiased estimates are yielded
What is the Gauss Markov Theorem?
OLS is BLUE - Best Linear Unbiased Estimator
Best = minimum variance
Linear = the OLS estimator can be written as a linear function of the observed Y values
Unbiased = expected value of the estimated value would be equal to the true underlying value
What is the problem with the Gauss Markov Theorem?
In practice we are often faced with the failure of 3 of the BLUE assumptions
What other property is relied on when Gauss Markov is violated?
Consistency - if, as the sample size grows large, the variance of the estimator becomes smaller AND the value of the estimator approaches the true parameter value, then the estimator is consistent
How is the t-stat for a coefficient estimate calculated?
estimate of coefficient / standard error of estimate of coefficient (*for the null that the coefficient is 0, assuming a t-distribution with n-2 degrees of freedom in a regression with one X)
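As a rough illustration, the sketch below computes the slope's t-stat for a one-regressor model. The standard error formula used, sqrt((RSS/(n-2)) / sum((x - x_bar)^2)), is the usual textbook one under homoskedasticity and is an assumption here, since the card does not spell it out. The data are hypothetical.

```python
import math

# Toy data: hypothetical values for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

def slope_t_stat(xs, ys):
    """Return (b1, se_b1, t) for the slope of a simple regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    rss = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    sigma2_hat = rss / (n - 2)            # error variance estimate, n-2 df
    se_b1 = math.sqrt(sigma2_hat / sxx)   # standard error of the slope
    return b1, se_b1, b1 / se_b1          # t-stat for the null b1 = 0

b1, se_b1, t_stat = slope_t_stat(xs, ys)
print(b1, se_b1, t_stat)
```

With only 5 observations there are 3 degrees of freedom, so the relevant critical value is well above the large-sample 1.96.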
When can you reject the null hypothesis using 5% sig level?
if p-value < 0.05 then reject null
What does R-squared show? How can you interpret its value?
Goodness of fit, if R-squared is large then best fit line “fits” sample data closely.
R-squared can be interpreted as % of variation explained by model e.g. 0.6 = 60% explained
How is R-squared calculated?
Explained sum of squares / total sum of squares
What is adjusted R-squared?
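A version of R-squared that takes into account the number of observations and the number of explanatory variables, so adding a regressor with little explanatory power can lower it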
What is the formula for the adjusted R-squared?
1-(1-R-squared)*((n-1)/(n-p-1))
where n = number of observations and p = number of explanatory variables
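A minimal sketch of the formula, reading n as the number of observations and p as the number of explanatory variables (excluding the intercept). The inputs are made-up numbers.

```python
def adjusted_r_squared(r_squared, n, p):
    """Adjusted R-squared for n observations and p explanatory variables."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1 for positive degrees of freedom")
    return 1 - (1 - r_squared) * ((n - 1) / (n - p - 1))

# Same R-squared, different sample sizes (hypothetical inputs).
small_sample = adjusted_r_squared(0.6, n=5, p=1)
large_sample = adjusted_r_squared(0.6, n=101, p=1)
print(small_sample, large_sample)
```

The penalty shrinks as n grows: with R-squared of 0.6 the adjusted value is about 0.467 for n = 5 but about 0.596 for n = 101.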
How do you interpret a multiple linear regression (multiple OLS)?
If X1 changes by 1 unit, then Y will change by b1 units, holding all other X fixed
What is the general F-test reported by software?
Test on overall significance of all explanatory variables in regression
What test compares a restricted and an unrestricted model to check whether additional variables have explanatory power?
F-test
How do you calculate the F-stat?
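F-stat = [(RSSr - RSSu)/d] / [RSSu/(n-k)]
where r = restricted, u = unrestricted, d = number of restrictions, n = number of observations and k = number of parameters in the unrestricted model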
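The calculation can be sketched as below. This assumes d is the number of restrictions being tested and k is the number of parameters in the unrestricted model (including the intercept), which is the usual reading of this formula. The RSS values are hypothetical.

```python
def f_stat(rss_r, rss_u, d, n, k):
    """F-stat comparing a restricted model (rss_r) with an unrestricted one (rss_u).

    d: number of restrictions; n: observations;
    k: parameters in the unrestricted model (assumed to include the intercept).
    """
    numerator = (rss_r - rss_u) / d   # loss of fit per restriction
    denominator = rss_u / (n - k)     # unrestricted error variance estimate
    return numerator / denominator

# Hypothetical inputs: dropping 2 variables raises RSS from 100 to 120.
f_value = f_stat(rss_r=120.0, rss_u=100.0, d=2, n=50, k=5)
print(f_value)
```

The result is compared with the critical value of an F distribution with (d, n-k) degrees of freedom.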
What is multicollinearity?
Strong linear relationships between explanatory variables: movements in one X are closely matched by movements in other Xs
What are the problems with multicollinearity?
It isn’t possible to estimate the separate effects of the X variables effectively, as standard errors become large.
Variances and confidence intervals will be large, so coefficients can appear statistically insignificant
What is the difference between perfect and imperfect multicollinearity?
Perfect is where 2 Xs are exactly linearly related; it is very rare and usually due to a dataset compilation error. Imperfect is where Xs are linearly related to a high degree, but the relationship is not exact
What are the 2 ways you can detect multicollinearity?
Partial (pairwise) correlation coefficients OR variance inflation factors (VIFs)
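Both diagnostics can be illustrated for the two-regressor case. The sketch assumes the standard definition VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing X_j on the other regressors; the card does not state this. With only two Xs, R_j^2 is the squared pairwise correlation. The data are hypothetical.

```python
import math

# Two hypothetical explanatory variables.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]

def pairwise_correlation(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    sab = sum((u - a_bar) * (v - b_bar) for u, v in zip(a, b))
    saa = sum((u - a_bar) ** 2 for u in a)
    sbb = sum((v - b_bar) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

def vif_two_regressors(a, b):
    """VIF when there are exactly two regressors: 1 / (1 - r^2)."""
    r = pairwise_correlation(a, b)
    return 1.0 / (1.0 - r ** 2)

r12 = pairwise_correlation(x1, x2)
vif = vif_two_regressors(x1, x2)
print(r12, vif)
```

Here the correlation is 0.8 and the VIF is about 2.78. With more than two regressors, each R_j^2 needs its own auxiliary regression.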
What are the solutions to multicollinearity?
More/better data
Re-specify model (add/drop variables)
What is the Chow test used for?
To test whether the coefficients of a model remain constant over the sample, whether there is a structural break
What is the null and alternate hypothesis of the Chow test?
H0: coefficients are constant across sample
H1: at least one coefficient changes
How is the test statistic for the Chow test calculated?
CHOW = [(RSSr - RSS1 - RSS2)/k] / [(RSS1 + RSS2)/(n-2k)]
where r = whole sample, 1 = period 1 subsample, 2 = period 2 subsample, n = total number of observations and k = number of parameters in each regression
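A minimal sketch of the Chow statistic, assuming k counts the parameters estimated in each regression (including the intercept) and n is the size of the whole sample. The RSS values are hypothetical.

```python
def chow_stat(rss_r, rss_1, rss_2, n, k):
    """Chow test statistic for a single structural break.

    rss_r: RSS from the whole-sample (restricted) regression
    rss_1, rss_2: RSS from the two subsample regressions
    n: total observations; k: parameters per regression (assumed to include intercept)
    """
    rss_split = rss_1 + rss_2
    numerator = (rss_r - rss_split) / k
    denominator = rss_split / (n - 2 * k)
    return numerator / denominator

# Hypothetical inputs: splitting the sample lowers RSS from 150 to 120.
chow_value = chow_stat(rss_r=150.0, rss_1=60.0, rss_2=60.0, n=64, k=2)
print(chow_value)
```

The value is compared with an F critical value with (k, n-2k) degrees of freedom; a large value points towards a structural break.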
How do you interpret coefficients in a log-log OLS?
A 1% change in X is associated with a b% change in Y
How do you interpret coefficients in a log-linear OLS?
A 1 unit increase in X is associated with a change of approximately 100b% in Y
How do you interpret coefficients in a linear-log OLS?
A 1% change in X is associated with a change of approximately b/100 units in Y
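The 100b% reading of a log-linear coefficient is an approximation that works well for small b. The sketch below compares it with the exact percentage change, 100*(exp(b) - 1), which follows from ln(Y) = b0 + bX. The exact formula is not on the card, and the coefficient values are hypothetical.

```python
import math

def approx_pct_change(b):
    """Rule-of-thumb % change in Y for a 1 unit increase in X (log-linear model)."""
    return 100 * b

def exact_pct_change(b):
    """Exact % change in Y implied by ln(Y) = b0 + b*X."""
    return 100 * (math.exp(b) - 1)

small_b = (approx_pct_change(0.05), exact_pct_change(0.05))
large_b = (approx_pct_change(0.50), exact_pct_change(0.50))
print(small_b, large_b)
```

For b = 0.05 the two figures are close (5% against roughly 5.13%); for b = 0.5 they diverge (50% against roughly 64.9%).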
How do you calculate the elasticity in a linear OLS?
b * (x/y)
How do you calculate elasticity in a log-log OLS?
b
How do you calculate elasticity in a log-linear OLS?
b * x
How to calculate elasticity in a linear-log OLS?
b*(1/y)
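The four elasticity formulas can be put side by side. In each case elasticity is (dY/dX) * (X/Y), evaluated at a chosen point (x, y). The function names and the evaluation point are hypothetical.

```python
def elasticity_linear(b, x, y):
    """Y = b0 + b*X: dY/dX = b, so elasticity = b * (x / y)."""
    return b * (x / y)

def elasticity_log_log(b):
    """ln(Y) = b0 + b*ln(X): elasticity is b itself."""
    return b

def elasticity_log_linear(b, x):
    """ln(Y) = b0 + b*X: dY/dX = b*Y, so elasticity = b * x."""
    return b * x

def elasticity_linear_log(b, y):
    """Y = b0 + b*ln(X): dY/dX = b/X, so elasticity = b * (1 / y)."""
    return b * (1 / y)

# Evaluate each at the hypothetical point x = 3, y = 12 with b = 2.
results = {
    "linear": elasticity_linear(2.0, 3.0, 12.0),
    "log-log": elasticity_log_log(2.0),
    "log-linear": elasticity_log_linear(2.0, 3.0),
    "linear-log": elasticity_linear_log(2.0, 12.0),
}
print(results)
```

Only the log-log elasticity is constant. The other three depend on where along the regression line they are evaluated, often at the sample means.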
What does the RESET test for? How does it work?
Functional form misspecification. It takes the squared fitted values from the original regression, adds them to the regression as extra regressors, and tests the significance of the additional variables. The null hypothesis is that the coefficient on the additional squared terms = 0 (no misspecification)
What test underpins the RESET test?
F-test
What does the Davidson-MacKinnon test check?
Used to decide whether to use an X variable in levels or logs
What is the critical value for a 1-tailed t-test at 5% sig?
2.015 with 5 degrees of freedom (it falls towards 1.645 as the degrees of freedom become large)
What is the critical value for a 2-tailed t-test at 5% sig when the degrees of freedom exceed 100?
Approximately 1.96
When do you reject the null that the coefficient = 0 (test for significance)?
i.e. null = b is not significant
When the absolute value of the t-stat is greater than the critical value (approximately 1.96 for a 2-tailed test at 5% sig with large degrees of freedom)
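The decision rule can be written as a one-line check. The sketch assumes a 2-tailed test, so the comparison uses the absolute value of the t-stat, and it defaults to the large-sample 5% critical value of 1.96. The t-stats are hypothetical.

```python
def reject_null(t_stat, critical_value=1.96):
    """Reject H0: b = 0 in a 2-tailed test when |t| exceeds the critical value."""
    return abs(t_stat) > critical_value

# Hypothetical t-stats: the sign does not matter in a 2-tailed test.
decisions = [reject_null(2.5), reject_null(-2.5), reject_null(1.2)]
print(decisions)
```

In small samples the critical value should come from the t-distribution with the appropriate degrees of freedom rather than the 1.96 default.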