Week 5 Flashcards
Linear Probability Model (LPM)
P(yi = 1) = pi = E(yi) = xi’β
What are the problems with LPM?
1) Distribution of error terms is not normal (each error can take only two values, so it is discrete)
2) Heteroskedasticity
3) OLS is unbiased and consistent, but inefficient
4) OLS ignores restriction 0<=pi<=1 s.t. fitted values can lay outside [0,1]
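A minimal Python sketch (simulated data, numpy only; illustrative, not from the course) of problems 2) and 4): fitting an LPM by OLS can give fitted probabilities outside [0,1], and the error variance pi(1-pi) varies with xi:

import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])             # intercept + one regressor
p_true = 1 / (1 + np.exp(-(0.5 + 2.0 * x)))      # true P(y=1), logit-style DGP
y = rng.binomial(1, p_true)

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]  # OLS estimate of the LPM
fitted = X @ beta_ols
print("share of fitted values outside [0,1]:", np.mean((fitted < 0) | (fitted > 1)))
# Var(error_i) = p_i(1 - p_i) depends on x_i, hence heteroskedasticity.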
What is the transformation from linear to non-linear model for p?
P(yi = 1) = F(xi’β)
Logit model
εi ~ Logistic(0,1), the standard logistic distribution
F(xi’β) = 1 / (1 + exp(-xi’β))
Probit model
εi ~ N(0,1)
F(xi’β) = Φ(xi’β)
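A small sketch of the two link functions (numpy/scipy; the grid of z values is arbitrary):

import numpy as np
from scipy.stats import norm

def F_logit(z):
    return 1.0 / (1.0 + np.exp(-z))   # logistic CDF: 1 / (1 + exp(-z))

def F_probit(z):
    return norm.cdf(z)                # standard normal CDF Φ(z)

z = np.linspace(-3, 3, 7)
print(F_logit(z))
print(F_probit(z))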
What is the difference between the logit and probit model?
β(logit) ≈ 1.8β(probit)
σ(logit) = π/√3 ≈ 1.8 (the standard logistic has variance π^2/3), σ(probit) = 1
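A quick numeric check of where the 1.8 factor comes from (the index value z below is arbitrary):

import numpy as np
from scipy.stats import norm

print(np.pi / np.sqrt(3))                        # ≈ 1.81, s.d. of the standard logistic
z = 0.7                                          # arbitrary index value
print(norm.cdf(z), 1 / (1 + np.exp(-1.81 * z)))  # comparable, though not identical, probabilities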
How can the parameters of logit/probit models be estimated?
ML: yi~Bernoulli(pi)
f(yi) = 1 - pi, if yi = 0
= pi , if yi = 1
L(β) = f(y1, ..., yn) = Π f(yi) = Π pi^(yi) (1-pi)^(1-yi)
ℓ(β) = log L(β) = Σ [yi log(pi) + (1-yi) log(1-pi)]
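A minimal sketch of ML estimation of a logit model by maximising ℓ(β) numerically (simulated data; scipy's BFGS optimiser is used here as an illustration, not as the course's prescribed method):

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.5, -1.0])
y = rng.binomial(1, 1 / (1 + np.exp(-X @ beta_true)))

def neg_loglik(beta):
    p = 1 / (1 + np.exp(-X @ beta))    # pi = F(xi'beta), logit link
    p = np.clip(p, 1e-12, 1 - 1e-12)   # avoid log(0)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

beta_hat = minimize(neg_loglik, x0=np.zeros(2), method="BFGS").x
print(beta_hat)                        # should be close to beta_true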
What is the ML estimate’s distribution and properties?
β^ ~ N(β, V^)
V^ = (Σ pi^(1-pi^) xi xi’)^(-1) (inverse information matrix; this form applies to the logit model)
Consistent and asymptotically normal
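A sketch of the covariance estimate V^ for a logit fit (simulated regressors; beta_hat below is a stand-in for where the ML estimate would go):

import numpy as np

rng = np.random.default_rng(2)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_hat = np.array([0.5, -1.0])                    # stand-in for the ML estimate

p_hat = 1 / (1 + np.exp(-X @ beta_hat))
info = (X * (p_hat * (1 - p_hat))[:, None]).T @ X   # Σ pi^(1-pi^) xi xi'
V_hat = np.linalg.inv(info)
print(np.sqrt(np.diag(V_hat)))                      # standard errors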
What is identification?
A vector of parameters is identified by a given data set, model, and estimation method if, for that data set, the estimation method provides a unique way to estimate the parameters in the model.
Unless there is collinearity, parameters in the linear model are always identified
Identification issues commonly arise in non-linear models; if parameters are not identified, it is impossible to pin down estimates of their true values -> impose restrictions
What are the 2 identification issues?
1) The variance σ^2 of the latent-variable error term is unidentified: only β/σ can be estimated, so σ is normalized (see the sketch below)
2) The threshold T is unidentified: it cannot be separated from the intercept, so it is normalized to T = 0
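A sketch of issue 1): in the latent-variable model P(yi=1) = Φ(xi’β/σ), scaling β and σ by the same constant leaves all probabilities unchanged, so only β/σ is identified (the values below are arbitrary):

import numpy as np
from scipy.stats import norm

x_beta = np.array([-1.0, 0.0, 1.5])            # arbitrary index values xi'beta
sigma, c = 1.0, 3.0
print(norm.cdf(x_beta / sigma))
print(norm.cdf((c * x_beta) / (c * sigma)))    # identical -> sigma must be normalized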
What is the marginal effect of the logit, probit and LPM model?
LPM: βj
Logit: P(yi=1)P(yi=0)βj
Probit: φ(xi’β)βj (φ is the standard normal density)
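A sketch computing the three marginal effects at a single xi (the β and xi values are illustrative, not estimates):

import numpy as np
from scipy.stats import norm

beta = np.array([0.5, -1.0])   # [intercept, slope]; j = 1 is the slope
x_i = np.array([1.0, 0.3])
z = x_i @ beta

me_lpm = beta[1]                      # LPM: βj
p = 1 / (1 + np.exp(-z))
me_logit = p * (1 - p) * beta[1]      # logit: P(yi=1)P(yi=0)βj
me_probit = norm.pdf(z) * beta[1]     # probit: φ(xi'β)βj
print(me_lpm, me_logit, me_probit)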
Odds Ratio
P(yi=1) / P(yi=0)
Describes the importance of x in determining the probability of outcome 1 relative to the probability of outcome 0
Odds Ratio - Logit
P(yi=1)/P(yi=0) = exp(xi’β)
xi’β = 0 -> two outcomes equally likely (odds = 1)
xi’β > 0 -> outcome 1 more likely
xi’β < 0 -> outcome 0 more likely
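A quick numeric check that the logit odds equal exp(xi’β) (the three index values are arbitrary):

import numpy as np

for xb in (-1.0, 0.0, 1.0):
    p = 1 / (1 + np.exp(-xb))
    print(xb, p / (1 - p), np.exp(xb))   # the two odds columns coincide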
If all variables are demeaned, what does the sign of β denote?
For the average individual xi’β reduces to the intercept, so the sign of the intercept denotes the average preferred choice (which outcome is more likely on average)
Diagnostics of logit/probit models
Residuals: ei = yi - F(xi’β^) = yi - pi^
Standardized residuals: ei = (yi - pi^) / sqrt(pi^(1 - pi^))
McFadden R^2: 1 - ℓ(β^)/ℓ(β0^) => 0 <= R^2 <= 1 (β0^ is the estimate from the model with only an intercept)
larger value => the regressors explain more of the outcome
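A sketch of these diagnostics (simulated data; the logit and intercept-only fits use scipy's numerical optimiser purely as an illustration):

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n = 400
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = rng.binomial(1, 1 / (1 + np.exp(-(X @ np.array([0.2, 1.0])))))

def loglik(beta, Z):
    p = np.clip(1 / (1 + np.exp(-Z @ beta)), 1e-12, 1 - 1e-12)
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

beta_hat = minimize(lambda b: -loglik(b, X), np.zeros(2)).x        # full model
beta_0 = minimize(lambda b: -loglik(b, X[:, :1]), np.zeros(1)).x   # intercept only

p_hat = 1 / (1 + np.exp(-X @ beta_hat))
resid = y - p_hat                                  # ei = yi - pi^
std_resid = resid / np.sqrt(p_hat * (1 - p_hat))   # standardized residuals
mcfadden_r2 = 1 - loglik(beta_hat, X) / loglik(beta_0, X[:, :1])
print("McFadden R^2:", mcfadden_r2)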