11 Binary dependent variables Flashcards
linear probability model
A linear regression model with a binary dependent variable, estimated by OLS; the predicted value is interpreted as Pr(Y = 1 | X).
Disadvantages of the linear probability model
- Predicted probability can be above 1 or below 0!
- Error terms are heteroskedastic
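Both disadvantages can be seen in a small sketch. The data below are made up for illustration, and the names `lpm_predict` and `error_variance` are hypothetical helpers, not part of any library. The model is a simple one-regressor linear probability model fitted with the textbook OLS formulas.

```python
# Hypothetical toy data: x is a single regressor, y is a binary outcome.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0, 0, 0, 1, 0, 1, 1, 1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS slope and intercept for a simple regression of y on x.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar


def lpm_predict(x_value):
    """Fitted value of the linear probability model, read as Pr(Y = 1 | X)."""
    return b0 + b1 * x_value


def error_variance(p):
    """For a binary outcome, Var(u | X) = p * (1 - p), so it varies with X."""
    return p * (1.0 - p)


# The fitted line is unbounded, so "probabilities" can leave [0, 1].
print(lpm_predict(-2.0))   # negative
print(lpm_predict(10.0))   # greater than 1

# The error variance differs across values of x (heteroskedasticity).
print(error_variance(lpm_predict(1.0)), error_variance(lpm_predict(3.5)))
```

Because the fitted line has no floor or ceiling, any nonzero slope eventually produces values outside [0, 1] for extreme enough X.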
nonlinear probability models
Pr(Yi = 1 | X1i, ..., Xki) = G(Zi)
with Zi = β0 + β1X1i + ··· + βkXki
and 0 ≤ G(Zi) ≤ 1
Probit: G(Z) = Φ(Z)
Using the cumulative standard normal distribution function Φ(Z)
Logit: G(Z) = 1 / (1 + e^{-Z})
Using the cumulative standard logistic distribution function
Remember:
F(z) = Pr(Z ≤ z)
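The two link functions can be written in a few lines using only the standard library. This is a minimal sketch: the function names `probit_g`, `logit_g`, and `predicted_probability` are illustrative choices, and the coefficient values in the usage lines are invented.

```python
import math


def probit_g(z):
    """Standard normal CDF, Phi(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logit_g(z):
    """Standard logistic CDF, 1 / (1 + e^{-z}), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predicted_probability(g, coefs, xs):
    """Pr(Y = 1 | X) = G(Z) with Z = coefs[0] + coefs[1]*xs[0] + ..."""
    z = coefs[0] + sum(b * x for b, x in zip(coefs[1:], xs))
    return g(z)


# Hypothetical coefficients (beta0 = -1.0, beta1 = 0.5) and one regressor value.
print(predicted_probability(probit_g, [-1.0, 0.5], [3.0]))
print(predicted_probability(logit_g, [-1.0, 0.5], [3.0]))
```

Both functions are CDFs, so whatever value Z takes, the result stays between 0 and 1.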
the method used to estimate probit and logit models
Maximum Likelihood Estimation (MLE)
The models are nonlinear in the coefficients, so they can’t be estimated by OLS.
likelihood function
The likelihood function is the joint probability distribution of the data, treated as a function of the unknown coefficients.
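For a binary outcome, the likelihood can be written out explicitly. The expression below is a sketch that assumes the observations are independent, with G standing for either the probit or the logit function and Z_i for the linear index.

```latex
% Likelihood for n independent binary observations, with
% Z_i = \beta_0 + \beta_1 X_{1i} + \cdots + \beta_k X_{ki}
\[
  L(\beta_0, \dots, \beta_k)
  = \prod_{i=1}^{n} G(Z_i)^{Y_i} \, \bigl[ 1 - G(Z_i) \bigr]^{1 - Y_i}
\]
% Taking logs turns the product into a sum, which is easier to maximize:
\[
  \ln L(\beta_0, \dots, \beta_k)
  = \sum_{i=1}^{n} \Bigl\{ Y_i \ln G(Z_i) + (1 - Y_i) \ln \bigl[ 1 - G(Z_i) \bigr] \Bigr\}
\]
```

Each observation contributes G(Z_i) when Y_i = 1 and 1 - G(Z_i) when Y_i = 0.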
maximum likelihood estimator (MLE)
The maximum likelihood estimator (MLE) is the set of coefficient values that maximizes the likelihood function.
MLEs are the parameter values “most likely” to have produced the data.
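A small sketch of what "maximizing the likelihood" means in practice, for a logit model with one regressor. The data are made up, the function names are hypothetical, and Newton-Raphson is just one common way to find the maximum. Statistical packages use their own numerical routines.

```python
import math

# Hypothetical toy data: one regressor x and a binary outcome y.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0, 0, 0, 1, 0, 1, 1, 1]


def logistic(z):
    """Standard logistic CDF, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_likelihood(b0, b1):
    """Log-likelihood of the logit model, assuming independent observations."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = logistic(b0 + b1 * xi)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total


def fit_logit(iterations=25):
    """Newton-Raphson search for the coefficients that maximize log_likelihood."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # entries of the negative Hessian
        for xi, yi in zip(x, y):
            p = logistic(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        # Solve the 2x2 system (negative Hessian) * step = gradient.
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


b0_hat, b1_hat = fit_logit()
print(b0_hat, b1_hat)
print(log_likelihood(b0_hat, b1_hat), log_likelihood(0.0, 0.0))
```

Moving either coefficient away from the fitted values lowers the log-likelihood, which is the sense in which the MLEs are the values "most likely" to have produced the data.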
If Yi is binary, then E(Yi | Xi) =
If Yi is binary, then E(Yi | Xi) = Pr(Yi = 1 | Xi)
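The result follows directly from the definition of an expected value for a variable that takes only the values 0 and 1. A short worked derivation:

```latex
\[
  E(Y_i \mid X_i)
  = 1 \cdot \Pr(Y_i = 1 \mid X_i) + 0 \cdot \Pr(Y_i = 0 \mid X_i)
  = \Pr(Y_i = 1 \mid X_i)
\]
```

This is why the predicted value from a regression with a binary dependent variable can be read as a probability.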