Binary responses Flashcards
what does the line B0 +B1X mean?
Linear regressor with a single regressor
Y = B0 +B1X+ u
The conditional mean of Y given X
E[Y|X]
what does B1 mean?
Linear regressor with a single regressor
Y = B0 +B1X+ u
The partial affect of X on E[Y|X] (on conditional mean of Y given X)
^E[Y|X] / ^X
^ = average
What hoes the predicted value ^Y mean?
^Y(estimation of Y)
A predicted conditional mean of Y given X
^E[Y|X] - ^b0 +^b1X
what does the line B0 +B1X mean?
E[Y|X] - b0+b1X
P(Y=1|X)
Probability of Y +1 is conditional on X
Linear probability model (advantages)
- Simple to estimate and to interpret
- inference is the same as for multiple regression (need heteroskeddasticity - robust standard errors)
Linear probability model (disadvantages)
- A LPM says that the change in the predicted probability for a given change in X is the same for all values of X
- Also, LPM predicted probabilities can be < 0 or > 1!
These disadvantages can be addressed by using a nonlinear probability model: probit and logit regression
a nonlinear functional form is used when we want:
- P(Y=1|X) to be increasing in X for b1 >0,
2. 0
Differences between linear and nonlinear functions (FORMULAS)
Linear: P (Y+1|X) = b0 + b1X
Non-linear: P (Y+1|X) = F(b0 + b1X)
F is a cumulative distribution function of a random variable
Probit regression model
P (Y=1|X) = o| ( b0 + b1X)
o| is the cdf of standard normal distribution
b0 + b1X is the “z value” of the probit model
Why use normal cdf? (probit regression)
- two properties are satisfied
- easy to use
- straightforward interpretation of “z value”
- a latent variable model with normal error u
Probit regression model (formula
P (Y=1|X) = o| ( b0 + b1X)
F(x) = [ 1 / 1 + e^ (b0 + b1X) ]
cdf of logistic distribution
- logit and probit use different probability functions, the coeficients (b’s) are different in logit and probit
- in practice, logit and probit are very similar - since empirical results typically don’t hinge on the logit/ probit choice, both tend to be used in practice
P (Y=1|X) = o| ( b0 + b1X)
how can we estimate b0 and b1 ?
- Non linear least square
- Maximum likelihood estimation (MLE)
what is the OLS estimator?
It is the coefficient values that minimize the squared residuals.
min E [ Y - ( b0 + b1X) ]^2
Nonlinear lest square
- no explicit solution exist in general
- Solved numerically using the computer (specialized minimization algorithms)
- In practice, MLE is used more ( a more efficient estimator )
min E [ Y - o| ( b0 + b1X) ]^2