regression with a binary dependent variable Flashcards
when Y is binary what is the linear regression model called and why?
it is called the linear probability model because pr(Y=1|X) = B0+B1X
what is the predicted value for the linear probability model?
the predicted value is a probability
what is B1 equal to in the linear probability model?
B1= the difference in probability that Y =1 associated with a unit difference in x
what is the formula for B1 in the linear probability model?
B1= [Pr(Y=1|X=x+change in x) -Pr(Y=1|X=1)]/change in X
what are the advantages of the linear probability model>
simple to estimate and interpret
inference is the same as for multiple regression ( need heteroskedacity-robust standard errors)7
what are the disadvantages of the linear probability model>
a LPM says that the changes in the predicted probability for a given change in X is the same for all values of X but that doesnt make sense
also LPM predicted probabilities can be <0 or >1
how can the disadvantages of the linear probability model be solved?
the disadvantages can be solved by using a nonlinear probability model such as probit regression or logit regression
what is the probit regression?
the probit regression models the probability that Y=1 using the cumulative standard normal distribution function Φ(z), evaluated at z=B0 +B1X
what is the equation of the probit regression model?
Pr(Y=1|X) =Φ(B0+B1X) where Φ is the cumulative normal distribution and z=B0+B1X
why use the cumulative normal probability distribution?
it provides an S shape which gives us what we need: Pr(Y=1|X) is increasing in X for B1>0 and 0 ≤Pr(Y=1|X)≤1 for all X
it is also easy to use as the probabilies are tabulated in the cumulative normal tables
it also has a relatively straightforward interpretation - B1 is the change in Z value for a unit change in X
what is the equation of the probit regression with multiple regressors?
Pr(Y=1|X1,X2) =Φ(B0+B1X+B2X2) where Φ is the cumulative normal distribution and z=B0+B1X1+B2X2
what is the B1 for probit regression with multiple regressors?
β1 is the effect on the z-score of a unit change in X1, holding constant X2 (when a causal interpretation is justified)
what is the logit regression model?
Logit regression models the probability of Y = 1, given X, as the cumulative standard logistic distribution function
what is the equation of the logit regression model>
Pr(Y=1|X ) = F(β0+β1X)
where F is the cumulative logistic distribution function:
F(β0+β1X) = 1/ (1+e^[-(B0+B1X)
how is the non linear least squares different to the OLS?
the non linear least squares extends the idea of the OLS to models in which the parameters enter nonlinearly