regression with a binary dependent variable Flashcards
when Y is binary what is the linear regression model called and why?
it is called the linear probability model because Pr(Y=1|X) = B0+B1X, i.e. the probability that Y=1 is modelled as a linear function of X
what is the predicted value for the linear probability model?
the predicted value is the predicted probability that Y=1, given X
what is B1 equal to in the linear probability model?
B1 = the difference in the probability that Y=1 associated with a unit difference in X
what is the formula for B1 in the linear probability model?
B1 = [Pr(Y=1|X=x+change in X) - Pr(Y=1|X=x)]/change in X
what are the advantages of the linear probability model?
simple to estimate and interpret
inference is the same as for multiple regression (but heteroskedasticity-robust standard errors are needed)
what are the disadvantages of the linear probability model?
the LPM says that the change in the predicted probability for a given change in X is the same for all values of X, which does not make sense for a probability that has to stay between 0 and 1
also, LPM predicted probabilities can be <0 or >1
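A minimal sketch of the second disadvantage, assuming a small made-up data set (the x and y values and the helper name ols_simple are illustrative, not from the course material). It fits the LPM by OLS and shows that the fitted line can leave the [0, 1] range even inside the sample.

```python
def ols_simple(x, y):
    """Return (b0, b1) from a one-regressor OLS fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Made-up illustrative data: Y is binary, X is a single regressor.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = ols_simple(x, y)

# The LPM prediction is b0 + b1*X, read as Pr(Y=1|X).
for xi in (0, 3.5, 7):
    p_hat = b0 + b1 * xi
    flag = "" if 0.0 <= p_hat <= 1.0 else "  <- outside [0, 1]"
    print(f"X={xi}: predicted probability {p_hat:.3f}{flag}")
```

The slope b1 is the same at every X, so nothing stops the line from crossing 0 or 1.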
how can the disadvantages of the linear probability model be solved?
the disadvantages can be solved by using a nonlinear probability model such as probit regression or logit regression
what is the probit regression?
the probit regression models the probability that Y=1 using the cumulative standard normal distribution function Φ(z), evaluated at z=B0 +B1X
what is the equation of the probit regression model?
Pr(Y=1|X) = Φ(B0+B1X) where Φ is the cumulative standard normal distribution function and z=B0+B1X is the z-value
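A short sketch of how the probit probability can be evaluated, assuming hypothetical coefficient values B0=-2 and B1=0.5 (chosen only for illustration). The standard normal CDF is built from math.erf in the Python standard library.

```python
import math


def std_normal_cdf(z):
    """Cumulative standard normal distribution function, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_prob(x, b0, b1):
    """Pr(Y=1|X=x) under the probit model: Phi(b0 + b1*x)."""
    return std_normal_cdf(b0 + b1 * x)


# Hypothetical coefficients, for illustration only.
b0, b1 = -2.0, 0.5

for x in (0, 2, 4, 6, 8):
    print(f"X={x}: z={b0 + b1 * x:+.1f}, Pr(Y=1|X)={probit_prob(x, b0, b1):.4f}")

# The same unit change in X moves the probability by different amounts,
# depending on where X starts.
change_mid = probit_prob(5, b0, b1) - probit_prob(4, b0, b1)
change_tail = probit_prob(9, b0, b1) - probit_prob(8, b0, b1)
print(f"change from X=4 to 5: {change_mid:.4f}")
print(f"change from X=8 to 9: {change_tail:.4f}")
```

Unlike the LPM, the predicted probability always stays between 0 and 1, and the effect of X on the probability is not constant.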
why use the cumulative normal probability distribution?
it provides an S shape which gives us what we need: Pr(Y=1|X) is increasing in X for B1>0 and 0 ≤Pr(Y=1|X)≤1 for all X
it is also easy to use, as the probabilities are tabulated in the cumulative normal tables
it also has a relatively straightforward interpretation - B1 is the change in the z-value for a unit change in X
what is the equation of the probit regression with multiple regressors?
Pr(Y=1|X1,X2) = Φ(B0+B1X1+B2X2) where Φ is the cumulative standard normal distribution function and z=B0+B1X1+B2X2
what is the B1 for probit regression with multiple regressors?
β1 is the effect on the z-score of a unit change in X1, holding constant X2 (when a causal interpretation is justified)
what is the logit regression model?
Logit regression models the probability of Y = 1, given X, as the cumulative standard logistic distribution function, evaluated at z = β0+β1X
what is the equation of the logit regression model?
Pr(Y=1|X ) = F(β0+β1X)
where F is the cumulative logistic distribution function:
F(β0+β1X) = 1/(1+e^[-(β0+β1X)])
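The logistic formula above can be sketched in a few lines. The coefficient values β0=-2 and β1=0.5 are hypothetical and used only to produce some numbers.

```python
import math


def logistic_cdf(z):
    """Cumulative standard logistic distribution function: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def logit_prob(x, b0, b1):
    """Pr(Y=1|X=x) under the logit model: F(b0 + b1*x)."""
    return logistic_cdf(b0 + b1 * x)


# Hypothetical coefficients, for illustration only.
b0, b1 = -2.0, 0.5

for x in (0, 2, 4, 6, 8):
    print(f"X={x}: Pr(Y=1|X)={logit_prob(x, b0, b1):.4f}")
```

As with probit, the result is an S-shaped curve in X that stays between 0 and 1.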
how is nonlinear least squares different from OLS?
nonlinear least squares extends the idea of OLS to models in which the parameters enter nonlinearly
what is the minimisation problem for nonlinear least squares?
min over (b0,b1) of Σ[Y_i - Φ(b0 + b1X_i)]^2
how do we solve the minimisation problem for nonlinear least squares?
calculus doesn't give us an explicit formula for the solution
it is solved numerically by a computer
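A minimal sketch of what "solved numerically" can look like, assuming made-up data and a deliberately simple compass (pattern) search. This is a stand-in for illustration only; statistical software typically uses more sophisticated optimisers. The names sum_sq_resid and nls_compass_search are hypothetical.

```python
import math


def std_normal_cdf(z):
    """Cumulative standard normal distribution function, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def sum_sq_resid(b0, b1, x, y):
    """Nonlinear least squares objective: sum of [Y_i - Phi(b0 + b1*X_i)]^2."""
    return sum((yi - std_normal_cdf(b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


def nls_compass_search(x, y, start=(0.0, 0.0), step=1.0, tol=1e-6, max_iter=100_000):
    """Try moves of +/- step in b0 and b1; keep any move that lowers the
    objective, and halve the step when no move helps."""
    b0, b1 = start
    best = sum_sq_resid(b0, b1, x, y)
    for _ in range(max_iter):
        if step <= tol:
            break
        improved = False
        for d0, d1 in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            candidate = sum_sq_resid(b0 + d0, b1 + d1, x, y)
            if candidate < best:
                b0, b1, best = b0 + d0, b1 + d1, candidate
                improved = True
        if not improved:
            step /= 2.0
    return b0, b1, best


# Made-up illustrative data.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0, 0, 0, 1, 0, 1, 1, 1]

b0_hat, b1_hat, ssr = nls_compass_search(x, y)
print(f"b0={b0_hat:.3f}, b1={b1_hat:.3f}, sum of squared residuals={ssr:.4f}")
```

The search only ever compares values of the objective, which is why no closed-form solution is needed.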
what is the likelihood function?
the likelihood function is the conditional density of Y1,...,Yn given X1,...,Xn, treated as a function of the unknown parameters B0 and B1
what is the maximum likelihood estimator (MLE) ?
the MLE is the value of (B0,B1) that maximises the likelihood function. it is the value which best describes the full distribution of the data
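A sketch of maximum likelihood for a binary outcome, assuming made-up data and using the logit model because its derivatives are simple (the same idea applies to probit). The maximiser here is a plain Newton-Raphson iteration written out by hand; the function names are hypothetical.

```python
import math


def logistic_cdf(z):
    """Cumulative standard logistic distribution function."""
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(b0, b1, x, y):
    """Log of the likelihood: sum of Y_i*ln(p_i) + (1-Y_i)*ln(1-p_i)."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = logistic_cdf(b0 + b1 * xi)
        total += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total


def logit_mle(x, y, iterations=25):
    """Maximise the logit log likelihood by Newton-Raphson, starting at (0, 0)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = logistic_cdf(b0 + b1 * xi)
            w = p * (1 - p)
            g0 += yi - p            # derivative with respect to b0
            g1 += (yi - p) * xi     # derivative with respect to b1
            h00 += w                # entries of the negative Hessian
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: add (negative Hessian)^(-1) times the gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Made-up illustrative data (not perfectly separated, so the MLE exists).
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0, 0, 0, 1, 0, 1, 1, 1]

b0_hat, b1_hat = logit_mle(x, y)
print(f"b0_MLE={b0_hat:.4f}, b1_MLE={b1_hat:.4f}")
print(f"log likelihood at the MLE: {log_likelihood(b0_hat, b1_hat, x, y):.4f}")
print(f"log likelihood at (0, 0):  {log_likelihood(0.0, 0.0, x, y):.4f}")
```

The log likelihood at the estimates is higher than at the starting values, which is the sense in which the MLE "best describes" the data.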
in large samples, what are the properties of the maximum likelihood estimator?
it is consistent, normally distributed and efficient (has the smallest variance of all consistent estimators)
what are the measures of fit for logit and probit?
1) the fraction correctly predicted = the fraction of observations for which the predicted probability is >50% when Y_i=1, or <50% when Y_i=0
2) the pseudo R^2 measures the improvement in the value of the log likelihood, relative to having no X’s. the pseudo R^2 simplifies to the R^2 in the linear model with normally distributed errors
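Both measures can be sketched as follows, assuming made-up outcomes and hypothetical predicted probabilities (p_hat is not the output of any fitted model here). The pseudo R^2 is computed as 1 minus the ratio of the model log likelihood to the log likelihood with no X's, which is one common way to express the improvement described above.

```python
import math


def fraction_correctly_predicted(y, p_hat):
    """Share of observations with p_hat > 0.5 when Y=1, or p_hat < 0.5 when Y=0.
    (A predicted probability of exactly 0.5 is treated as predicting Y=0.)"""
    correct = sum(1 for yi, pi in zip(y, p_hat) if (pi > 0.5) == (yi == 1))
    return correct / len(y)


def bernoulli_log_likelihood(y, p_hat):
    """Log likelihood of binary outcomes given predicted probabilities."""
    return sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
               for yi, pi in zip(y, p_hat))


def pseudo_r2(y, p_hat):
    """1 - ln(L_model) / ln(L_no_X), where the no-X model predicts the sample mean."""
    y_bar = sum(y) / len(y)
    ll_model = bernoulli_log_likelihood(y, p_hat)
    ll_null = bernoulli_log_likelihood(y, [y_bar] * len(y))
    return 1.0 - ll_model / ll_null


# Made-up outcomes and hypothetical predicted probabilities.
y = [0, 0, 0, 1, 0, 1, 1, 1]
p_hat = [0.05, 0.10, 0.20, 0.40, 0.60, 0.80, 0.90, 0.95]

print(f"fraction correctly predicted: {fraction_correctly_predicted(y, p_hat):.2f}")
print(f"pseudo R^2: {pseudo_r2(y, p_hat):.3f}")
```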
in large samples, what are the features of the probit likelihood with one X?
the estimators B0_MLE and B1_MLE are consistent, normally distributed and asymptotically efficient