WEEK 10 DUMMY DEPENDENT VARIABLE REGRESSION MODELS: LINEAR PROBABILITY, LOGIT AND PROBIT MODELS Flashcards
What properties does the dummy dependent variable model have? D = Beta0 + Beta1*X + random error (5 properties)
i. We can observe whether D is 0 or 1 but we can never observe P (conditional probability given that X=x)
ii. For any X=x value, the random error can take on only 2 different values (-Beta0 - Beta1x and 1 - Beta0 - Beta1x), so it is a binary variable
The random errors are not normally distributed (LR6 violated)
The OLS estimators don't need LR6 and are still valid in large samples
iii. The random error and D are inherently heteroskedastic because
Var(random error | X) = P(1-P) = (Beta0 + Beta1X)(1 - Beta0 - Beta1X)
^ depends on X
so the homoskedasticity assumption LR3 is violated
can use HC (heteroskedasticity-consistent) standard errors or the WLS method
iv. D = {0,1}, E(D|X) = P and 0 <= P <= 1
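Properties ii and iii can be checked with a few lines of arithmetic. The sketch below is only an illustration: the coefficient values are hypothetical (they are not from the lecture) and were chosen so that P stays between 0 and 1 for the x values used.

```python
# Hypothetical LPM coefficients, chosen only for illustration.
beta0, beta1 = 0.1, 0.05

def lpm_error_moments(x):
    """Return P, the two possible error values, and the error's mean and variance."""
    p = beta0 + beta1 * x              # P = E(D | X = x) under the LPM
    e_if_d0 = -beta0 - beta1 * x       # error when D = 0 (happens with probability 1 - P)
    e_if_d1 = 1 - beta0 - beta1 * x    # error when D = 1 (happens with probability P)
    mean = (1 - p) * e_if_d0 + p * e_if_d1
    var = (1 - p) * e_if_d0 ** 2 + p * e_if_d1 ** 2 - mean ** 2
    return p, e_if_d0, e_if_d1, mean, var

for x in (2, 8, 14):
    p, e0, e1, mean, var = lpm_error_moments(x)
    print(f"x={x:>2}  P={p:.2f}  errors=({e0:.2f}, {e1:.2f})  "
          f"mean={mean:.4f}  var={var:.4f}  P(1-P)={p * (1 - p):.4f}")
```

The error mean is zero at every x, but the variance equals P(1-P) and so changes with x, which is the heteroskedasticity described in property iii.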
What are the logit and probit models? Explain the similarity and difference between them.
Logit: Logistic CDF leads to the so called logit model
Probit: Standard normal CDF leads to so called probit model
There is little difference between them; in practice, the choice between the logit and probit models is a matter of convenience
Both models are nonlinear in the parameters and cannot be estimated with OLS.
They are both estimated with the maximum likelihood (ML) method. This determines the estimates of the unknown parameters by maximising the chance that the process described by the model generates the data actually observed.
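A minimal sketch of what ML estimation does for a logit model with one regressor. Everything here is an assumption made for illustration: the eight data points are made up, and Newton-Raphson is just one common way to maximise the log-likelihood (the notes do not say which algorithm the software uses).

```python
import math

# Made-up sample for illustration only: x is the regressor, d the 0/1 outcome.
X_DATA = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
D_DATA = [0, 0, 1, 0, 1, 0, 1, 1]

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b0, b1):
    """ln L = sum of D*ln(P) + (1-D)*ln(1-P), with P = F(b0 + b1*x)."""
    total = 0.0
    for x, d in zip(X_DATA, D_DATA):
        p = logistic_cdf(b0 + b1 * x)
        total += d * math.log(p) + (1 - d) * math.log(1.0 - p)
    return total

def score(b0, b1):
    """First derivatives of ln L with respect to b0 and b1."""
    g0 = g1 = 0.0
    for x, d in zip(X_DATA, D_DATA):
        p = logistic_cdf(b0 + b1 * x)
        g0 += d - p
        g1 += (d - p) * x
    return g0, g1

def fit_logit(iterations=25):
    """Newton-Raphson: repeatedly solve (information matrix) * step = score."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0, g1 = score(b0, b1)
        i00 = i01 = i11 = 0.0
        for x in X_DATA:
            p = logistic_cdf(b0 + b1 * x)
            w = p * (1.0 - p)
            i00 += w
            i01 += w * x
            i11 += w * x * x
        det = i00 * i11 - i01 * i01
        # Apply the inverse of the 2x2 information matrix to the score vector.
        b0 += (i11 * g0 - i01 * g1) / det
        b1 += (i00 * g1 - i01 * g0) / det
    return b0, b1

b0_hat, b1_hat = fit_logit()
print(f"b0_hat={b0_hat:.4f}  b1_hat={b1_hat:.4f}  lnL={log_likelihood(b0_hat, b1_hat):.4f}")
print(f"lnL at (0, 0) for comparison: {log_likelihood(0.0, 0.0):.4f}")
```

At the ML estimates the score is (numerically) zero and the log-likelihood is higher than at any other parameter values, which is the sense in which ML makes the observed data as likely as possible.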
How can the logit model be linearised? How can we turn the logit model into a statistical model? What condition needs to be satisfied so logit can be estimated with ML?
a) By making use of the so-called odds ratio, defined as ratio of probability of success (D=1|X) to the probability of failure (D=0|X) i.e. P/(1-P)
Logarithm of the odds ratio is called logit, hence the name logit model
Logit: ln(P/(1-P)) = Beta0 + Beta1X, which is linear in both X and the Beta0 and Beta1 parameters
b) To make the model become a statistical one, it has to be augmented with an error variable
ln(P/(1-P)) = Beta0 + Beta1X + random error
Since P can't be observed we would have to replace P with D, but then the logit is undefined:
if D = 0: ln(D/(1-D)) = ln(0)
if D = 1: ln(D/(1-D)) = ln(1/0)
c) The logit model therefore cannot be estimated with OLS; it can be estimated with ML, provided the sample size is large
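The two points above can be illustrated in a short sketch: the logit of P is exactly the linear index, and the transformation fails as soon as P is replaced by an observed D of 0 or 1. The parameter values are hypothetical.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Log of the odds ratio, ln(P / (1 - P)); only defined for 0 < P < 1."""
    return math.log(p / (1 - p))

# Hypothetical parameter values, for illustration only.
beta0, beta1 = -2.0, 0.5

for x in (0, 2, 4, 6):
    z = beta0 + beta1 * x
    p = logistic_cdf(z)
    print(f"x={x}  P={p:.4f}  logit(P)={logit(p):.4f}  Beta0+Beta1*x={z:.4f}")

# Replacing P with the observed D breaks the transformation.
for d in (0, 1):
    try:
        logit(d)
    except (ValueError, ZeroDivisionError) as exc:
        print(f"D={d}: logit is undefined ({type(exc).__name__})")
```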
Instead of the usual R^2 statistic and F-test of overall significance, what statistics do we use?
How is the first statistic analogous to R^2 and what is a good rule of thumb for the statistic?
How is the second statistic analogous to the F statistic in an OLS regression?
a) We would use the so-called McFadden R^2 statistic and the LR (likelihood ratio) statistic
b) The McFadden R^2 is analogous to R^2 in the sense that it measures the level of improvement of the model being estimated (unrestricted) over the intercept-only model (restricted)
It lies in the interval [0,1) and the larger the better, but it cannot be interpreted like R^2, i.e. as the proportion of explained variation
Good rule of thumb: a value between 0.2 and 0.4 indicates a good fit
c) The LR (likelihood ratio) statistic serves the same purpose as the F-statistic in an OLS regression. It can be used to test the overall significance of the model. It is asymptotically distributed as a chi-square variable with df equal to the number of slope parameters in the original, i.e. unrestricted, model
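A small sketch of how the two statistics are computed from the maximised log-likelihoods of the unrestricted and intercept-only models. The formulas used are the usual textbook definitions (the notes do not spell them out), and the log-likelihood values and the number of slopes are hypothetical.

```python
def mcfadden_r2(lnl_unrestricted, lnl_restricted):
    """McFadden R^2 = 1 - lnL(unrestricted) / lnL(intercept-only)."""
    return 1.0 - lnl_unrestricted / lnl_restricted

def lr_statistic(lnl_unrestricted, lnl_restricted):
    """LR = 2 * (lnL(unrestricted) - lnL(intercept-only))."""
    return 2.0 * (lnl_unrestricted - lnl_restricted)

# Hypothetical maximised log-likelihoods, for illustration only.
lnl_r = -250.0    # intercept-only (restricted) model
lnl_ur = -230.0   # model with 3 slope parameters (unrestricted)

# 5% critical value of the chi-square distribution with 3 degrees of freedom.
CHI2_CRIT_5PCT_DF3 = 7.815

r2 = mcfadden_r2(lnl_ur, lnl_r)
lr = lr_statistic(lnl_ur, lnl_r)
print(f"McFadden R^2 = {r2:.3f}")
print(f"LR = {lr:.1f}, reject H0 (all slopes are zero) at 5%: {lr > CHI2_CRIT_5PCT_DF3}")
```

With these made-up numbers the model is jointly significant even though the McFadden R^2 is well below the 0.2 to 0.4 range, so the two statistics answer different questions.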
Formulas for logit model
P = F(Z) = 1/(1+e^-Z)
Z = Beta0 + Beta1X1 + … + BetakXk
In addition, slope coefficients are best interpreted by considering marginal effects of the independent variables on the probability of success
Marginal effect on probability of success is:
dP/dXi = f(Z) * Betai
f is the PDF that belongs to the CDF F; for the logit model f(Z) = e^-Z/(1+e^-Z)^2
Logit model question
Example 1: consider the probability of success
Example 2: consider only the marginal effect of gre on the probability of success
gre bar = 587.7, gpa-bar = 3.390, rank-bar = 2.485
If rank is 1:
Z = -3.45 + 0.0023*587.7 + 0.777*3.390 - 0.56*1 = -0.024
P = 1/(1+e^-Z) = 1/(1+e^0.024) = 0.494
Given the sample means of gre and gpa, the probability of admission to the lowest ranked institutions is 0.494
If rank is 4:
Z = -3.45 + 0.0023*587.7 + 0.777*3.390 - 0.56*4 = -1.704
P = 1/(1+e^-Z) = 1/(1+e^1.704) = 0.154
Given the sample means of gre and gpa, the probability of admission to the highest ranked institutions is 0.154
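The Example 1 arithmetic can be reproduced with a short script. The coefficients and sample means are the ones used in the calculation above; the function and variable names are only illustrative.

```python
import math

# Estimated logit coefficients and sample means as used in Example 1.
B0, B_GRE, B_GPA, B_RANK = -3.45, 0.0023, 0.777, -0.56
GRE_BAR, GPA_BAR = 587.7, 3.390

def logit_probability(rank, gre=GRE_BAR, gpa=GPA_BAR):
    """Return the index Z and the estimated probability P = 1 / (1 + e^-Z)."""
    z = B0 + B_GRE * gre + B_GPA * gpa + B_RANK * rank
    p = 1.0 / (1.0 + math.exp(-z))
    return z, p

for rank in (1, 4):
    z, p = logit_probability(rank)
    print(f"rank={rank}: Z={z:.3f}, P={p:.3f}")
# rank=1: Z=-0.024, P=0.494
# rank=4: Z=-1.704, P=0.154
```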
Example 2
Z = -0.024 (rank = 1)
dP/dX1 = f(Z) * Betahat1 = e^-0.024/(1+e^-0.024)^2 * 0.0023 = 0.00057
Given the sample means of gre and gpa, an infinitesimal rise in the gre score is expected to increase the probability of admission to the least prestigious institutions by 0.00057
To examine the estimated marginal effect of gpa on the probability of admission, use the gpa coefficient (0.777) in place of the gre coefficient
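A sketch of the Example 2 calculation, using the same logit coefficients and sample means as the worked example. The finite-difference line is an added sanity check, not part of the lecture method: it compares the formula with the actual change in P when gre rises by one point.

```python
import math

# Estimated logit coefficients and sample means as used in the worked examples.
B0, B_GRE, B_GPA, B_RANK = -3.45, 0.0023, 0.777, -0.56
GRE_BAR, GPA_BAR = 587.7, 3.390

def index(gre, gpa, rank):
    return B0 + B_GRE * gre + B_GPA * gpa + B_RANK * rank

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_pdf(z):
    """f(Z) = e^-Z / (1 + e^-Z)^2, the derivative of the logistic CDF."""
    return math.exp(-z) / (1.0 + math.exp(-z)) ** 2

z = index(GRE_BAR, GPA_BAR, rank=1)
me_gre = logistic_pdf(z) * B_GRE   # marginal effect of gre
me_gpa = logistic_pdf(z) * B_GPA   # marginal effect of gpa: same f(Z), different beta

# Sanity check: change in P when gre rises by one point from its sample mean.
fd_gre = logistic_cdf(index(GRE_BAR + 1.0, GPA_BAR, 1)) - logistic_cdf(z)

print(f"f(Z)={logistic_pdf(z):.4f}")
print(f"dP/dgre={me_gre:.5f}  (one-point change in gre: {fd_gre:.5f})")
print(f"dP/dgpa={me_gpa:.3f}")
```

Because Z is close to zero here, f(Z) is close to its maximum of 0.25, so the marginal effects are roughly a quarter of the corresponding coefficients.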
Why is the probit model more complicated than the logit model?
This is because its CDF, the standard normal CDF, is defined by a definite integral
Based on the probit model, what is the estimated probability of success?
formula seen in lecture
It is the area under the standard normal curve to the left of Z, which can be obtained from the standard normal table
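Instead of a table, the area to the left of Z can be computed directly. The sketch below evaluates the standard normal CDF in two ways: through the error function in Python's math module, and by numerically integrating the PDF. The integration limits and step count are arbitrary choices for illustration.

```python
import math

def normal_pdf(z):
    """Standard normal PDF: e^(-0.5*z^2) / sqrt(2*pi)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal CDF via the error function: 0.5 * (1 + erf(z / sqrt(2)))."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf_by_integration(z, lower=-10.0, n=20000):
    """Trapezoid-rule approximation of the integral of the PDF from `lower` to z."""
    h = (z - lower) / n
    total = 0.5 * (normal_pdf(lower) + normal_pdf(z))
    for i in range(1, n):
        total += normal_pdf(lower + i * h)
    return total * h

for z in (-1.0, 0.0, 1.96):
    print(f"Z={z:5.2f}  erf-based={normal_cdf(z):.4f}  "
          f"integrated={normal_cdf_by_integration(z):.4f}")
```

Both columns agree to four decimals, which is the precision of a typical standard normal table.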
Probit model questions and examples
a) Probability of success
b) marginal effect of gre on probability of success
c) marginal effect of gpa on probability of success
a) Consider the sample means of gre and gpa and the smallest and largest values of rank.
If rank =1
Z = -2.091 + 0.0014*587.7 + 0.464*3.390 - 0.332*1 = -0.027
P = F(-0.027) ≈ F(-0.03) = 0.488 (from the standard normal table)
Given sample means of gre and gpa, probability of admission to the lowest ranked institutions is 0.488
b) if rank = 1, Z = -0.027
dP/dX1 = f(Z) * Betahat1 = e^(-0.5*Z^2)/sqrt(2pi) * Betahat1 = e^(-0.5*(-0.027)^2)/sqrt(2pi) * 0.0014 = 0.00056
Repeat for rank = 4, Z = -1.023:
dP/dX1 = f(Z) * Betahat1 = e^(-0.5*Z^2)/sqrt(2pi) * Betahat1 = e^(-0.5*(-1.023)^2)/sqrt(2pi) * 0.0014 = 0.00033
Given the sample means of gre and gpa, an infinitesimal rise in the gre score is expected to increase the probability of admission to the least and most prestigious institutions by 0.00056 and 0.00033, respectively.
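The probit calculations for rank 1 and rank 4 can be reproduced as follows. The coefficients and sample means are those used in the worked example; the helper names are illustrative. Note that evaluating the CDF at Z = -0.027 exactly gives about 0.489, slightly different from the table value 0.488 obtained after rounding Z to -0.03.

```python
import math

# Estimated probit coefficients and sample means as used in the worked example.
B0, B_GRE, B_GPA, B_RANK = -2.091, 0.0014, 0.464, -0.332
GRE_BAR, GPA_BAR = 587.7, 3.390

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_summary(rank, gre=GRE_BAR, gpa=GPA_BAR):
    """Return Z, P = F(Z) and the marginal effect of gre, f(Z) * Betahat_gre."""
    z = B0 + B_GRE * gre + B_GPA * gpa + B_RANK * rank
    return z, normal_cdf(z), normal_pdf(z) * B_GRE

for rank in (1, 4):
    z, p, me_gre = probit_summary(rank)
    print(f"rank={rank}: Z={z:.3f}, P={p:.3f}, dP/dgre={me_gre:.5f}")
```

For the marginal effect of gpa, the same f(Z) would be multiplied by the gpa coefficient (0.464) instead.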
Why can't the slope coefficients of the linear probability, logit and probit models be compared to each other directly?
Why can't the McFadden R^2 be compared directly across different models?
What's the difference between the LPM and the logit and probit models?
When is the LPM preferred?
a) They measure different things, but
i) the logical signs of the slope coefficients are the same in these models
ii) the LPM, logit and probit coefficients have approximate relationships:
Betahat_logit ≈ 4 Betahat_LPM
Betahat_probit ≈ 2.5 Betahat_LPM
Betahat_logit ≈ 1.6 Betahat_probit
b) The McFadden R^2 and LR statistics of the logit and probit models cannot be compared to each other. These models lead to similar, though not necessarily identical, inferences
c) In the logit and probit models, both the probability of success and the marginal change depend on the values of the independent variables. In the LPM, the probability of success depends on the actual values of the independent variables, but the marginal change doesn't
d) The LPM is preferred when the Pi remain between 0.2 and 0.8
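The approximate logit/probit relationship can be checked against the estimates used in the worked examples in these notes. The second half of the sketch shows one common way to motivate the factors 4 and 2.5, namely the inverse of each PDF at Z = 0; that motivation is an added remark, not something stated in the lecture.

```python
import math

# Slope estimates taken from the logit and probit worked examples in these notes.
logit_coefs = {"gre": 0.0023, "gpa": 0.777, "rank": -0.56}
probit_coefs = {"gre": 0.0014, "gpa": 0.464, "rank": -0.332}

# Betahat_logit / Betahat_probit should be roughly 1.6 for every regressor.
ratios = {name: logit_coefs[name] / probit_coefs[name] for name in logit_coefs}
for name, ratio in ratios.items():
    print(f"{name}: logit/probit = {ratio:.2f}")

# Added remark: at Z = 0 the logit marginal effect is f(0) * Beta = 0.25 * Beta and
# the probit one is about 0.399 * Beta, while the LPM marginal effect is Beta itself.
logistic_pdf_at_0 = math.exp(0.0) / (1.0 + math.exp(0.0)) ** 2
normal_pdf_at_0 = 1.0 / math.sqrt(2.0 * math.pi)
print(f"1/f_logit(0)  = {1.0 / logistic_pdf_at_0:.2f}")
print(f"1/f_probit(0) = {1.0 / normal_pdf_at_0:.2f}")
```

The ratios come out between about 1.6 and 1.7, and the signs agree across the two models, in line with points i and ii.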