BL12 Flashcards
How are Y variables treated when they are dummy variables?
As Bernoulli random variables, therefore:
E(z)=p
V(z)=pq, where q=1-p
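Quick check of the two Bernoulli moments by direct enumeration. This is only a sketch: the function name bernoulli_moments and the value p=0.3 are illustrative assumptions, not from the slides.

```python
def bernoulli_moments(p):
    """Return (mean, variance) of a Bernoulli(p) variable by enumerating z=1 and z=0."""
    q = 1.0 - p
    outcomes = [(1, p), (0, q)]  # (value of z, probability)
    mean = sum(z * prob for z, prob in outcomes)
    var = sum((z - mean) ** 2 * prob for z, prob in outcomes)
    return mean, var

mean, var = bernoulli_moments(0.3)
print(mean, var)  # mean should match p, variance should match p*q
```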
What is the binary response model and what does it tell us?
P(y=1|x)=E(y|x)=β0+β1x1+…+βkxk
Tells us that the probability of success, p(x)=P(y=1|x), is a linear function of the x(j)
What is the response probability? What do we know from this?
P(y=1|x)
Since the probabilities must sum to 1:
P(y=0|x)=1-P(y=1|x)
What is a linear probability model (LPM)?
A multiple linear regression model (MLRM) with a binary dependent variable is called a linear probability model because the response probability is linear in the parameters β(j)
What does β(j) measure in a LPM?
The change in the probability of success when x(j) increases by one unit, ceteris paribus: ΔP(y=1|x)=β(j)Δx(j)
What will y(hat) be if we estimate it in the LPM?
The predicted probability of success
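Sketch of an LPM with one regressor, fitted by closed-form OLS. The data below are hypothetical toy values (not from the slides), and ols_simple is an illustrative helper name.

```python
# Hypothetical toy data: x is a single regressor, y is a binary outcome.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0, 0, 0, 1, 0, 1, 1, 1]

def ols_simple(x, y):
    """Closed-form OLS for y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

b0, b1 = ols_simple(x, y)

# In an LPM each fitted value is read as an estimated P(y=1|x).
y_hat = [b0 + b1 * xi for xi in x]

print(round(b1, 4))                   # estimated change in P(y=1|x) per unit of x
print([round(p, 3) for p in y_hat])   # predicted probabilities of success
```

With these toy numbers the slope is 1/6, so a one-unit increase in x raises the predicted probability of success by about 0.167.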
What assumption does the LPM violate? Why? How does this affect the analysis?
Homoskedasticity because:
When y is a binary variable, its variance conditional on x is V(y|x)=p(x)[1-p(x)], where p(x) is shorthand for the probability of success.
This means there must be heteroskedasticity in a linear probability model, except in the case where the probability does not depend on any of the independent variables.
OLS can still be shown to be unbiased, but the usual t and F statistics are invalid.
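Sketch of why the variance cannot be constant. It assumes a hypothetical response probability p(x)=0.2+0.1x; the coefficients and function names are illustrative only.

```python
def lpm_prob(x, b0=0.2, b1=0.1):
    """Hypothetical LPM response probability p(x) = b0 + b1*x."""
    return b0 + b1 * x

def cond_variance(p):
    """V(y|x) for a binary y: p(x) * [1 - p(x)]."""
    return p * (1.0 - p)

xs = [0, 1, 2, 3, 4, 5]
variances = [cond_variance(lpm_prob(x)) for x in xs]

# The variance changes with x, i.e. heteroskedasticity.
for x, v in zip(xs, variances):
    print(x, round(v, 4))

# Exception: if the slope is zero, p(x) and hence V(y|x) do not depend on x.
flat = [cond_variance(lpm_prob(x, b0=0.3, b1=0.0)) for x in xs]
print(flat)
```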
What are 3 drawbacks of the LPM, and 2 EV points on them?
1) Can lead to predicted probabilities below 0 or greater than 1
2) LPM implies constant marginal effect of each explanatory variable.
3) Contains heteroskedasticity
EV:
1 and 2 aren’t big issues when predicting in the middle range of the data
3 is easily fixed in large enough samples using appropriate methods (e.g. heteroskedasticity-robust standard errors)
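Sketch of drawbacks 1 and 2. The coefficients B0 and B1 are hypothetical and not estimated from real data.

```python
# Hypothetical fitted LPM: y_hat = B0 + B1*x (illustrative coefficients).
B0, B1 = -0.1, 0.15

def predict(x):
    """Fitted LPM value, read as an estimated probability of success."""
    return B0 + B1 * x

xs = list(range(0, 9))
preds = [predict(x) for x in xs]

# Drawback 1: fitted "probabilities" can fall outside [0, 1] at extreme x.
out_of_range = [x for x, p in zip(xs, preds) if p < 0 or p > 1]

# Drawback 2: a one-unit increase in x changes the prediction by B1 everywhere.
marginal_effects = [preds[i + 1] - preds[i] for i in range(len(preds) - 1)]

print(out_of_range)
print([round(m, 4) for m in marginal_effects])
```

Only the smallest and largest x values give out-of-range predictions here, which fits the point that the problem is mild in the middle range of the data.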
Explain the difference between economic and statistical significance?
Statistical significance of a variable x(j) is determined entirely by the size of its t statistic.
Economic/practical significance of a variable is related to the size AND sign of the coefficient estimate β(j)(hat).
For large samples, using a smaller significance level means…
…economic and statistical significance will be more likely to coincide (but this is not guaranteed)
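Sketch of the point above. The coefficient and standard errors are hypothetical; 1.96 and 2.576 are the two-sided standard normal critical values for the 5% and 1% levels.

```python
def t_stat(beta_hat, se):
    """t statistic for H0: beta(j) = 0."""
    return beta_hat / se

# Hypothetical: the same economically small coefficient, with a standard error
# that shrinks as the sample gets larger.
beta_hat = 0.002
cases = [("small sample", 0.002), ("large sample", 0.0009)]

results = {}
for label, se in cases:
    t = t_stat(beta_hat, se)
    results[label] = (abs(t) > 1.96, abs(t) > 2.576)  # (sig. at 5%, sig. at 1%)
    print(label, round(t, 2), results[label])
```

In the large-sample case the small coefficient is significant at the 5% level but not at the stricter 1% level, so the smaller significance level brings the statistical verdict closer to the economic one.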
In large samples, what can a large standard error indicate?
Multicollinearity
See slides 27-30 on statistical and economic significance