Lecture 7 (MLE and limited dependent variables) Flashcards
What are the advantages and disadvantages of using an LPM to study a binary dependent variable?
- Advantages
    - It is possible to interpret $\beta$ directly (a one-unit increase in $x$ changes the probability that $y=1$ by $\beta$, i.e., by $100\beta$ percentage points)
    - The IV approach is easier with the LPM than with nonlinear alternatives such as probit or logit
- Disadvantages
    - We might predict probabilities outside the unit interval, since the LPM is not bounded (we are modelling something inherently nonlinear with a linear model). This follows from the marginal effect being constant over the whole range of $x$.
    - The variance of the error term also depends on $x$, i.e., the LPM is inherently heteroskedastic (from Palme's lecture); see the derivation below.
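A minimal derivation of the heteroskedasticity point (standard for the LPM, not taken from the lecture slides): given $x$, $y$ is Bernoulli with success probability $p(x) = x'\beta$, so

$$\operatorname{Var}(y \mid x) = \operatorname{Var}(u \mid x) = p(x)\bigl(1 - p(x)\bigr) = x'\beta\,(1 - x'\beta),$$

which varies with $x$ whenever $\beta \neq 0$.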
Set up the latent index model and derive $\Pr(y=1)$
Use
$$y^* = x'\beta + u,$$
where $y^*$ is a latent variable and we observe $y = 1[y^* > 0]$.
See Notion and the problem set; a sketch of the derivation is given below.
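A minimal sketch, assuming the latent error $u$ is independent of $x$ with a CDF $G$ that is symmetric around zero (as for the normal and logistic distributions):

$$\Pr(y=1 \mid x) = \Pr(y^* > 0 \mid x) = \Pr(u > -x'\beta) = 1 - G(-x'\beta) = G(x'\beta).$$

Taking $G = \Phi$ (the standard normal CDF) gives the probit model; taking $G$ to be the logistic CDF gives the logit model.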
What is maximum likelihood?
In OLS we choose the parameters that minimize the sum of squared residuals (SSR). In ML, given assumptions about the probability distribution that generated the observed data, we choose the parameter values that maximize the probability of the observed outcomes.
What are the four steps in ML-estimation?
There are four main steps in maximum-likelihood (ML) estimation:
- Derive the joint PDF of the sample (for a binary outcome, using the Bernoulli distribution)
- Obtain the likelihood function
- Take the log of the likelihood function to obtain the log-likelihood function
- Take the partial derivative of the log-likelihood function w.r.t. the parameter of interest (e.g., $\beta$) and obtain the ML estimator from the first-order condition (FOC); see the sketch after this list
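In symbols, for an i.i.d. sample with conditional density $f(y_i \mid x_i; \beta)$ (a generic textbook formulation, not specific to the binary case):

$$L(\beta) = \prod_{i=1}^{N} f(y_i \mid x_i; \beta), \qquad \ell(\beta) = \log L(\beta) = \sum_{i=1}^{N} \log f(y_i \mid x_i; \beta), \qquad \left.\frac{\partial \ell(\beta)}{\partial \beta}\right|_{\beta = \hat\beta} = 0.$$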
Do all four MLE steps for a binary latent variable $y$.
Let's say we have derived $G(x'\beta)$.
See Notion; a sketch of the four steps follows.
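A sketch under the setup above, writing $G_i = G(x_i'\beta)$ and $g = G'$ for the corresponding density:

1. Joint PDF of the sample (Bernoulli): $\prod_{i=1}^{N} G_i^{y_i} (1 - G_i)^{1 - y_i}$
2. Likelihood function (the same expression, read as a function of $\beta$): $L(\beta) = \prod_{i=1}^{N} G_i^{y_i} (1 - G_i)^{1 - y_i}$
3. Log-likelihood: $\ell(\beta) = \sum_{i=1}^{N} \bigl[ y_i \log G_i + (1 - y_i) \log (1 - G_i) \bigr]$
4. FOC: $\displaystyle \sum_{i=1}^{N} \frac{(y_i - G_i)\, g(x_i'\beta)}{G_i (1 - G_i)}\, x_i = 0$, which has no closed-form solution and is solved numerically.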
When is ML the most efficient estimator?
- ML is the most efficient estimator if the distributional assumptions are correct
- But efficiency comes at the price of robustness, because we make parametric assumptions
- If some part of the model is misspecified, e.g., the normality assumption does not hold, the ML estimator is inconsistent
With probit, when will the density for $y \mid x$ be incorrect?
With probit, the density for $y \mid x$ will be incorrect if
- the latent error $u$ is not normally distributed,
- or the latent error is not independent of $x$,
- or the latent variable model is not linear.
How do we interpret the marginal effect of probit or logit?
See Notion; a standard formulation is sketched below.
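A standard expression (general $G$, covering both probit and logit): since $\Pr(y=1 \mid x) = G(x'\beta)$, the marginal effect of a continuous regressor $x_j$ is

$$\frac{\partial \Pr(y=1 \mid x)}{\partial x_j} = g(x'\beta)\, \beta_j,$$

where $g = G'$ is the density ($\phi$ for probit, the logistic density for logit). Unlike in the LPM, the marginal effect depends on $x$, so it is typically evaluated at the sample mean of $x$ or averaged over the sample (the average marginal effect). Note that the sign of the marginal effect is the sign of $\beta_j$.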
Explain and formulate an M-estimator
In statistics, M-estimators are a broad class of extremum estimators for which the objective function is a sample average. OLS, non-linear least squares, and maximum likelihood estimation are all special cases of M-estimators. The statistical procedure of evaluating an M-estimator on a data set is called M-estimation. GMM is a subclass of M-estimators.
See Notion for the formulation; a sketch is given below.
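A minimal sketch of the standard formulation, assuming i.i.d. data $w_i = (y_i, x_i)$ and an objective function $q$ (written here as a minimization; sign conventions vary):

$$\hat\beta = \arg\min_{\beta \in B} \frac{1}{N} \sum_{i=1}^{N} q(w_i, \beta).$$

For OLS, $q(w_i, \beta) = (y_i - x_i'\beta)^2$; for ML, $q(w_i, \beta) = -\log f(y_i \mid x_i; \beta)$.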
What can we say regarding consistency and the M-estimator?
Under the assumptions:
1. Identification: $\beta_0$ is the unique solution to the population minimization problem
2. A (uniform) law of large numbers applies
3. $q(w, \beta)$ is a continuous function of $\beta$
4. The parameter space is compact (closed and bounded)

If 1 and 2 hold, $\hat\beta$ is consistent (assumptions 3 and 4 are used to obtain the uniform convergence in 2).
How can we learn about the asymptotics of an M-estimator?
We can also show that M-estimators are asymptotically normally distributed.
We do this using the score function.
The procedure is essentially the same as for OLS and yields analogous results; remember that OLS is itself a type of M-estimator. A sketch of the result follows.
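A sketch of the standard result, assuming the usual regularity conditions and writing $s(w, \beta) = \partial q(w, \beta) / \partial \beta$ for the score:

$$\sqrt{N}\,(\hat\beta - \beta_0) \xrightarrow{d} N\!\bigl(0,\; A_0^{-1} B_0 A_0^{-1}\bigr), \qquad A_0 = \mathrm{E}\bigl[\nabla_\beta s(w, \beta_0)\bigr], \quad B_0 = \mathrm{E}\bigl[s(w, \beta_0)\, s(w, \beta_0)'\bigr].$$

The sandwich form $A_0^{-1} B_0 A_0^{-1}$ mirrors the heteroskedasticity-robust variance for OLS; under a correctly specified likelihood, the information matrix equality gives $A_0 = B_0$, and the variance simplifies to $A_0^{-1}$.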