Tutorial 6 - Binary Dependent Variables Flashcards
What are the different approaches to estimating a model with a binary dependent variable?
- linear probability model
- logit
- probit
- complementary log-log
What is a Linear Probability model?
- can be estimated by OLS (only the error variance differs)
- Var(εi | xi) = xi'β (1 − xi'β)
- -> the error term is heteroscedastic, so the standard errors have to be adjusted
- fitted values ŷ can take any value and are not restricted to the interval [0, 1] (not what we want for a probability)
- the linear probability model is still often used because it is easy to interpret and keeps the flexibility of the linear model
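The points above can be illustrated with a minimal sketch. It fits a linear probability model by OLS on made-up data with a single regressor and evaluates the fitted line outside the observed range. The data and the helper names `ols_fit`, `lpm_predict` and `lpm_error_variance` are illustrative assumptions, not part of the tutorial.

```python
# Hypothetical toy data: x is a single regressor, y is the binary outcome.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0, 0, 0, 1, 0, 1, 1, 1]


def ols_fit(x_values, y_values):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x_values, y_values))
    sxx = sum((xi - mean_x) ** 2 for xi in x_values)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = ols_fit(x, y)


def lpm_predict(x_value):
    """Fitted 'probability' of the linear probability model (not bounded)."""
    return intercept + slope * x_value


def lpm_error_variance(x_value):
    """Error variance implied by the model: p * (1 - p), which depends on x."""
    p = lpm_predict(x_value)
    return p * (1.0 - p)


for x_value in (-2.0, 3.5, 10.0):
    print(x_value, lpm_predict(x_value), lpm_error_variance(x_value))
```

In this toy example the fitted value is negative at x = -2 and larger than one at x = 10, and the implied variance ŷ(1 − ŷ) changes with x, which is the heteroscedasticity mentioned above.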
What are latent variables?
Latent variables are variables that are not directly observed but are inferred from other variables through a mathematical model.
What does the latent variable model look like?
- y*: latent variable, a continuous and unobserved variable that drives the dependent variable
- y: binary outcome variable
- y = 1 if y* > 0
- y = 0 if y* ≤ 0
- e.g. y* is the preferred working time in hours when y indicates working full time, or y* is the ability to cover a credit in euros
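The threshold rule above can be sketched in a few lines. The latent values are simulated from a normal distribution with mean 0.5 and standard deviation 1. This choice, the seed and the name `observed_outcome` are assumptions made only for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible


def observed_outcome(latent_value):
    """Map the latent variable y* to the observed binary outcome y."""
    return 1 if latent_value > 0 else 0


# Hypothetical latent values y*, unobserved in real data.
latent_values = [random.gauss(0.5, 1.0) for _ in range(1000)]

# Only the binary outcomes y would be available to the researcher.
outcomes = [observed_outcome(value) for value in latent_values]
share_of_ones = sum(outcomes) / len(outcomes)
print(share_of_ones)
```

The share of ones estimates P(y* > 0), which is the quantity the binary models below describe as a function of the covariates.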
What is the probability that the dependent variable y equals one under the latent variable model?
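A short derivation, assuming the latent variable is linear in the covariates, y* = x'β + ε, and that ε has distribution function F with a density symmetric around zero. Both are assumptions that are not stated on the card. They hold for probit and logit, but the symmetry step does not hold for the complementary log-log model.

```latex
\begin{aligned}
P(y_i = 1 \mid x_i) &= P(y_i^* > 0 \mid x_i) \\
                    &= P(x_i'\beta + \epsilon_i > 0 \mid x_i) \\
                    &= P(\epsilon_i > -x_i'\beta \mid x_i) \\
                    &= 1 - F(-x_i'\beta) \\
                    &= F(x_i'\beta) \qquad \text{(by symmetry of the density)}
\end{aligned}
```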
Which question does F(x) answer?
F(x) answers the question of which share of the distribution (described by the density f(t)) is smaller than or equal to the value x, i.e. F(x) is the integral of f(t) from −∞ to x.
What is the distribution function for probit?
P(y = 1 | x) = Φ(x'β), where Φ(x) is the distribution function of the standard normal distribution, i.e. the integral of (1/√(2π)) exp(−t²/2) from −∞ to x -> probit
What is the distribution function for logit?
P(y = 1 | x) = Λ(x'β), where Λ(x) = exp(x) / (1 + exp(x)) is the standard logistic distribution function -> logit
What is the complementary log-log model?
- A third alternative to logistic regression and probit analysis for binary response variables.
- Frequently used when the probability of an event is very small or very large, i.e. it has advantages when average probabilities are close to zero or one.
- Unlike logit and probit, the complementary log-log function is asymmetrical.
What is the distribution function for the complementary log-log model?
extreme value distribution function: F(x) = 1 − exp(−exp(x)), so that P(y = 1 | x) = 1 − exp(−exp(x'β))
What type of estimator do linear probability, logit, probit and complementary log-log have?
- linear probability: OLS
- logit, probit and complementary log-log: maximum likelihood
What are the distribution functions for linear probability, logit, probit and complementary log-log?
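A minimal sketch of the four functions F in P(y = 1 | x) = F(x'β), written with the standard library only. For the linear probability model F is the identity, for logit the standard logistic function, for probit the standard normal distribution function and for complementary log-log the extreme value distribution function. The function names and the helper `is_symmetric` are illustrative assumptions.

```python
import math


def linear_probability(index):
    """Linear probability model: the index itself, not bounded to [0, 1]."""
    return index


def logit_cdf(index):
    """Standard logistic distribution function."""
    return 1.0 / (1.0 + math.exp(-index))


def probit_cdf(index):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))


def cloglog_cdf(index):
    """Extreme value distribution function of the complementary log-log model."""
    return 1.0 - math.exp(-math.exp(index))


def is_symmetric(cdf, index=1.5, tol=1e-12):
    """Check F(x) + F(-x) = 1 at one point, which holds for symmetric densities."""
    return abs(cdf(index) + cdf(-index) - 1.0) < tol


for index in (-2.0, 0.0, 2.0):
    print(index, linear_probability(index), logit_cdf(index),
          probit_cdf(index), cloglog_cdf(index))

for name, cdf in (("logit", logit_cdf), ("probit", probit_cdf),
                  ("cloglog", cloglog_cdf)):
    print(name, "symmetric:", is_symmetric(cdf))
```

Logit and probit give 0.5 at an index of zero and pass the symmetry check, while the complementary log-log function gives 1 − exp(−1) at zero and fails it.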
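What are the two ways of interpreting probabilistic models?
- average marginal effect (marginal effect computed for every observation, then averaged)
- marginal effect evaluated at the average (marginal effect computed once, at the mean of the covariates)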
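The difference between the two can be seen in a small logit sketch. For a logit model the marginal effect of a continuous regressor is the logistic density at the index times the coefficient. The coefficients and data are made up for illustration, and all names are assumptions.

```python
import math


def logistic_cdf(index):
    return 1.0 / (1.0 + math.exp(-index))


def logistic_pdf(index):
    """Density of the standard logistic distribution: F(x) * (1 - F(x))."""
    p = logistic_cdf(index)
    return p * (1.0 - p)


# Hypothetical logit coefficients (intercept, slope) and regressor values.
beta0, beta1 = -1.0, 0.8
x_values = [-2.0, -1.0, 0.0, 1.0, 2.0, 6.0]


def marginal_effect(x_value):
    """dP(y = 1 | x) / dx for the logit model at a given x."""
    return logistic_pdf(beta0 + beta1 * x_value) * beta1


# Average marginal effect: average the effect over all observations.
average_marginal_effect = (
    sum(marginal_effect(x) for x in x_values) / len(x_values)
)

# Marginal effect at the average: evaluate the effect once, at the mean of x.
mean_x = sum(x_values) / len(x_values)
marginal_effect_at_average = marginal_effect(mean_x)

print(average_marginal_effect, marginal_effect_at_average)
```

Because the density is nonlinear in the index, the two numbers generally differ. In this toy example the average marginal effect is smaller than the marginal effect at the average.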
Which tests for goodness of fit can you use for models with binary dependent variables?
- Pearson’s test
- Hosmer-Lemeshow test
(both use the same test statistic, but form the groups differently)
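How can you apply Pearson’s test for goodness of fit?
- form M groups according to the covariates:
- nj: number of observations in group j
- Yj: number of observations in group j with outcome one
- p̂j: predicted probability of outcome one in group j
- Pearson residual of group j: rj = (Yj − nj p̂j) / √(nj p̂j (1 − p̂j))
- the sum of squared Pearson residuals (group residuals) is approximately χ² distributed with M − K degrees of freedom (K: number of estimated parameters)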
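The statistic can be computed as in the following sketch, assuming each group is given as a tuple (nj, Yj, p̂j). The group data, the number of parameters and the function name `pearson_statistic` are made up for illustration.

```python
import math


def pearson_statistic(groups):
    """Sum of squared Pearson residuals over groups of (n_j, y_j, p_hat_j)."""
    total = 0.0
    for n_j, y_j, p_hat_j in groups:
        expected_ones = n_j * p_hat_j
        variance = n_j * p_hat_j * (1.0 - p_hat_j)
        residual = (y_j - expected_ones) / math.sqrt(variance)
        total += residual ** 2
    return total


# Hypothetical groups: (observations, observed ones, predicted probability).
groups = [(20, 4, 0.25), (30, 15, 0.5), (25, 18, 0.8), (25, 12, 0.4)]
number_of_parameters = 2  # assumed K of the fitted model

statistic = pearson_statistic(groups)
degrees_of_freedom = len(groups) - number_of_parameters  # M - K

# 5% critical value of the chi-squared distribution with 2 degrees of freedom.
critical_value = 5.991
print(statistic, degrees_of_freedom, statistic > critical_value)
```

A statistic above the critical value would speak against the fitted model. In this toy example it stays below it, so the fit would not be rejected at the 5% level.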