Tutorial 6 - Binary Dependent Variables Flashcards

1
Q

What are different approaches to estimate binary dependent variables?

A
  • linear probability model
  • logit
  • probit
  • complementary log-log
2
Q

What is a Linear Probability model?

A
  • can be estimated by OLS (only the error variance differs from the classical linear model)
  • Var(ε_i | x_i) = x_i'β (1 − x_i'β)
  • → the error term is heteroscedastic, so the standard errors must be adjusted (e.g. heteroscedasticity-robust standard errors)
  • fitted values ŷ can take any value and are not restricted to the [0, 1] interval (not what we want for a probability)
  • the linear probability model is still often used because of its easy interpretation and the flexibility of the linear model
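
A minimal sketch of the two drawbacks listed above, written in Python for illustration. The toy data are hypothetical, and the single-regressor OLS formula stands in for a full regression routine.

```python
# Hypothetical toy data: one regressor x and a binary outcome y.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 0, 1, 1, 1, 1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form OLS for a single regressor plus intercept.
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
intercept = y_bar - slope * x_bar

# Fitted "probabilities" of the linear probability model.
fitted = [intercept + slope * xi for xi in x]


def lpm_error_variance(p):
    """Var(eps | x) = p * (1 - p) with p = x'beta, so it changes with x."""
    return p * (1 - p)


print("fitted values:", [round(f, 3) for f in fitted])
print("outside [0, 1]:", [round(f, 3) for f in fitted if f < 0 or f > 1])
print("error variance at p = 0.5 and p = 0.9:",
      lpm_error_variance(0.5), lpm_error_variance(0.9))
```

The smallest and largest fitted values fall outside the unit interval, and the error variance differs across values of p, which is why robust standard errors are needed.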
3
Q

What are latent variables?

A

Latent variables are variables that are not directly observed but are inferred (through a mathematical model) from other, observed variables.

4
Q

What does the latent variable model look like?

A
  • y*: latent variable, a continuous and unobserved variable driving the dependent variable
  • binary outcome variable y
    • y = 1 if y* > 0
    • y = 0 if y* ≤ 0
  • e.g. preferred working time in hours (for working full time) or ability to cover a credit in euros
5
Q

What is the probability that the dependent variable y equals one under the latent variable model?

A
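
A sketch of the usual derivation, assuming the latent index is linear, y* = x'β + ε, and that ε has distribution function F. Both are assumptions here; the cards themselves only state the threshold rule y = 1 if y* > 0.

```latex
\begin{aligned}
P(y = 1 \mid x) &= P(y^* > 0 \mid x) = P(x'\beta + \varepsilon > 0 \mid x) \\
                &= P(\varepsilon > -x'\beta \mid x) = 1 - F(-x'\beta) \\
                &= F(x'\beta) \quad \text{if the distribution of } \varepsilon \text{ is symmetric around zero.}
\end{aligned}
```

The last step applies to the normal (probit) and logistic (logit) distributions, which are both symmetric around zero.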
6
Q

Which question does F(x) answer?

A

F(x) answers the question: which share of the distribution (described by the density f(t)) is smaller than or equal to the value x? That is, F(x) = P(T ≤ x) = ∫ f(t) dt, integrated from −∞ to x.
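
As an illustration, F(x) can be approximated by integrating a density numerically. The sketch below is in Python and uses the standard normal density as the example; the helper name `cdf_by_integration` and the cut-off `lower = -10` are assumptions made for this illustration.

```python
from statistics import NormalDist


def cdf_by_integration(pdf, x, lower=-10.0, steps=20_000):
    """Approximate F(x) = integral of pdf(t) from -inf to x (trapezoid rule).

    `lower` stands in for minus infinity; this is adequate for the standard
    normal, whose probability mass below -10 is negligible.
    """
    if x <= lower:
        return 0.0
    h = (x - lower) / steps
    total = 0.5 * (pdf(lower) + pdf(x))
    total += sum(pdf(lower + i * h) for i in range(1, steps))
    return total * h


std_normal = NormalDist()  # mean 0, standard deviation 1

# Compare the numerical integral with the library's closed-form CDF.
for x in (-1.0, 0.0, 1.96):
    approx = cdf_by_integration(std_normal.pdf, x)
    exact = std_normal.cdf(x)
    print(x, round(approx, 4), round(exact, 4))
```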

7
Q

What is the distribution function for probit?

A

F(x) = Φ(x), where Φ(x) is the distribution function of the standard normal distribution → probit

8
Q

What is the distribution function for logit?

A

F(x) = Λ(x) = exp(x) / (1 + exp(x)), the standard logistic distribution function → logit

9
Q

What is the complementary log-log model?

A
  • Third alternative to logistic regression and probit analysis for binary response variables.
  • Frequently used when the probability of an event is very small or very large, i.e. it has advantages when average probabilities are close to zero or one.
  • Unlike logit and probit, the complementary log-log function is asymmetric.
10
Q

What is the distribution function for the complementary log-log model?

A

extreme value distribution function: F(x) = 1 − exp(−exp(x))

11
Q

What type of estimator do linear probability, logit, probit and complementary log-log have?

A
12
Q

What are the distribution functions for linear probability, logit, probit and complementary log-log?

A
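
A minimal Python sketch of the four functions in their standard textbook forms, each mapping the index z = x'β to P(y = 1 | x). The function names are hypothetical, and treating the linear probability model as the identity F(z) = z is the usual convention rather than something stated on this card.

```python
import math
from statistics import NormalDist


def lpm_identity(z):
    """Linear probability model: F(z) = z, not restricted to [0, 1]."""
    return z


def logistic_cdf(z):
    """Logit: standard logistic distribution function."""
    return 1.0 / (1.0 + math.exp(-z))


def normal_cdf(z):
    """Probit: standard normal distribution function."""
    return NormalDist().cdf(z)


def cloglog_cdf(z):
    """Complementary log-log: extreme value distribution function."""
    return 1.0 - math.exp(-math.exp(z))


for z in (-2.0, 0.0, 2.0):
    print(z, lpm_identity(z), round(logistic_cdf(z), 4),
          round(normal_cdf(z), 4), round(cloglog_cdf(z), 4))

# Symmetry check: F(-z) = 1 - F(z) holds for logit and probit,
# but not for the complementary log-log function.
z = 1.0
print(logistic_cdf(-z), 1.0 - logistic_cdf(z))
print(cloglog_cdf(-z), 1.0 - cloglog_cdf(z))
```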
13
Q

What are the two ways of interpreting probabilistic models?

A
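  • average marginal effect (the marginal effect computed for each observation, then averaged)
  • marginal effect evaluated at the average (of the explanatory variables)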
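
The two quantities generally differ because the distribution function is nonlinear. A minimal Python sketch for a logit model with one continuous regressor, where the marginal effect is β1 · Λ(z)(1 − Λ(z)) with z = β0 + β1·x; the coefficients and data below are hypothetical.

```python
import math


def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))


def logistic_pdf(z):
    """Derivative of the logistic distribution function."""
    p = logistic_cdf(z)
    return p * (1.0 - p)


# Hypothetical logit coefficients and observed values of the regressor.
beta0, beta1 = -1.0, 0.5
x = [0.0, 1.0, 2.0, 3.0, 4.0]

# Average marginal effect: compute the effect per observation, then average.
ame = sum(beta1 * logistic_pdf(beta0 + beta1 * xi) for xi in x) / len(x)

# Marginal effect at the average: evaluate the effect once, at the mean of x.
x_mean = sum(x) / len(x)
mem = beta1 * logistic_pdf(beta0 + beta1 * x_mean)

print("average marginal effect:       ", round(ame, 4))
print("marginal effect at the average:", round(mem, 4))
```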
14
Q

Which goodness-of-fit tests can you use for models with binary dependent variables?

A
  • Pearson’s test
  • Hosmer-Lemeshow test

(both use the same test statistic, but form the groups differently)

15
Q

How can you apply Pearson’s test for goodness of fit?

A
  1. Form M groups according to covariates:
    • n_j: number of observations in group j
    • Y_j: number of observations in group j with y = 1
    • p̂_j: predicted probability of y = 1 in group j
  2. The sum of squared Pearson residuals (group residuals) is approximately χ² distributed with M − K degrees of freedom
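
The steps above can be sketched in Python as follows, assuming the standard form of the Pearson residual, (Y_j − n_j·p̂_j) / sqrt(n_j·p̂_j·(1 − p̂_j)). The function name and the grouped data are hypothetical.

```python
def pearson_chi2(groups):
    """Sum of squared Pearson residuals over groups.

    `groups` is an iterable of (n_j, Y_j, p_hat_j) tuples, where p_hat_j
    must lie strictly between 0 and 1.
    """
    stat = 0.0
    for n_j, y_j, p_j in groups:
        residual = (y_j - n_j * p_j) / (n_j * p_j * (1.0 - p_j)) ** 0.5
        stat += residual ** 2
    return stat


# Hypothetical groups: (observations, observed ones, predicted probability).
groups = [(10, 2, 0.2), (10, 5, 0.5), (10, 9, 0.8)]
stat = pearson_chi2(groups)
print("Pearson statistic:", round(stat, 4))
```

The statistic would then be compared with a χ² critical value for M − K degrees of freedom.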
16
Q

How can you apply Hosmer-Lemeshow test for goodness of fit for models with binary dependent variables?

A
  1. Form M groups according to the predicted probability p̂ (typically ten equally large groups, with 0.0-0.1, 0.1-0.2, …):
    • n_j: number of observations in group j
    • Y_j: number of observations in group j with y = 1
    • p̂_j: predicted probability of y = 1 in group j
  2. The test statistic is approximately χ² distributed with M − K degrees of freedom
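
A minimal Python sketch of the grouping step, assuming equal-width bins on the predicted probability (0.0-0.1, 0.1-0.2, …), the average predicted probability within a bin as p̂_j, and the same grouped statistic as in Pearson’s test. The function name and the data are hypothetical, and empty bins are simply skipped.

```python
def hosmer_lemeshow_stat(y, p_hat, n_bins=10):
    """Grouped chi-square statistic with groups formed by predicted probability."""
    bins = [[] for _ in range(n_bins)]
    for yi, pi in zip(y, p_hat):
        j = min(int(pi * n_bins), n_bins - 1)  # p_hat = 1.0 goes to the last bin
        bins[j].append((yi, pi))

    stat = 0.0
    for members in bins:
        if not members:
            continue  # skip empty bins
        n_j = len(members)
        y_j = sum(yi for yi, _ in members)        # observed number of ones
        p_j = sum(pi for _, pi in members) / n_j  # average predicted probability
        stat += (y_j - n_j * p_j) ** 2 / (n_j * p_j * (1.0 - p_j))
    return stat


# Hypothetical outcomes and predicted probabilities.
y = [0, 0, 0, 1, 1, 1]
p_hat = [0.05, 0.05, 0.15, 0.15, 0.95, 0.95]
stat = hosmer_lemeshow_stat(y, p_hat)
print("Hosmer-Lemeshow statistic:", round(stat, 4))
```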
17
Q

How do you calculate the Receiver Operating Characteristic-curve?

A
18
Q

What is sensitivity?

A
19
Q

What is specificity?

A
20
Q

What does the ROC-curve plot?

A
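
A minimal Python sketch of the standard definitions behind the four cards above: sensitivity is the share of actual ones classified as one, specificity is the share of actual zeros classified as zero, and the ROC curve plots sensitivity against 1 − specificity across classification thresholds. The function names and the data are hypothetical.

```python
def confusion_counts(y, p_hat, threshold):
    """Count true/false positives and negatives, predicting 1 if p_hat >= threshold."""
    tp = fp = tn = fn = 0
    for yi, pi in zip(y, p_hat):
        predicted = 1 if pi >= threshold else 0
        if predicted == 1 and yi == 1:
            tp += 1
        elif predicted == 1 and yi == 0:
            fp += 1
        elif predicted == 0 and yi == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity(y, p_hat, threshold):
    tp, fp, tn, fn = confusion_counts(y, p_hat, threshold)
    return tp / (tp + fn)


def specificity(y, p_hat, threshold):
    tp, fp, tn, fn = confusion_counts(y, p_hat, threshold)
    return tn / (tn + fp)


def roc_points(y, p_hat):
    """(1 - specificity, sensitivity) pairs, one per candidate threshold."""
    thresholds = [float("inf")] + sorted(set(p_hat), reverse=True)
    return [
        (1.0 - specificity(y, p_hat, t), sensitivity(y, p_hat, t))
        for t in thresholds
    ]


# Hypothetical outcomes and predicted probabilities.
y = [0, 0, 1, 1]
p_hat = [0.1, 0.4, 0.35, 0.8]

print("sensitivity at 0.5:", sensitivity(y, p_hat, 0.5))
print("specificity at 0.5:", specificity(y, p_hat, 0.5))
print("ROC points:", roc_points(y, p_hat))
```

Lowering the threshold moves along the curve from (0, 0) towards (1, 1); a model with no discriminating power would lie on the diagonal.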
21
Q

How would you interpret the coefficient below in a linear probability model?

lm(default ~ age + sex + income + children, data=data)

sex estimate: −0.057, p-value: 0.047

A

The probability of default is 5.7 percentage points (roughly 6) lower for males, holding everything else constant. The effect is significant at the 5% level (p-value 0.047 < 0.05).

22
Q

What is the Wald-test?

A

Joint test of significance. Null hypothesis: no explanatory variable (except the intercept β0) has explanatory power, i.e. H0: β1 = β2 = … = βJ = 0

23
Q

What is the Wald-test test statistic?

A

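W = β̂' V̂⁻¹ β̂, where β̂ stacks the J tested coefficients and V̂ is their estimated covariance matrix. Under H0, W is asymptotically χ² distributed with J degrees of freedom, where J equals the number of restrictions (four in this example).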