Tutorial 7 - Censored Variables Flashcards

1
Q

What is censored data?

A
  • The sample is still random
  • But we have only partial information about the value of a variable: we know it lies beyond some boundary (e.g. 0), but not how far above or below it
2
Q

What is truncated data?

A

Data are truncated when observations beyond a boundary value are excluded from the data set altogether. Having a value beyond the boundary removes that individual from the analysis.

  • -> not a random sample!
  • -> no information at all on the excluded observations!
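
The difference between censoring and truncation can be sketched in a few lines. This is only an illustration: the helper names `censor_from_below` and `truncate_from_below`, the boundary of zero, and the simulated latent values are assumptions, not part of the tutorial.

```python
import random

def censor_from_below(values, bound=0.0):
    """Censoring: keep every observation, but record `bound` for values below it."""
    return [max(bound, v) for v in values]

def truncate_from_below(values, bound=0.0):
    """Truncation: drop every observation at or below `bound` entirely."""
    return [v for v in values if v > bound]

random.seed(0)  # hypothetical latent draws, for illustration only
latent = [random.gauss(0.5, 1.0) for _ in range(1000)]

censored = censor_from_below(latent)
truncated = truncate_from_below(latent)

# Censoring keeps the sample size; truncation shrinks it.
print(len(latent), len(censored), len(truncated))
print(sum(1 for v in censored if v == 0.0), "observations piled up at the bound")
```

In the censored sample the individuals below the boundary are still present (recorded at the boundary), while in the truncated sample they have disappeared.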
3
Q

What is the Tobit model used for?

A

The Tobit model is used when the dependent variable y is censored.

4
Q

What are corner solutions?

A

The dependent variable y bunches at certain points (typically zero) as a result of individual behaviour (a corner solution). Examples:

  • Hours worked for women
  • Monthly spending on cigarettes
  • Amount invested in a project

Here, the term “censoring” is also used for corner solutions.

5
Q

What are the Tobit model assumptions for the dependent variable?

A

The latent variable y* = xβ + ϵ, with ϵ | x ~ N(0, σ²) iid, is assumed to:

  • be a linear function of the covariates,
  • with normally distributed, homoscedastic errors
6
Q

How can you express “left-censoring” or “censoring from below” (at zero) for observed variable y as a function of latent variable y*?

A
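
A sketch of the standard formulation, assuming the latent model y* = xβ + ϵ and a censoring point of zero: the observed y equals the latent y* whenever y* is positive and is recorded as zero otherwise.

```latex
y = \max(0,\, y^{*}) =
\begin{cases}
  y^{*} & \text{if } y^{*} > 0 \\
  0     & \text{if } y^{*} \le 0
\end{cases}
```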
7
Q

What is the expected value of the observed variable y in a Tobit model?

A
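
A sketch of the usual decomposition, assuming censoring from below at zero: the expected value of y is a probability-weighted mix of the censored outcome (zero) and the conditional mean of the positive outcomes.

```latex
E(y \mid x)
  = P(y = 0 \mid x)\cdot 0 + P(y > 0 \mid x)\cdot E(y \mid y > 0,\, x)
  = P(y > 0 \mid x)\cdot E(y \mid y > 0,\, x)
```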
8
Q

What is the probability that y is positive in censored data?

A
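
A sketch of the standard derivation, assuming y = max(0, y*) with y* = xβ + ϵ and ϵ | x ~ N(0, σ²); the last two steps use the symmetry of the normal distribution around zero.

```latex
P(y > 0 \mid x)
  = P(y^{*} > 0 \mid x)
  = P(\epsilon > -x\beta \mid x)
  = \Phi_{\sigma}(x\beta)
  = \Phi\!\left(\frac{x\beta}{\sigma}\right)
```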

with

  • Φσ = cdf of a normal distribution with mean zero and standard deviation σ
  • Φ(.) = cdf of the standard normal distribution (mean zero, standard deviation one)
9
Q

In the Tobit model, what is the expected value of y, given that y is positive?

A
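
A sketch of the standard textbook result, assuming censoring from below at zero and normal errors with standard deviation σ. Here φ and Φ denote the standard normal pdf and cdf, and λ is the inverse Mills ratio.

```latex
E(y \mid y > 0,\, x)
  = x\beta + \sigma\,\lambda\!\left(\frac{x\beta}{\sigma}\right),
\qquad
\lambda(c) = \frac{\phi(c)}{\Phi(c)}
```

Because λ(.) is positive, the conditional mean of the positive outcomes lies above xβ.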
10
Q

In the Tobit model, what is the expected value of y?

A
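
A sketch of the standard closed form, assuming censoring from below at zero and ϵ | x ~ N(0, σ²), with φ and Φ the standard normal pdf and cdf. It multiplies the probability of a positive outcome, Φ(xβ/σ), by the mean of the positive outcomes, xβ + σ φ(xβ/σ)/Φ(xβ/σ).

```latex
E(y \mid x)
  = \Phi\!\left(\frac{x\beta}{\sigma}\right)
    \left[\, x\beta + \sigma\,
      \frac{\phi(x\beta/\sigma)}{\Phi(x\beta/\sigma)} \right]
  = \Phi\!\left(\frac{x\beta}{\sigma}\right) x\beta
    + \sigma\,\phi\!\left(\frac{x\beta}{\sigma}\right)
```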
11
Q

What is the partial effect of continuous regressor xK on the observed variable y in the Tobit model?

A
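
The standard textbook result, assuming censoring from below at zero and normal errors, is ∂E(y | x)/∂xK = βK · Φ(xβ/σ): the coefficient scaled by the probability of a positive outcome. The sketch below checks this against a finite-difference derivative of E(y | x) = Φ(xβ/σ)·xβ + σ·φ(xβ/σ). The function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_mean(xb, sigma):
    """E(y | x) for a Tobit model censored at zero, given the index xb = x*beta."""
    z = xb / sigma
    return norm_cdf(z) * xb + sigma * norm_pdf(z)

def partial_effect_observed(xb, sigma, beta_k):
    """Closed-form dE(y | x)/dx_k = beta_k * Phi(xb / sigma)."""
    return beta_k * norm_cdf(xb / sigma)

# Hypothetical values, chosen only for illustration.
beta_k, sigma, xb = 2.0, 1.5, 0.4

# Central finite difference: moving x_k by h shifts the index xb by beta_k * h.
h = 1e-6
numeric = (tobit_mean(xb + beta_k * h, sigma)
           - tobit_mean(xb - beta_k * h, sigma)) / (2 * h)
analytic = partial_effect_observed(xb, sigma, beta_k)
print(numeric, analytic)
```

Since Φ(.) lies between 0 and 1, the effect on the observed variable is smaller in magnitude than βK and depends on x.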
12
Q

What is the partial effect of continuous regressor xK on the latent variable y* in the Tobit model?

A
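
A sketch following directly from the linear latent model y* = xβ + ϵ with E(ϵ | x) = 0: the partial effect on the latent variable is simply the coefficient.

```latex
E(y^{*} \mid x) = x\beta
\quad\Longrightarrow\quad
\frac{\partial E(y^{*} \mid x)}{\partial x_{K}} = \beta_{K}
```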
13
Q

Which partial effect is more interesting for which case of censored data?

A
  • for a “real” censoring model, it is in most cases the effect on the latent variable y*
  • for a corner solution model, it is the effect on the observed variable y
14
Q

What is the average partial effect (APE) of a continuous variable xK on y?

A
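
A common definition, assuming a Tobit model censored at zero, averages the individual partial effects over the sample: APE = β̂K · (1/N) Σ Φ(xᵢβ̂/σ̂). The sketch below computes it with the standard library only. The function name `tobit_ape_continuous` and all numbers are hypothetical and are not estimates from the tutorial.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_ape_continuous(indices, sigma, beta_k):
    """Average over observations of beta_k * Phi(x_i*beta / sigma)."""
    scale = sum(norm_cdf(xb / sigma) for xb in indices) / len(indices)
    return beta_k * scale

# Hypothetical fitted indices x_i*beta_hat and parameters, for illustration only.
indices = [-1.0, 0.0, 0.5, 2.0]
ape = tobit_ape_continuous(indices, sigma=1.0, beta_k=-3.0)
print(ape)
```

The averaged scale factor lies between 0 and 1, so the APE has the same sign as the coefficient but a smaller magnitude.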
15
Q

What is the average partial effect (APE) of a discrete variable xK on y?

A
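
A sketch of the usual approach, assuming a Tobit model censored at zero: for a discrete regressor the derivative is replaced by a difference in predicted means, averaged over the sample. Here c_i is the observed value of xK for individual i (for a dummy variable, the comparison is 1 versus 0), x_{i,-K} collects the other regressors, and φ and Φ are the standard normal pdf and cdf.

```latex
\widehat{APE}_{K}
  = \frac{1}{N} \sum_{i=1}^{N}
    \left[ \hat{E}(y \mid x_{i,-K},\, x_{K} = c_{i} + 1)
         - \hat{E}(y \mid x_{i,-K},\, x_{K} = c_{i}) \right],
\qquad
\hat{E}(y \mid x)
  = \Phi\!\left(\frac{x\hat\beta}{\hat\sigma}\right) x\hat\beta
    + \hat\sigma\,\phi\!\left(\frac{x\hat\beta}{\hat\sigma}\right)
```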
16
Q

How would you interpret this average partial effect of the variable ‘kids05 = number of kids below 6’ on ‘hours = hours worked per year’?

A

The partial effect of an additional child under 6 ranges from -855.7 to -13.78, and the average partial effect (APE) is -526.2. On average, one additional child under six reduces yearly working time by approximately 526 hours.

17
Q

Can you compare these coefficients?

A

Coefficients of OLS and Tobit are comparable only in sign, not in magnitude! To compare magnitudes, we have to compare e.g. APEs (for OLS: the coefficient itself; for Tobit: the APE on the observed variable).

18
Q

How can you solve the problem of taking the logarithm of censored data (where 0 is a frequent value)?

A
  • Add 1 before taking the logarithm, so that observations at zero get log(1) = 0 -> log(Investment + 1) is the dependent variable
  • This is a trick sometimes used in models with corner solutions at zero
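
The transformation can be sketched as follows. The helper name `log_plus_one` and the investment amounts are hypothetical and serve only as an illustration.

```python
import math

def log_plus_one(values):
    """Return log(value + 1) for each value; zeros map to exactly 0."""
    return [math.log1p(v) for v in values]

# Hypothetical investment amounts with a corner solution at zero.
investment = [0.0, 0.0, 10.0, 250.0]
log_investment = log_plus_one(investment)
print(log_investment)
```

The transformed variable is still bunched at zero, so the corner solution remains and the dependent variable can still be treated as censored at zero.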