Selection on (Un)Observables: Selection Correction Flashcards

Question 1

Q

Cluster-robust standard errors

Answer

A

Now, classic OLS makes two assumptions concerning this matrix (the variance-covariance matrix of the error terms): 1. that $\mathrm{Var}[\varepsilon_i \mid X, D] = \sigma^2$ (equal variances, or homoscedasticity), 2. and that $\mathrm{Cov}[\varepsilon_i, \varepsilon_j \mid X, D] = 0$ for all $i \neq j$.
This latter assumption means that the error terms (which are the deviations from the expected values of Y) for any two observations i and j, with i, j = 1…6, are uncorrelated.

The problem now arises because the standard formulas (e.g. those used by Stata) to compute standard errors of the coefficients $\hat\beta$, $\hat\delta$ assume that all η = 0. This does not affect $\hat\beta$, $\hat\delta$ themselves (no bias!), only their standard errors. Fortunately, one can usually specify a ‘robust’ or ‘cluster robust’ option, which relaxes this assumption when the standard errors are computed.
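The difference between the two standard-error formulas can be sketched for a regression with one regressor. This is a minimal illustration, not what any particular package does: the function name `slope_with_ses` and the simulated data are assumptions, and the small-sample corrections that software typically applies are left out.

```python
import math
import random

def slope_with_ses(x, y, cluster):
    """OLS slope of y on x (with intercept), plus classic and cluster-robust SEs."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Classic formula: assumes equal variances and zero covariances.
    s2 = sum(e ** 2 for e in resid) / (n - 2)
    se_classic = math.sqrt(s2 / sxx)

    # Cluster-robust (sandwich) formula: add up (x - x_bar) * residual within
    # each cluster first, so errors may be correlated inside a cluster.
    score = {}
    for xi, e, g in zip(x, resid, cluster):
        score[g] = score.get(g, 0.0) + (xi - x_bar) * e
    se_cluster = math.sqrt(sum(s ** 2 for s in score.values())) / sxx
    return slope, se_classic, se_cluster

# Simulated data (hypothetical): 50 clusters of 20 observations each.
random.seed(0)
x, y, cluster = [], [], []
for g in range(50):
    x_g = random.gauss(0, 1)       # regressor varies only between clusters
    shock_g = random.gauss(0, 1)   # error component shared within the cluster
    for _ in range(20):
        x.append(x_g)
        y.append(1.0 + 0.5 * x_g + shock_g + random.gauss(0, 1))
        cluster.append(g)

slope, se_classic, se_cluster = slope_with_ses(x, y, cluster)
print(slope, se_classic, se_cluster)
```

In this setup the slope estimate is the same under both formulas; only the standard error changes, and the classic one is too small because it ignores the shared within-cluster shock.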

Question 2

Q

Binary outcome models

Answer

A

When the outcome Y is binary (either 0 or 1), we fit models ‘predicting’ the probability that individual i has Y = 1.
The curve on which these probabilities lie can be assumed to take different functional forms:
- linear probability (OLS) assumes a straight line
- probit assumes the CDF of a normal distribution
- logit assumes the CDF of a logistic (very similar to normal)
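The three functional forms can be compared directly. The sketch below is illustrative only: the function names are made up, and the factor 1.702 is a commonly used rescaling constant (an assumption here, not something from the cards) that brings the logistic curve close to the normal one.

```python
import math

def linear_prob(index):
    """Linear probability model: the index itself (can leave the [0, 1] range)."""
    return index

def probit_prob(index):
    """Probit: standard normal CDF of the index."""
    return 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))

def logit_prob(index):
    """Logit: logistic CDF of the index."""
    return 1.0 / (1.0 + math.exp(-index))

for index in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(index, linear_prob(index), round(probit_prob(index), 3), round(logit_prob(index), 3))

# Largest gap between the normal CDF and a rescaled logistic CDF on a grid.
max_gap = max(abs(probit_prob(v / 10) - logit_prob(1.702 * v / 10)) for v in range(-50, 51))
print(max_gap)
```

The straight line is the only one of the three that can produce ‘probabilities’ below 0 or above 1.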

Question 3

Q

Tobit Model

Answer

A

Suitable for censored data. Essentially assumes that X affects two things:
- the likelihood of Y > 0
- the value of Y provided that (or ‘conditional on’) Y > 0.
The expected value of Y in a tobit is therefore
E[y|x] = Pr(y > 0|x) E[y|y > 0, x],

> not straightforward to interpret.
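The decomposition can be checked numerically. This is a sketch under the standard tobit assumptions (normal, homoscedastic errors and censoring at zero); the names `tobit_parts`, `xb` and `sigma`, and the chosen values, are hypothetical.

```python
import math
import random

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_parts(xb, sigma):
    """Return Pr(y > 0 | x), E[y | y > 0, x] and E[y | x] for y = max(0, xb + e)."""
    z = xb / sigma
    prob_positive = norm_cdf(z)
    mean_if_positive = xb + sigma * norm_pdf(z) / norm_cdf(z)
    return prob_positive, mean_if_positive, prob_positive * mean_if_positive

xb, sigma = 1.0, 2.0   # hypothetical index value and error standard deviation
prob_positive, mean_if_positive, expected_y = tobit_parts(xb, sigma)

# Monte Carlo check of E[y | x] by simulating the censored outcome.
random.seed(1)
draws = [max(0.0, xb + random.gauss(0.0, sigma)) for _ in range(200_000)]
simulated_mean = sum(draws) / len(draws)
print(expected_y, simulated_mean)
```

Because a change in X moves both factors at once, a tobit coefficient is not the marginal effect on the observed outcome.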

Question 4

Q

Problems with Sample selection

Answer

A

Estimating effects based on a sample that is not randomly drawn from the population can produce bias. Systematic selection into the sample on which data is available is such a case (‘sample selection’).
Examples:
- wages (we only observe wages of people who actually work)
- migration on earnings (decision to migrate likely to be driven by unobserved factors that also determine pay)
- family holiday expenditure (number of kids affects decision to go on holiday, and how much is spent once on holiday)
- institutions (decision to adopt certain institutions depends on factors that also matter for their effect once they are adopted)
Important: the difference from selection into treatment is that here, selection determines whether we observe Y for certain subjects at all.

Question 5

Q

Sample selection bias

Answer

A

The bias arises through the error term (i.e. unobserved factors).
Take the education-wage example:
- people with little education are most likely to have a job if they have other good skills
- such skills are usually unobserved, i.e. part of the error of our model
- and they affect the wage, which is the outcome Y
→ sample contains systematically more people with high unobserved skills
→ OLS (or any other uncorrected model) ends up producing biased coefficients
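The mechanism can be reproduced in a small simulation. All numbers below are made up for illustration (including the ‘true’ return of 0.5), and `ols_slope` is a hypothetical helper; the point is only the direction of the bias.

```python
import random

def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx

random.seed(2)
TRUE_RETURN = 0.5                      # hypothetical true effect of education on wage
educ, wage, works = [], [], []
for _ in range(20_000):
    e = random.gauss(0, 1)             # education (standardised)
    skill = random.gauss(0, 1)         # unobserved skill: part of the wage error
    educ.append(e)
    wage.append(1.0 + TRUE_RETURN * e + skill)
    # Education and skill both raise the chance of having a job.
    works.append(e + skill + random.gauss(0, 1) > 0)

slope_all = ols_slope(educ, wage)      # infeasible benchmark: wages for everyone
slope_workers = ols_slope([e for e, w in zip(educ, works) if w],
                          [y for y, w in zip(wage, works) if w])
print(slope_all, slope_workers)
```

Among workers, those with little education have unusually high unobserved skill, so the estimated return to education is pulled below the true value.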

Question 6

Q

Sample Selection in causal graph

Answer

A

the sample selection problem differs from the ‘causal situations’ looked at so far
here we actually want to know the effect of some X (education) on Y (wage), not the causal effect of D (being in the sample)
the problem is that Y is unavailable if D = 0

Question 7

Q

Selection correction (basic idea)

Answer

A

explicitly model the selection process (‘selection stage’ or ‘selection equation’)
this yields an estimate of the probability that each observation is in the sample
this information is used to calculate the so-called inverse Mills ratio (IMR)
in a separate equation we model the outcome of interest (‘outcome equation’)
including the IMR as a variable corrects for selection bias
think of the coefficient on the IMR as reflecting the correlation between the error in the selection equation and the error of the outcome equation without selection correction
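Assuming a probit selection stage, the IMR is the standard normal density divided by the standard normal CDF, both evaluated at the estimated selection index. A minimal sketch (the function name is hypothetical):

```python
import math

def inverse_mills_ratio(index):
    """IMR = pdf / cdf of the standard normal, evaluated at the selection index."""
    pdf = math.exp(-0.5 * index * index) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))
    return pdf / cdf

for index in (-2.0, 0.0, 2.0):
    # The probability of being in the sample rises with the index, the IMR falls.
    print(index, round(inverse_mills_ratio(index), 3))
```

Observations that were unlikely to be selected get a large IMR, which is where the selection problem is most severe.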

Question 8

Q

Selection correction (mathematical) 1

Answer

A

Consider first the selection equation:
$d_i = 1$ if $Z_i\beta + \varepsilon_i > 0$, so that $\Pr(d_i = 1) = \Phi(Z_i\beta)$,
which determines whether we observe a wage for individual i or not ($d_i = 1$ or $d_i = 0$).
- Z is a set of independent variables, here including education level
- β is a coefficient vector
- ε is the error term of the selection equation
Consider now the wage equation:
$w_i = \alpha + X_i\gamma + u_i$,
where α is a constant, X contains all or a subset of the variables in Z, γ is a coefficient vector, and u is the error term of the wage equation.

Question 9

Q

Selection correction (mathematical) 2

Answer

A

If we estimate the wage equation only for the observations with a wage (all of whom have $d_i = 1$), then our $\hat\gamma$ is biased.
This is because $\mathrm{Cov}(\varepsilon_i, u_i) \neq 0$, so that $u_i$ no longer has mean zero in the selected sample, which violates an assumption essential for unbiased estimation. However, if we now
1. estimate the selection equation to obtain a $\hat\beta$,
2. calculate the so-called inverse Mills ratio, $\rho_i = \phi(Z_i\hat\beta) / \Phi(Z_i\hat\beta)$,
3. and estimate the wage equation with this as a variable, that is $w_i = \alpha + X_i'\gamma + \rho\,\rho_i + u_i$ (with $\rho$ the coefficient on $\rho_i$),
the resulting $\hat\gamma$ is consistent (i.e. unbiased in large samples).
⇒ Intuition: in a non-randomly selected sample the IMR is an omitted variable, and inclusion of it takes out the omitted variable bias.
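The three steps can be sketched end to end on simulated data. This is a rough pure-Python illustration, not a reference implementation: the helper names (`solve`, `wls`, `ols`, `probit`), the data-generating numbers and the excluded variable `z` are all assumptions, and the second-stage standard errors are not corrected here.

```python
import math
import random

def pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def solve(a, b):
    """Solve a small linear system a x = b by Gauss-Jordan elimination."""
    k = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(k):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[r][k] / m[r][r] for r in range(k)]

def wls(rows, target, weights):
    """Weighted least squares via the normal equations (weights of 1 give OLS)."""
    k = len(rows[0])
    xtx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights)) for j in range(k)]
           for i in range(k)]
    xty = [sum(w * r[i] * t for r, t, w in zip(rows, target, weights)) for i in range(k)]
    return solve(xtx, xty)

def ols(rows, target):
    return wls(rows, target, [1.0] * len(rows))

def probit(rows, d, iterations=20):
    """Probit coefficients by Fisher scoring, written as repeated WLS steps."""
    beta = [0.0] * len(rows[0])
    for _ in range(iterations):
        index = [sum(b * v for b, v in zip(beta, r)) for r in rows]
        prob = [min(max(cdf(z), 1e-10), 1.0 - 1e-10) for z in index]
        dens = [pdf(z) for z in index]
        weights = [f * f / (p * (1.0 - p)) for f, p in zip(dens, prob)]
        target = [(di - p) / f for di, p, f in zip(d, prob, dens)]
        beta = [b + s for b, s in zip(beta, wls(rows, target, weights))]
    return beta

random.seed(3)
GAMMA = 0.5  # hypothetical true coefficient on x in the wage equation
Z, d, x_all, w_all = [], [], [], []
for _ in range(6000):
    x, z, eps = random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1)
    u = 0.8 * eps + 0.6 * random.gauss(0, 1)   # wage error correlated with selection error
    Z.append([1.0, x, z])                      # z enters the selection equation only
    d.append(1.0 if x + z + eps > 0 else 0.0)  # wage observed only if d = 1
    x_all.append(x)
    w_all.append(1.0 + GAMMA * x + u)

beta_hat = probit(Z, d)                                                  # step 1
sel = [i for i, di in enumerate(d) if di == 1.0]
index_hat = [sum(b * v for b, v in zip(beta_hat, Z[i])) for i in sel]
imr = [pdf(c) / cdf(c) for c in index_hat]                               # step 2
w_sel = [w_all[i] for i in sel]
naive = ols([[1.0, x_all[i]] for i in sel], w_sel)                       # no correction
corrected = ols([[1.0, x_all[i], m] for i, m in zip(sel, imr)], w_sel)   # step 3
print("naive:", naive[1], "corrected:", corrected[1], "IMR coefficient:", corrected[2])
```

The naive regression on the selected sample understates the coefficient on `x` in this setup, while adding the IMR brings it back towards the value used to generate the data.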

Question 10

Q

Confounders of causal sector effect estimation

Answer

A

wage determination processes may differ between the sectors
> differences in returns to skill; differences in regulation; trade-off between high pay and job security; symbolic rewards and intrinsic motivation may be substitutes for pecuniary rewards
most importantly, employees self-select into sectors according to
> preferences over high pay versus job security, symbolic versus monetary rewards, etc.
> their anticipated net utility in either sector (i.e. expected returns minus expected effort)
> trade-off between high pay and job security
> symbolic rewards and intrinsic motivation may be substitutes for pecuniary rewards
⇒ Many of these factors are generally unobserved and affect wages.

Question 11

Q

Roy model (aka ‘endogenous switching regression model’)

Answer

A

Consider a sector selection equation (public sector D = 1, business sector D = 0),
$D = 1$ if $(\log w_1 - \log w_0) + Z'\beta_S + \varepsilon_S > 0$,

$D = 0$ if $(\log w_1 - \log w_0) + Z'\beta_S + \varepsilon_S \le 0$.

Rewrite this as a binary outcome model:

Pr(D = 1) = F[Z,βS,(log w1 − log w0),εS].

Consider further sector wages to be determined as follows:
$\log w_1 = X'\gamma_1 + u_1$,

$\log w_0 = X'\gamma_0 + u_0$.

Question 12

Q

Features of Roy setup

Answer

A

the wage and selection equations are mutually dependent, i.e.
Di indicates which wage equation determines the wage of i
at the same time, the sector choice of i, Di , depends on log wi1 − log wi0

Question 13

Q

Sector wages: why not run 2 OLS?

Answer

A

problem: if we were to estimate separate wage equations by OLS, one for all public sector and one for all private sector employees,
we’d have bias for the same reasons as in the Heckman model
suppose public sector jobs are sought-after, and generally highly educated people work there
some less-educated people may also manage to get a public sector job; the ‘causes’ for this are most likely unobserved and thus in the error εS
since the same ‘causes’ usually affect wages, Cov(u0, εS) ≠ 0 and Cov(u1, εS) ≠ 0
however, separate OLS-estimation of the wage equations implicitly assumes that
Cov(u0,εS) = Cov(u1,εS) = 0

Question 14

Q

How do you correct for sample selection in sector wages?

Answer

A

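include the inverse Mills ratios, $\rho_{1i}$ and $\rho_{0i}$, obtained from the selection equation, as additional variables with coefficients $\rho_1$ and $\rho_0$ in the wage equations:
$\log w_{1i} = X_i'\gamma_1 + \rho_1\,\rho_{1i} + u_{1i}$,

$\log w_{0i} = X_i'\gamma_0 + \rho_0\,\rho_{0i} + u_{0i}$.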
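For reference, the two inverse Mills ratio terms take different forms in the two sectors. The expressions below are a sketch that assumes a probit reduced-form selection equation with index $Z_i'\pi$ (the symbol $\pi$ is notation introduced here, not taken from the cards) and jointly normal errors. They follow from the means of a truncated standard normal, written here as $\rho_{1i}$ and $\rho_{0i}$.

```latex
\begin{aligned}
\rho_{1i} &= \frac{\phi(Z_i'\hat\pi)}{\Phi(Z_i'\hat\pi)}
  && \text{for } D_i = 1 \text{ (public sector)},\\
\rho_{0i} &= -\frac{\phi(Z_i'\hat\pi)}{1 - \Phi(Z_i'\hat\pi)}
  && \text{for } D_i = 0 \text{ (business sector)}.
\end{aligned}
```

Some texts absorb the minus sign of the second term into its coefficient, so the sign convention should be checked against whichever software or paper is being followed.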

Question 15

Q

Interpret Roy results

Answer

A

Having unbiased coefficient estimates γˆ and ρˆ …
- … we can predict the public and private sector wages for particular value combinations of X
- for example, if in X we have age, schooling, and gender, we can predict wages $\log \hat w_1$, $\log \hat w_0$ for a 30-year-old woman with Abitur in both sectors
⇒ the difference $\log(\hat w_1 \mid X) - \log(\hat w_0 \mid X)$ is the effect (approximately in percent, because of the logs) on the wage of a switch from the private to the public sector for a person with characteristics X
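The prediction step can be sketched for one profile. Every number below is invented for illustration (these are not estimates from any study), and coding Abitur as 13 years of schooling is an assumption.

```python
import math

# Hypothetical coefficient estimates: constant, age, years of schooling, female.
gamma_public = [1.50, 0.010, 0.080, -0.05]
gamma_private = [1.20, 0.012, 0.095, -0.12]
profile = [1.0, 30.0, 13.0, 1.0]   # 30-year-old woman, 13 years of schooling (assumed for Abitur)

def predict_log_wage(gamma, x):
    """Predicted log wage X'gamma for one person."""
    return sum(g * v for g, v in zip(gamma, x))

log_diff = predict_log_wage(gamma_public, profile) - predict_log_wage(gamma_private, profile)
approx_percent = 100.0 * log_diff                    # log difference read as a percentage
exact_percent = 100.0 * (math.exp(log_diff) - 1.0)   # exact change in the wage level
print(round(approx_percent, 2), round(exact_percent, 2))
```

The log difference is a good approximation of the percentage change only when it is small; for larger differences the exact conversion via the exponential is preferable.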

Question 16

Q

Two ways of estimating Heckman or Roy models

Answer

A

sequentially, estimating
1. a reduced-form selection equation
2. the selection-corrected wage equations
3. finally, the selection equation with endogenous wages
- pro: more transparent / intuitive
- pro: can get a coefficient on log wˆ0 − log wˆ1
- con: standard errors too small (→ bootstrap)
maximum likelihood (all equations estimated simultaneously)
- pro: more efficient and accurate SEs
- con: less transparent; need to formulate the likelihood function (can be tough)
- con: harder to back out coefficient on endogenous wage differential
- PRO: commonly accepted as superior → method used by van der Gaag et al. (1988)
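The bootstrap fix for the sequential approach can be sketched generically: resample observations with replacement and repeat every estimation step on each resample. The function names and the simulated data are hypothetical, and a simple OLS slope stands in for the full multi-step estimator to keep the example short.

```python
import math
import random

def bootstrap_se(data, estimator, reps=500, seed=0):
    """Standard error of estimator(data) from resampling observations with replacement."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(reps):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))   # redo every estimation step on the resample
    mean = sum(estimates) / reps
    return math.sqrt(sum((e - mean) ** 2 for e in estimates) / (reps - 1))

def ols_slope(pairs):
    """Stand-in estimator; in practice this would rerun the selection and wage equations."""
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    sxx = sum((x - x_bar) ** 2 for x, _ in pairs)
    return sxy / sxx

rng = random.Random(4)
data = []
for _ in range(400):
    x = rng.gauss(0, 1)
    data.append((x, 1.0 + 0.5 * x + rng.gauss(0, 1)))

se_slope = bootstrap_se(data, ols_slope)
print(se_slope)
```

Because the selection stage is re-estimated inside every resample, the resulting standard error reflects the uncertainty from all steps rather than from the last one only.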

Question 17

Q

Remarks on selection correction

Answer

A

- widely used in policy evaluation and labour economics
- ‘experts’ are never quite sure what it does. In particular, there is an unresolved debate (between Nobel laureates and other leading economists!) over whether Heckman/Roy can
→ correct for selection on observables (X) only
→ or also for selection on unobservables
- consensus is that it works better with an ‘exclusion restriction’, i.e. Z contains more variables than X
- if these variables excluded from X do not affect Y directly, the method becomes similar to instrumental variable estimation
- if X = Z, ‘identification’ relies on strong assumptions about functional form
→ linearity and additivity of the selection equation
→ normal distribution of the errors of the selection equation

(17 cards)