HANDOUT 12 Flashcards
4 different names for models when Y is not continuous
- Limited dependent variable models
- Binary choice models
- Dummy dependent variable
- Qualitative choice
What is the observed variable?
Yi = 1 if vote; Yi = 0 if do not vote
What is the latent variable?
Yi* = net utility from undertaking the activity
Yi* = B0 + B1X1i + … + BkXki + εi
Yi* = Xi'B + εi in short form
What is the problem with Y* the latent variable?
It is UNOBSERVED
- we do not know an individual’s net utility from undertaking an action
How do we relate Yi and Yi*?
Yi = 1 if Yi* >= 0; Yi = 0 if Yi* < 0
E(Yi) based on Bernoulli trial
E(Yi) = P(Yi = 1) = pi
V(Yi) based on Bernoulli trial
V(Yi) = pi (1-pi) = P(Yi = 1) x P(Yi = 0)
How can we rewrite E(Yi)?
E(Yi) = P(Yi=1) = P(Yi* >= 0) = P(Xi'B + εi >= 0)
= P(εi >= -Xi'B) = P(εi <= Xi'B) = F(Xi'B) (using symmetry of εi)
Distribution of εi
A symmetric distribution (e.g. normal)
F(Xi’B) refers to
The cumulative distribution function - probability of being less than or equal to Xi’B under the distribution of €i
Our Model in 2 equations
- E(Yi) = F(Xi’B)
- Yi = E(Yi) + Ui
- this always holds: Yi = its expected value + an error term
What is F in a linear probability model?
F = the CDF of a UNIFORM distribution: εi ~ U(L, U)
3 facts about the uniform distribution
- centered at zero
- distributed between lower and upper limit
- all shocks equally likely
Under a uniform distribution, what is F(Xi’B)?
F(Xi’B) = Xi’B
Therefore, what is our model for LPM & how do we estimate it?
E(Yi) = F(Xi’B) = Xi’B
So: Yi = Xi’B + Ui
Estimate by usual OLS
Unless Xi is endogenous -> IV estimation
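A minimal sketch of estimating an LPM by OLS in Python (data and variable names are made up for illustration; statsmodels assumed available):

    # LPM: regress a 0/1 outcome on X by OLS, with robust (HC1) standard errors.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 500
    x1 = rng.normal(size=n)                  # hypothetical regressor
    p = np.clip(0.5 + 0.2 * x1, 0, 1)        # true P(Y=1) in this fake data
    y = rng.binomial(1, p)                   # observed 0/1 outcome

    X = sm.add_constant(x1)                  # intercept + regressor
    lpm = sm.OLS(y, X).fit(cov_type="HC1")   # robust SEs (see heteroscedasticity card)
    print(lpm.params)                        # slope = change in P(Y=1) per unit of x1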
Interpret coefficient on X1 under LPM
B1 = change in P(Y=1) for a unit increase in X1, ceteris paribus.
100B1 under LPM=
100B1 = percentage point change in P(Y=1) for 1 unit increase in X1
B1 if X1 is a dummy variable under LPM
B1 = change in P(Y=1) for having the characteristic vs not having it, ceteris paribus
3 advantages of LPM
- easy to estimate - OLS
- easy to interpret coefficients
- easy to solve endogeneity issue - IV
3 problems with LPM
- Ui is not normal
- Ui is heteroscedastic
- Pi is NOT bounded in [0, 1]
Why is Ui not normal under LPM?
If Yi=0, Ui = -Xi’B
If Yi = 1, Ui = 1 - Xi’B
Only takes 2 values = cannot be normal
Is non-normality of Ui under LPM an issue?
NO - invoke CLT if n>=30
coefficients approx normal = do z tests and chi-squared tests
V(Ui) =
V(Ui) = Xi’B(1 - Xi’B)
- Depends on i = heteroscedastic
Is heteroscedasticity of Ui under LPM an issue?
NO - just use robust standard errors.
Is Pi not bounded under LPM an issue?
YES - not well-defined
Pi = Xi’B - we cannot bound this between 0 and 1.
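A tiny numerical illustration of the problem, with hypothetical coefficients:

    # LPM fitted "probabilities" Xi'B are not bounded in [0, 1].
    b0, b1 = 0.5, 0.2          # hypothetical LPM coefficients
    print(b0 + b1 * 4.0)       # 1.3 > 1: not a valid probability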
Logit model: what is F + formula
F = the logistic distribution: F(Xi’B) = e^(Xi’B) / (1 + e^(Xi’B))
Are probabilities bounded for logistic distribution?
YES - as Xi’B -> infinity, F -> 1
As Xi’B -> -infinity, F -> 0
As Xi’B -> 0, F -> 1/2
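A quick check of these limits (a sketch using numpy only):

    # Logistic CDF F(z) = e^z / (1 + e^z) stays strictly between 0 and 1.
    import numpy as np

    def logistic_cdf(z):
        return np.exp(z) / (1.0 + np.exp(z))

    print(logistic_cdf(-10))   # close to 0
    print(logistic_cdf(0))     # 0.5
    print(logistic_cdf(10))    # close to 1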
How do we estimate a Logit model?
MAXIMUM LIKELIHOOD ESTIMATION
What does maximum likelihood estimation do? Coin flipping example.
Suppose we flip a coin 30 times and observe 18 heads. We then find the P(head) that maximises the chance of what we observed. Repeated Bernoulli trials = binomial: P(X = 18) = 30C18 x p^18 x (1 - p)^12. Maximising w.r.t. p gives p* = 18/30 = 0.6.
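A sketch of the same coin example done numerically (scipy assumed available):

    # MLE for 18 heads in 30 flips: maximise the binomial likelihood over p.
    from scipy.optimize import minimize_scalar
    from scipy.stats import binom

    neg_loglik = lambda p: -binom.logpmf(18, 30, p)   # minimise the negative log likelihood
    res = minimize_scalar(neg_loglik, bounds=(1e-6, 1 - 1e-6), method="bounded")
    print(res.x)   # approximately 0.6 = 18/30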
If the sample is random, how can we write joint probabilities?
Just multiply the individual probabilities together
P(A n B) = P(A) x P(B)
Denote the joint density function as the likelihood function
L(.) = Π i=1,…,n [F(Xi’B)]^Yi [1 - F(Xi’B)]^(1-Yi)
Take logs of likelihood function
ln(L(.)) = sum i=1,…,n [Yi ln(F(Xi’B)) + (1-Yi) ln(1 - F(Xi’B))]
Simplified log likelihood function form for logit model
ln(L(.)) = sum i=1,…,n1 Xi’B - sum i=1,…,n ln(1 + e^(Xi’B)), where n1 = number of observations with Yi = 1
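A sketch checking that the general and simplified forms agree (hypothetical data, numpy assumed):

    # Logit log likelihood: general form vs simplified form.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 200
    X = np.column_stack([np.ones(n), rng.normal(size=n)])   # intercept + one regressor
    xb = X @ np.array([0.2, 0.8])                            # hypothetical Xi'B
    y = rng.binomial(1, np.exp(xb) / (1 + np.exp(xb)))

    F = np.exp(xb) / (1 + np.exp(xb))
    general = np.sum(y * np.log(F) + (1 - y) * np.log(1 - F))
    simplified = np.sum(xb[y == 1]) - np.sum(np.log(1 + np.exp(xb)))
    print(np.isclose(general, simplified))   # True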
How does Stata maximise the log likelihood function?
It partially differentiates ln(L(.)) w.r.t. beta and sets the derivatives equal to 0. There is a unique solution for beta, but we cannot write a simple algebraic expression for it since the function is non-linear.
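A sketch of the same numerical maximisation in Python (statsmodels solves the first-order conditions iteratively, as Stata does; data are made up):

    # ML estimation of a logit on hypothetical data.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    n = 500
    X = sm.add_constant(rng.normal(size=n))
    xb = X @ np.array([-0.3, 1.0])
    y = rng.binomial(1, np.exp(xb) / (1 + np.exp(xb)))

    fit = sm.Logit(y, X).fit(disp=0)    # solves d ln L / d beta = 0 numerically
    print(fit.params)                   # no closed form, but a unique maximiser
    print(fit.llf)                      # maximised log likelihood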
F for a PROBIT model + formula
F = a NORMAL distribution
F(Xi’B) = integral from -infinity to Xi’B/sigma of (2π)^(-0.5) exp(-Z^2 / 2) dZ
What do we assume about sigma for probit model?
Assume sigma = 1
So we can estimate Beta and not just Beta/sigma.
Log likelihood function for probit model
ln(L(.)) = sum [Yi ln(Phi(Xi’B)) + (1-Yi) ln(1 - Phi(Xi’B))]
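A sketch verifying that this is the function a probit routine maximises (hypothetical data; scipy provides Phi as norm.cdf):

    # Probit log likelihood computed by hand vs statsmodels' maximised value.
    import numpy as np
    from scipy.stats import norm
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    n = 500
    X = sm.add_constant(rng.normal(size=n))
    y = rng.binomial(1, norm.cdf(X @ np.array([0.2, 0.7])))

    fit = sm.Probit(y, X).fit(disp=0)
    xb = X @ fit.params
    by_hand = np.sum(y * np.log(norm.cdf(xb)) + (1 - y) * np.log(1 - norm.cdf(xb)))
    print(np.isclose(by_hand, fit.llf))   # True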
Logit vs probit distributions
Logit = logistic distribution
Probit = normal distribution
- Very similar CDFs, but the logistic has slightly fatter tails.
Partial derivative of E(Yi) w.r.t X1 for logit
dE(Yi)/dX1 = dCDF(Z)/dZ x dZ/dX1, where Z = B0 + B1X1i + B2X2i and dZ/dX1 = B1
Derivative of CDF =
The PDF
How does the PDF differ near/away from mean?
Near mean: PDF (=slope of CDF) = large
At extremes: PDF = small
What is the PDF for a logit?
PDF of a logistic = CDF x (1 - CDF)
PDF = e^z / (1 + e^z)^2
Can we interpret B1?
NO - we can only interpret a scaled version of B1:
B1 x PDF = change in P(Y=1) for a unit increase in X1.
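A small worked sketch with hypothetical coefficient values:

    # Logit marginal effect: dP(Y=1)/dX1 = B1 * f(Z), f = logistic PDF.
    import numpy as np

    b0, b1 = -0.5, 0.8                          # hypothetical coefficients
    x1 = 1.0                                    # evaluate at a chosen value of X1
    z = b0 + b1 * x1
    pdf = np.exp(z) / (1 + np.exp(z)) ** 2      # logistic PDF = CDF * (1 - CDF)
    print(b1 * pdf)                             # change in P(Y=1) per unit of X1 here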
Impact of a dummy variable on CDF
Dummy variable = vertical displacement of CDF.
ME of a dummy variable =
difference between 2 CDFs at a certain value of X1.
How does ME of a dummy differ across distribution?
Near mean of X1 = larger ME
At extremes = smaller ME
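A sketch of the dummy-variable ME as a difference of two CDFs (hypothetical coefficients):

    # ME of a dummy D in a logit: CDF with D = 1 minus CDF with D = 0,
    # holding X1 at a chosen value.
    import numpy as np

    def logistic_cdf(z):
        return np.exp(z) / (1 + np.exp(z))

    b0, b1, bD = -0.5, 0.8, 0.6     # hypothetical coefficients
    x1 = 1.0
    print(logistic_cdf(b0 + b1 * x1 + bD) - logistic_cdf(b0 + b1 * x1))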
PDF of a probit model
phi(Z) = (2Pi)^-0.5 exp(-Z^2 / 2)
- Take CVs from standard normal table
ME depends on i so how do we interpret?
Interpret at MEAN VALUES
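A sketch of marginal effects evaluated at the means, using statsmodels on hypothetical data:

    # Probit marginal effects at the sample means of the regressors.
    import numpy as np
    from scipy.stats import norm
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    n = 500
    X = sm.add_constant(rng.normal(size=n))
    y = rng.binomial(1, norm.cdf(X @ np.array([0.2, 0.7])))

    fit = sm.Probit(y, X).fit(disp=0)
    print(fit.get_margeff(at="mean").summary())   # ME of each regressor at the means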
3 properties of MLE estimator of bj
- consistent
- asymptotically normal
- most efficient
What test do we do for 1 restriction?
Approx. Z test
What test do we do for multiple restrictions? What is DOF?
Chi-squared with k degrees of freedom
k = DOF = number of restrictions we are testing.
For a multiple restriction test, what is the test statistic formula?
LR = 2[ln(Lu) - ln(LR)]
ln(Lu) = log likelihood of the unrestricted model
ln(LR) = log likelihood of the restricted model
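A sketch of the LR test on hypothetical data (here the restriction drops one regressor, so k = 1):

    # LR test: unrestricted logit vs a restricted logit that drops x2.
    import numpy as np
    from scipy.stats import chi2
    import statsmodels.api as sm

    rng = np.random.default_rng(5)
    n = 500
    x1, x2 = rng.normal(size=n), rng.normal(size=n)
    Xu = sm.add_constant(np.column_stack([x1, x2]))   # unrestricted: intercept, x1, x2
    Xr = sm.add_constant(x1)                          # restricted: drop x2
    y = rng.binomial(1, 1 / (1 + np.exp(-(0.2 + 0.8 * x1 + 0.5 * x2))))

    llu = sm.Logit(y, Xu).fit(disp=0).llf
    llr = sm.Logit(y, Xr).fit(disp=0).llf
    LR = 2 * (llu - llr)
    print(LR, chi2.sf(LR, df=1))   # test statistic and chi-squared(1) p-value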
R^2 formula for log likelihood. Why is it bad?
R^2 = 1 - [ln(Lw) / ln(L0)]
Where ln(Lw) = log likelihood of the full model (with regressors) and ln(L0) = log likelihood if we only have an intercept. Bad because it has no natural interpretation.
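A sketch computing this pseudo R-squared by hand on hypothetical data (statsmodels reports the same McFadden measure as prsquared):

    # Pseudo R-squared: 1 - ln(L_full) / ln(L_intercept-only).
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(6)
    n = 500
    X = sm.add_constant(rng.normal(size=n))
    y = rng.binomial(1, 1 / (1 + np.exp(-(X @ np.array([0.2, 0.8])))))

    fit = sm.Logit(y, X).fit(disp=0)
    print(1 - fit.llf / fit.llnull)   # by hand
    print(fit.prsquared)              # same quantity from statsmodels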
Describe goodness of fit tests
Yi^ = 1 if E(Yi) > 0.5 & 0 otherwise
Compare predicted and actual Y - Yi ≠ Yi^ due to Ui random shocks.
Look at proportion of total correctly predicted.
Goodness of fit test if we adopt a simple/constant probability rule.
Yi^ = 1 if p > 0.5 & 0 otherwise
p = sample proportion
- We predict everyone to vote leave if the proportion voting leave > 0.5 across the whole sample.
- Compare this to the E(Yi)-based rule above and see the gain from using the model.
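A sketch comparing the model rule with the constant-probability rule on hypothetical data:

    # Proportion correctly predicted: model rule vs constant rule.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(7)
    n = 500
    X = sm.add_constant(rng.normal(size=n))
    y = rng.binomial(1, 1 / (1 + np.exp(-(X @ np.array([0.2, 0.8])))))

    # Model rule: Yi^ = 1 if the fitted probability exceeds 0.5
    fit = sm.Logit(y, X).fit(disp=0)
    model_correct = ((fit.predict(X) > 0.5).astype(int) == y).mean()

    # Constant rule: predict everyone as 1 if the sample proportion exceeds 0.5
    const_pred = int(y.mean() > 0.5)
    const_correct = (y == const_pred).mean()

    print(model_correct, const_correct, model_correct - const_correct)   # gain from the model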