Binary Dep for kids Flashcards
What three models do we have?
Logit, Probit, LPM
How do we interpret Binary dependent variable regression
The regression is interpreted as a conditional probability function: the predicted value is the probability that Y = 1 given the regressors
What is conditional probability
Conditional probability is the probability of one thing being true given that another thing is true
Difference between Probit & Logit and LPM
- Probit and Logit allow for a non-linear relationship between the dependent variable and the regressors
- Probit and Logit predicted probabilities always lie between 0 and 1, whereas LPM predictions can fall outside that range (see the sketch below)
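A minimal sketch (not from the source) using simulated data and statsmodels to illustrate the second point: LPM fitted values can stray outside [0, 1], while Logit predictions always stay inside it. All variable names and the data-generating process are made up for illustration.

```python
# Sketch: LPM vs Logit predicted probabilities on simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = (x + rng.normal(size=500) > 1.0).astype(int)  # binary outcome
X = sm.add_constant(x)

lpm = sm.OLS(y, X).fit()            # linear probability model
logit = sm.Logit(y, X).fit(disp=0)  # logit model

print(lpm.predict(X).min(), lpm.predict(X).max())      # can fall below 0 or above 1
print(logit.predict(X).min(), logit.predict(X).max())  # always strictly between 0 and 1
```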
R2 Interpretation
No meaningful interpretation
- the regression line can never fit the data perfectly because Y is binary while the regressors are continuous
R2 relies on ___ which makes it unusable
a linear relationship between X and Y
What measures the fit of the model
PseudoR2 measures the fit using the likelihood function
What is a good PseudoR2 value
Rule of thumb is between 0.2 and 0.4
PseudoR2 is also called
McFadden's pseudo-R2
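A small sketch (assumed data and names) of how McFadden's pseudo-R2 uses the likelihood function: it compares the log-likelihood of the fitted model with that of an intercept-only model, pseudoR2 = 1 - lnL_model / lnL_null; statsmodels reports the same number as prsquared.

```python
# Sketch: McFadden's pseudo-R2 from a fitted Logit model on simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.normal(size=300)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + 1.5 * x))))  # logistic DGP
res = sm.Logit(y, sm.add_constant(x)).fit(disp=0)

pseudo_r2 = 1 - res.llf / res.llnull  # manual McFadden formula
print(pseudo_r2, res.prsquared)       # the two values agree
```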
Standard errors in LPM are always
Heteroscedastic, so we use robust standard errors
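A hedged sketch (simulated data, made-up names) of fitting the LPM by OLS with heteroscedasticity-robust standard errors in statsmodels:

```python
# Sketch: LPM with heteroscedasticity-robust (HC1) standard errors.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.normal(size=400)
y = (x + rng.normal(size=400) > 0).astype(int)  # binary outcome
X = sm.add_constant(x)

lpm = sm.OLS(y, X).fit(cov_type="HC1")  # robust covariance estimator
print(lpm.bse)                          # robust standard errors
```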
When Y is the binary variable -> explain the regression
The population regression function shows the probability that Y = 1 given the value of the regressors
Why is it called LPM
Because the probability that Y = 1 is a linear function of the regressors
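As a worked formula (notation assumed, with k regressors), the LPM models the conditional probability as a linear function of the regressors:

```latex
% Linear probability model: the conditional probability is linear in the X's.
P(Y = 1 \mid X_1, \dots, X_k) = \beta_0 + \beta_1 X_1 + \dots + \beta_k X_k
```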
What are Probit and Logit regressions
They are nonlinear regression models used when Y is a binary variable
Difference between LPM and Probit & Logit
Probit & Logit regressions ensure that the predicted probability will be between 0 and 1
Probit Regression uses …..
the standard normal cumulative distribution function (CDF)
What is cumulative distribution function
It is the probability that the variable takes a value less than or equal to X
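A tiny illustration with made-up coefficient values: the probit predicted probability is the standard normal CDF evaluated at the linear index, which keeps it between 0 and 1.

```python
# Sketch: probit predicted probability via the standard normal CDF.
from scipy.stats import norm

beta0, beta1 = -1.0, 0.8          # hypothetical coefficients
x = 2.0
p_hat = norm.cdf(beta0 + beta1 * x)
print(p_hat)                      # always between 0 and 1
```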
What do Probit and Logit regressions allow for that LPM doesn't
Probit and Logit models allow for a non-linear relationship between the regressors and the dependent variable.
Logit Model uses _________
Logistic cumulative distribution function
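The matching illustration for logit, again with made-up coefficients: the predicted probability is the logistic CDF of the same kind of linear index.

```python
# Sketch: logit predicted probability via the logistic CDF.
import numpy as np

beta0, beta1 = -1.0, 0.8                       # hypothetical coefficients
x = 2.0
p_hat = 1 / (1 + np.exp(-(beta0 + beta1 * x)))
print(p_hat)                                   # always strictly between 0 and 1
```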
Logit and Probit Models are appropriate when attempting to model ___
a dichotomous dependent variable, e.g. yes/no, agree/disagree, like/dislike.
What do the Probit and Logit models look like? Shape
An S-shape; y is between 0 and 1
Y axis shows
We can think of the y-axis as originally running from 0 to 1 (a probability p). This value gets transformed into the log odds, log(p/(1-p)). So if p (or y) is 0.5, the new value on this axis is log(0.5/(1-0.5)) = 0.
where does infinity come from
When p = 1 we get log(1) - log(0) = positive infinity, and when p = 0 we get log(0) - log(1) = negative infinity, so the transformed axis runs over both positive and negative infinity (see the sketch below).
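A short numpy sketch of the log-odds transform, showing the values at the boundaries going to minus and plus infinity:

```python
# Sketch: log odds log(p / (1 - p)) for a few probabilities.
import numpy as np

p = np.array([0.0, 0.01, 0.5, 0.99, 1.0])
with np.errstate(divide="ignore"):   # silence the divide-by-zero warnings
    log_odds = np.log(p / (1 - p))
print(log_odds)                      # [-inf, -4.6, 0.0, 4.6, inf] (approximately)
```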
What is the z value
- the estimated coefficient divided by its standard error
- the number of standard deviations the estimate is away from 0 on a standard normal curve (Wald test)
- rule of thumb: |z| should be over 2 and p under 0.05 for H0 to be rejected (see the sketch below)
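A small sketch with hypothetical numbers for the Wald z-test: divide the estimate by its standard error and read the two-sided p-value off the standard normal distribution.

```python
# Sketch: Wald z-test for a single coefficient.
from scipy.stats import norm

coef, se = 0.9, 0.3                   # hypothetical estimate and standard error
z = coef / se                         # standard deviations away from 0
p_value = 2 * (1 - norm.cdf(abs(z)))
print(z, p_value)                     # z = 3.0, p ~ 0.003 -> reject H0 at the 5% level
```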
Why can't we use the least squares method
Intuitively we want to draw the best line with least squares, as in simple OLS regressions, but on the log-odds scale the transformed values (and hence the residuals) go to infinity, so least squares cannot be used.
Maximum likelihood: intuition for estimating the mean
- Imagine that you have a line of observed values.
- Then test every candidate point on that line and compute the likelihood of observing the data if that point were the mean.
- When all candidates have been checked, pick the one that maximizes the likelihood (see the sketch below).
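A sketch of that grid-search intuition for the mean of a normal distribution (data, grid, and the known standard deviation are all made up); the likelihood-maximizing candidate ends up essentially at the sample mean.

```python
# Sketch: maximum likelihood for the mean by brute-force grid search.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
data = rng.normal(loc=5.0, scale=2.0, size=200)   # observed values

candidates = np.linspace(0, 10, 1001)             # points on the line to check
log_lik = [norm.logpdf(data, loc=m, scale=2.0).sum() for m in candidates]
best = candidates[np.argmax(log_lik)]
print(best, data.mean())                          # nearly identical
```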
likelihood in statistics means
the probability of the observed data viewed as a function of the parameters (e.g. the mean or standard deviation of a distribution); maximizing it gives the best-fitting parameter values
How do we find the best regression line
maximum likelihood
if the p-value is < 0.05
there is a statistically significant association between the predictor and the dependent (response) variable
Consistency means:
As the sample size increases, the estimated beta converges to the true beta
Unbiasedness means
The expected value of the estimated beta equals the true beta.
It is neither an overestimate nor an underestimate on average
What are the assumptions on the parameters
- Consistency: increasing the sample makes the estimate converge to the true population value
- Unbiasedness: the expected beta equals the true beta
What do we use instead of R2
PseudoR2 (McFadden)
is there a reason to use LPM over Probit, Logit?
- it is easier to interpret
- it can be defended when the predicted probabilities are not extreme
how to interpret probit coeff
A positive coefficient means that an increase in the predictor leads to an increase in the predicted probability.
A negative coefficient means that an increase in the predictor leads to a decrease in the predicted probability
What are the tests for parameters
z-test: one parameter
likelihood ratio test: several parameters
What is the interpretation of the marginal effects in the three models
- LPM: the coefficient itself is the marginal effect and is constant across values of the regressors
- Probit and Logit: the marginal effect depends on the values of the regressors (see the next card)
Marginal effects of Probit and Logit
Use the probability density function (PDF) to find them: the marginal effect is the coefficient scaled by the density evaluated at the linear index (see the sketch below)
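A sketch, assuming statsmodels and simulated data, of computing average marginal effects with get_margeff(); for probit, the marginal effect at a point is the coefficient scaled by the normal density at the linear index.

```python
# Sketch: average marginal effects for Probit and Logit.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x = rng.normal(size=500)
y = (0.5 + 1.0 * x + rng.normal(size=500) > 0).astype(int)
X = sm.add_constant(x)

probit = sm.Probit(y, X).fit(disp=0)
logit = sm.Logit(y, X).fit(disp=0)

print(probit.get_margeff().summary())  # average marginal effects, probit
print(logit.get_margeff().summary())   # average marginal effects, logit
```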
When can LPM be used
The basic insight is that the linear probability model can be used whenever the relationship between probability and log odds is approximately linear over the range of modeled probabilities.
rule of thumb for when to use LPM versus logit
- if the predicted probabilities are extreme (close to 0 or 1), logit is better
- if they are more moderate, like between 0.2 and 0.8, LPM can be used; then the linear and logistic models fit about equally well, and the linear model should be favored for its ease of interpretation (see the sketch below)
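A quick numerical check of that rule of thumb (illustrative numbers only): over the log-odds range where probabilities stay roughly between 0.2 and 0.8, the logistic curve differs from a straight line by only a few percentage points.

```python
# Sketch: the logistic CDF is close to linear for moderate probabilities.
import numpy as np

z = np.linspace(-1.4, 1.4, 200)               # log-odds giving p roughly in [0.2, 0.8]
p_logistic = 1 / (1 + np.exp(-z))
p_linear = 0.5 + 0.25 * z                     # tangent-line approximation at p = 0.5
print(np.max(np.abs(p_logistic - p_linear)))  # maximum gap is under 0.05
```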
LPM is bad with
very large or very small probabilities