Module 7 Count Data Flashcards

1
Q

Characteristics of count data

A
  • Dependent variable is a nonnegative integer
  • Often, few small discrete values
  • Zeros present
  • Right skew
  • Heteroskedastic
2
Q

What are the count data models?

A
  • Poisson
  • Negative binomial
  • Zero-inflated:
    • Zero-inflated Poisson (ZIP)
    • Zero-inflated negative binomial (ZINB)
3
Q

Poisson Distribution characteristics

A
  • 𝜆 is the mean outcome
  • The variance is also 𝜆
  • Equidispersion: the mean of the distribution equals its variance
  • As the mean gets larger, the Poisson approximates the normal distribution
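
Equidispersion can be checked numerically. The following is a minimal sketch in Python (the cards themselves refer to Stata); the function names and the value 𝜆 = 3 are illustrative assumptions, not part of the course material.

```python
import math

def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson variable with mean lam, computed on the log scale."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def poisson_moments(lam, k_max=100):
    """Mean and variance from the pmf, truncating the sum at k_max."""
    probs = [poisson_pmf(k, lam) for k in range(k_max + 1)]
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, var

# lam = 3.0 is an arbitrary illustrative value
mean, var = poisson_moments(3.0)
print(mean, var)  # both should be very close to lam
```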
4
Q

Issues with Poisson

A

Overdispersion (variance greater than the mean) => addressed with robust standard errors in Stata (the vce(robust) option)

5
Q

Negative Binomial

A

NB regression estimates an overdispersion parameter 𝛼
If 𝛼 = 0, use Poisson
If 𝛼 > 0, use NB because the variance is greater than the mean
When the data are overdispersed, NB will likely give a more precise estimate (smaller CI)
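
The role of 𝛼 can be seen in the variance function. This sketch assumes the common NB2 (mean-dispersion) parameterization, variance = mean + 𝛼 · mean²; the card does not say which parameterization the course uses, and the numbers are illustrative.

```python
def nb2_variance(mean, alpha):
    """Variance under the assumed NB2 form: mean + alpha * mean**2."""
    return mean + alpha * mean ** 2

# alpha = 0 collapses to the Poisson case (variance equals mean)
print(nb2_variance(4.0, 0.0))
# alpha > 0 gives overdispersion (variance exceeds mean)
print(nb2_variance(4.0, 0.5))
```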

6
Q

Poisson and NB interpretations

A

Regression coefficients are semielasticities (percent change in the expected count per one-unit change in x); may need to exponentiate for the exact effect!
Margins output is interpreted in levels
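
Because Poisson and NB model the log of the expected count, a coefficient b implies an exact change of 100 · (exp(b) − 1) percent per one-unit change in x. A small sketch, with hypothetical coefficient values:

```python
import math

def percent_change(beta):
    """Exact percent change in the expected count for a one-unit change in x."""
    return 100.0 * (math.exp(beta) - 1.0)

# Hypothetical coefficients: the approximation 100 * beta works for small
# values but drifts from the exact effect as beta grows.
for beta in (0.05, 0.5):
    print(beta, 100.0 * beta, percent_change(beta))
```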

7
Q

When to use Zero-Inflated Models

A
  • Excess zeros in dependent variable because Poisson and NB underpredict zeros
8
Q

Two methods to predict a zero

A

The logit (inflation) equation, or the count equation (Poisson or NB)
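
The two sources of zeros combine as P(y = 0) = π + (1 − π) · exp(−𝜆) for a ZIP, where π is the logit-part probability of an excess zero. A minimal sketch; the symbol π and the numeric values are assumptions for illustration.

```python
import math

def poisson_zero_prob(lam):
    """P(y = 0) from the count (Poisson) part alone."""
    return math.exp(-lam)

def zip_zero_prob(pi, lam):
    """P(y = 0) in a ZIP: an excess zero, or a zero drawn from the Poisson part."""
    return pi + (1.0 - pi) * poisson_zero_prob(lam)

# Hypothetical values: 30% excess zeros, count-process mean of 2
print(poisson_zero_prob(2.0))
print(zip_zero_prob(0.3, 2.0))
```

With any π > 0 the ZIP predicts more zeros than the plain Poisson at the same 𝜆, which is the underprediction the previous card describes.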

9
Q

Inflation options

A
  • On a constant
  • On some or all X vars
10
Q

Interpretation of ZIP

A

Still in semielasticities (%)

11
Q

Why shouldn't we multiply the coefficient interpretations of a ZIP inflated on a CONSTANT by the mean?

A

Because of the zero-inflation factor, multiplying by the mean would overstate the positive average marginal effects and understate the negative average marginal effects

12
Q

How to interpret ZIP CONSTANT margins

A

In levels; the margins command incorporates the inflation factor for us
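
A sketch of what incorporating the inflation factor means, assuming a ZIP inflated on a constant so that the expected count is (1 − π) · 𝜆. It also assumes that "the mean" in the previous card is the count-process mean 𝜆; the symbols and all numbers are hypothetical.

```python
def zip_expected_count(pi, lam):
    """Overall expected count: the count-process mean scaled by (1 - pi)."""
    return (1.0 - pi) * lam

def zip_marginal_effect(beta, pi, lam):
    """Effect in levels of a one-unit change in x when pi does not depend on x."""
    return beta * zip_expected_count(pi, lam)

pi, lam = 0.3, 2.0          # hypothetical inflation probability and count mean
for beta in (0.1, -0.1):    # hypothetical coefficients
    naive = beta * lam      # ignores the inflation factor
    adjusted = zip_marginal_effect(beta, pi, lam)
    print(beta, naive, adjusted)
```

The naive value is too large for the positive coefficient and too low (more negative) for the negative one.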

13
Q

How to interpret ZIP inflated on X vars?

A
  • Still in semielasticities
  • HOWEVER, interpreting the coefficients is not enough because we need to look at the inflation factors too
  • Simply discussing the y semielasticities would overstate the positive effects and understate negative effects.
14
Q

How to interpret the inflate coefficients?

A

The inflate coefficients are logit coefficients: semielasticities on the odds of nonuse (being an excess zero), not on the probability itself
Probably need to exponentiate!
Cannot interpret on their own
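
A sketch of how an inflate coefficient maps to the probability of nonuse, assuming the inflation equation is a logit (as the "Logit/Poisson or NB" card suggests). The function names and values are hypothetical.

```python
import math

def nonuse_probability(z):
    """Logistic transform of the inflate linear index z into P(excess zero)."""
    return 1.0 / (1.0 + math.exp(-z))

def odds_ratio(gamma):
    """Exponentiated inflate coefficient: multiplicative change in the odds of nonuse."""
    return math.exp(gamma)

# Hypothetical inflate coefficient of 0.4 on some x, evaluated at two index values
gamma = 0.4
print(odds_ratio(gamma))
print(nonuse_probability(-1.0), nonuse_probability(-1.0 + gamma))
```

The change in probability depends on the starting index, which is one reason the coefficient cannot be read on its own.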

15
Q

ZIP X var margins interpretation

A

In level!

16
Q

ZINB inflated on a CONSTANT

A

Interpretations are still semielastic!

17
Q

ZIP vs ZINB selection

A

Look at the chi2 p-value for the test of 𝛼 = 0
If the p-value is low (< 0.05), reject H0 (𝛼 = 0), therefore rejecting ZIP
Low p-value (p < 0.05) => use ZINB
High p-value (p > 0.1) => use ZIP
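
A sketch of where such a p-value can come from, assuming it is a likelihood-ratio test of 𝛼 = 0 with one degree of freedom, and assuming the p-value is halved because 𝛼 = 0 sits on the boundary of the parameter space. Both assumptions and the log-likelihood values are illustrative; check them against the actual software output.

```python
import math

def lr_test_alpha(ll_poisson, ll_nb):
    """LR statistic and boundary-adjusted p-value for H0: alpha = 0."""
    lr = max(2.0 * (ll_nb - ll_poisson), 0.0)
    # Survival function of a chi-square with 1 df is erfc(sqrt(x / 2))
    p_chi2 = math.erfc(math.sqrt(lr / 2.0))
    return lr, 0.5 * p_chi2  # halving is the assumed boundary correction

# Hypothetical log-likelihoods for a ZIP and a ZINB fit to the same data
lr, p = lr_test_alpha(ll_poisson=-1000.0, ll_nb=-990.0)
print(lr, p, "use ZINB" if p < 0.05 else "cannot reject ZIP")
```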

18
Q

ZINB inflated on some X vars

A

Inflating on all X vars will not converge
Interpretations still in semielasticities

19
Q

Model selection: Alpha

A

Models                      Chi^2 p-value   Conclusion
Standard (uninflated)       0.001           NB outperforms Poisson
Inflated on Constant        0.035           NB outperforms Poisson (at 95% confidence level)
Inflated on Birth Control   0.416           Poisson outperforms NB
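
The decision rule can be applied mechanically to the p-values in the table. A minimal sketch using the 0.05 and 0.1 thresholds from the ZIP vs ZINB card; the "borderline" label for values between the two thresholds is an assumption, since the cards do not cover that case.

```python
def choose_model(p_value):
    """Apply the alpha-test rule: low p favors NB, high p favors Poisson."""
    if p_value < 0.05:
        return "NB"
    if p_value > 0.1:
        return "Poisson"
    return "borderline"  # assumed label; not covered by the cards

# p-values as listed in the table
table = {
    "Standard (uninflated)": 0.001,
    "Inflated on Constant": 0.035,
    "Inflated on Birth Control": 0.416,
}
for model, p in table.items():
    print(model, p, choose_model(p))
```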

20
Q

Model selection: Correlations

A

Higher correlation between predicted and observed counts => better fit
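
A sketch of the comparison, assuming the correlation in question is the Pearson correlation between each model's predicted counts and the observed counts. All data below are made up for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

observed = [0, 0, 1, 3, 5, 2]              # hypothetical counts
pred_a = [0.2, 0.4, 1.1, 2.7, 4.5, 2.2]    # hypothetical predictions, model A
pred_b = [1.5, 1.9, 1.7, 2.0, 2.4, 1.6]    # hypothetical predictions, model B
print(pearson(observed, pred_a), pearson(observed, pred_b))
```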

21
Q

Information criteria: AIC, BIC

A

The smaller ICs are preferred
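
For reference, the standard definitions are AIC = 2k − 2 ln L and BIC = k ln(n) − 2 ln L, with k estimated parameters and n observations. A small sketch; the log-likelihoods, parameter counts, and sample size are hypothetical.

```python
import math

def aic(log_lik, k):
    """Akaike information criterion."""
    return 2.0 * k - 2.0 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion; penalizes parameters more as n grows."""
    return k * math.log(n) - 2.0 * log_lik

# Hypothetical fits on n = 500 observations: (log-likelihood, parameter count)
fits = {"ZIP": (-1000.0, 6), "ZINB": (-990.0, 7)}
for name, (ll, k) in fits.items():
    print(name, aic(ll, k), bic(ll, k, 500))
```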
