Module 7 Count Data Flashcards
Characteristics of count data
- Dependent variable is a nonnegative integer
- Often takes only a few small discrete values
- Zeros present
- Right skew
- Heteroskedastic
What are the count data models?
- Poisson
- Negative binomial
- Zero-inflated Poisson (ZIP)
- Zero-inflated negative binomial (ZINB)
Poisson Distribution characteristics
- P(Y = y) = e^(-𝜆) 𝜆^y / y!, where 𝜆 is the mean outcome
- The variance is also 𝜆
- Equidispersion – when the mean of a distribution is equal to the variance
As the mean gets larger, the Poisson distribution approaches the normal distribution
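A minimal sketch of equidispersion, assuming a Poisson with mean 3 (the value 3.0 and the function names are illustrative, not from the course material). It sums the probability mass function directly and checks that the mean and the variance both come out at about 𝜆.

```python
import math

def poisson_pmf(y, lam):
    """P(Y = y) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** y / math.factorial(y)

def poisson_mean_var(lam, max_y=100):
    """Approximate the mean and variance by summing the pmf over 0..max_y."""
    probs = [poisson_pmf(y, lam) for y in range(max_y + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

# Illustrative value only: lam = 3.0 is an assumption for the demo.
mean, var = poisson_mean_var(3.0)
print(round(mean, 6), round(var, 6))  # both should be very close to 3
```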
Issues with Poisson
Overdispersion (variance greater than the mean) => Poisson standard errors are too small; corrected with the vce(robust) option in Stata
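A rough first check for overdispersion can be sketched as below, assuming a small made-up sample of counts. The data and the function name `dispersion_ratio` are hypothetical. The check compares the raw sample variance with the sample mean, so it ignores covariates and is only a hint, not a formal test.

```python
import statistics

def dispersion_ratio(counts):
    """Sample variance divided by sample mean; near 1 under equidispersion."""
    return statistics.variance(counts) / statistics.mean(counts)

# Hypothetical count outcome, invented for illustration.
counts = [0, 0, 0, 1, 1, 2, 3, 5, 8, 12]
ratio = dispersion_ratio(counts)
print(f"variance / mean = {ratio:.2f}")  # well above 1 suggests overdispersion
```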
Negative Binomial
NB regression estimates an overdispersion parameter 𝛼
If 𝛼 = 0, use Poisson
If 𝛼 > 0, use NB because the variance is greater than the mean
Under overdispersion, NB will likely give more precise estimates (smaller confidence intervals)
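How 𝛼 drives the variance can be sketched as follows, assuming the mean-dispersion (NB2) form Var(y) = 𝜇 + 𝛼𝜇², which is the default parameterization of Stata's nbreg. The function name and the numbers are illustrative.

```python
def nb2_variance(mu, alpha):
    """Variance of an NB2 outcome with mean mu and overdispersion alpha."""
    return mu + alpha * mu ** 2

# Illustrative values: with alpha = 0 the variance equals the mean (Poisson case).
for alpha in (0.0, 0.5, 1.0):
    print(alpha, nb2_variance(4.0, alpha))
```

With 𝛼 = 0 the variance collapses to the mean, which is why the Poisson is the special case.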
Poisson and NB interpretations
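Regression coefficients are semielasticities: a one-unit change in x changes the expected count by roughly 100·β percent. For larger coefficients, exponentiate: 100·(exp(β) - 1) percent
Margins output is interpreted in levels (change in the expected count)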
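A short sketch of why exponentiating matters, using made-up coefficient values. The exact percent change in the expected count for a one-unit change in x is 100·(exp(β) - 1), which is close to 100·β only when β is small.

```python
import math

def percent_change(beta):
    """Exact percent change in the expected count for a one-unit change in x."""
    return 100.0 * (math.exp(beta) - 1.0)

# Hypothetical coefficients: compare the quick reading (100 * beta) to the exact one.
for beta in (0.05, 0.50, -0.50):
    print(f"beta={beta:+.2f}  approx={100 * beta:+.1f}%  exact={percent_change(beta):+.2f}%")
```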
When to use Zero-Inflated Models
- When the dependent variable has excess zeros, because Poisson and NB underpredict zeros
Two methods to predict a zero
A logit (predicts the excess zeros) and the Poisson or NB count model (predicts the remaining zeros)
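The two sources of zeros can be sketched for the ZIP case as below. Here `pi` is the logit's probability of an excess zero and `lam` is the Poisson mean; both names and the values are assumptions for illustration.

```python
import math

def zip_prob_zero(pi, lam):
    """P(y = 0) in a ZIP: an excess zero, or a zero drawn from the Poisson."""
    return pi + (1.0 - pi) * math.exp(-lam)

# Illustrative values: a plain Poisson (pi = 0) predicts far fewer zeros.
print(round(zip_prob_zero(0.0, 2.0), 4))  # Poisson only
print(round(zip_prob_zero(0.3, 2.0), 4))  # with 30% excess zeros
```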
Inflation options
- On a constant
- On some or all X vars
Interpretation of ZIP
Still in semielasticities (%)
Why shouldn't we multiply the coefficients of a ZIP inflated on a CONSTANT by the mean to interpret them?
Because of the zero-inflation factor, multiplying by the mean would overstate the positive average marginal effects and understate the negative average marginal effects
How to interpret ZIP CONSTANT margins?
In levels; the margins command incorporates the inflation factor for us
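A sketch of the inflation adjustment under one reading of the cards above, assuming "the mean" refers to the count-process mean 𝜆. With inflation on a constant, the expected count is (1 - 𝜋)𝜆, so the level effect of a coefficient β is about β(1 - 𝜋)𝜆 rather than β𝜆. All names and numbers are hypothetical, and the calculation is at a single point, not averaged over a sample as margins would do.

```python
def zip_expected_count(pi, lam):
    """Expected count in a ZIP with constant inflation probability pi."""
    return (1.0 - pi) * lam

def level_effect(beta, pi, lam):
    """Derivative of the expected count with respect to x at this point."""
    return beta * zip_expected_count(pi, lam)

# Hypothetical values: 30% excess zeros, count-process mean of 4.
pi, lam = 0.3, 4.0
for beta in (0.10, -0.10):
    naive = beta * lam                 # ignores the inflation factor
    adjusted = level_effect(beta, pi, lam)
    print(f"beta={beta:+.2f}  naive={naive:+.2f}  adjusted={adjusted:+.2f}")
```

The naive numbers are larger in magnitude in both directions, which matches the overstate/understate pattern described above.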
How to interpret ZIP inflated on X vars?
- Still in semielasticities
- HOWEVER, interpreting the coefficients is not enough because we also need to look at the inflation factors
- Simply discussing the y semielasticities would overstate the positive effects and understate negative effects.
How to interpret the inflate coefficients?
The inflate coefficients are logit coefficients on the probability of nonuse (an excess zero)
Probably need to exponentiate (gives odds ratios)!
Cannot interpret on their own
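A sketch of how an inflate coefficient maps to odds and to a probability, assuming the inflation equation is a logit (the default for Stata's zip). The function names and the coefficient values are hypothetical.

```python
import math

def odds_ratio(gamma):
    """Multiplicative change in the odds of nonuse for a one-unit change in x."""
    return math.exp(gamma)

def inflate_probability(index):
    """Logistic transform: probability of nonuse given the linear index."""
    return 1.0 / (1.0 + math.exp(-index))

# Hypothetical inflate equation: constant -0.5, coefficient 0.4 on x.
g0, g1 = -0.5, 0.4
print(round(odds_ratio(g1), 3))
for x in (0, 1, 2):
    print(x, round(inflate_probability(g0 + g1 * x), 3))
```

The probability change per unit of x depends on where you start, which is one reason the inflate coefficients cannot be read on their own.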
ZIP X var margins interpretation
In levels!
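A sketch of how the count and inflate parts combine into one level effect, assuming a ZIP with a logit inflation equation and a single regressor x. The expected count is (1 - 𝜋(x))𝜆(x), so its derivative with respect to x is (1 - 𝜋)𝜆(β - 𝜋𝛾), where β is the count coefficient and 𝛾 the inflate coefficient. All parameter values are made up, and this is the effect at one value of x rather than the sample average that margins reports.

```python
import math

def zip_parts(x, b0, b1, g0, g1):
    """Return (pi, lam): probability of an excess zero and the Poisson mean."""
    lam = math.exp(b0 + b1 * x)
    pi = 1.0 / (1.0 + math.exp(-(g0 + g1 * x)))
    return pi, lam

def zip_mean(x, b0, b1, g0, g1):
    """Expected count: (1 - pi) * lam."""
    pi, lam = zip_parts(x, b0, b1, g0, g1)
    return (1.0 - pi) * lam

def zip_marginal_effect(x, b0, b1, g0, g1):
    """Analytic derivative of the expected count with respect to x."""
    pi, lam = zip_parts(x, b0, b1, g0, g1)
    return (1.0 - pi) * lam * (b1 - pi * g1)

# Hypothetical parameters: positive count coefficient, positive inflate coefficient.
params = dict(b0=1.0, b1=0.1, g0=-0.5, g1=0.4)
effect = zip_marginal_effect(2.0, **params)
print(effect)  # negative here even though b1 is positive
```

In this made-up case the count coefficient is positive but the level effect is negative, because x also raises the probability of an excess zero.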